All Questions
Tagged with classification logistic-regression
121
questions
0
votes
0
answers
23
views
How to choose thresholds to discretize target for binary classification
My group is using logistic regression to investigate the most predictive features in a dataset. Our target variable is actually a continuous variable that we discretized using two cutoff thresholds (...
2
votes
0
answers
53
views
Why is cross-entropy increasing with accuracy?
I'm implementing softmax regression and I'm struggling to understand why the value of the cross-entropy keeps increasing: $H(y, p)=-\sum_{i=1}^C y_i \log(p_i)$, ...
0
votes
0
answers
76
views
PySpark Logistic regression model weights are inconsistent between runs
I am training a logistic regression model using PySpark MLlib. I am noticing that the weights are not consistent between runs. I have set the random seed in the training script and ...
0
votes
0
answers
23
views
How to estimate this variable in an MILP formulation
This is my first question here. I've thought about different methods to do it, but to no avail. I want to estimate a variable that is either 0 or a positive number. Then I want to use this ...
0
votes
1
answer
2k
views
How to calculate accuracy of a logistic regression?
A logistic regression involves a linear combination of features to predict the log-odds of a binary, yes/no-style event. That log-odds can then be transformed to a probability. If $\hat L_i$ is the ...
1
vote
1
answer
165
views
Probability distribution of probabilities
We can get the prediction probabilities of a binary classifier from sklearn's API using the predict_proba method. Is it reasonable to expect that the shape of a histogram plotted for the prediction ...
0
votes
1
answer
130
views
Quasi complete separation problem
I have some questions about the quasi-complete separation problem in logistic regression.
I ran the model to predict credit risk, and it turned out to give a good prediction score (AUC around ...
6
votes
1
answer
263
views
Logistic Regression Modeling & Interpretation [closed]
I'm building a logistic regression model to predict the credit risk of lending company customers.
I'm using a dataset from Kaggle: https://www.kaggle.com/datasets/ranadeep/credit-risk-dataset/code
...
0
votes
1
answer
52
views
Can I use clustering after classification to improve the performance of my classifier?
Say I have a classifier that segments my feature vectors (e.g. representing applicants) into 3 distinct segments A, B, C by assigning each applicant a score between 0 (worst) and 1 (best) with e.g. a ...
0
votes
3
answers
961
views
Tweak machine learning algorithm in SciKit to optimize for recall
I am given a dataset to detect fraud. Something similar to this:
https://www.kaggle.com/code/imgremlin/4th-place-in-fraud-detection-from-zindi
The issue with SciKit machine learning algorithm is ...
0
votes
1
answer
106
views
How to find the optimal cut-off point to minimize both the FNR and FPR in R?
I need to find the optimal threshold to minimize both the false positive rate and the false negative rate. Equal weight between these two rates should be assumed. I wrote the following code:
...
8
votes
1
answer
213
views
Examples where simple classifier systems out-perform deep learning
I have been working on a problem where published results using deep learning are substantially worse than results I have obtained on the same task (using the same experimental protocol) using simple ...
0
votes
0
answers
29
views
Where do I draw the line at unbalanced datasets?
I have a problem where I am to construct a classification variable Yes/No based on another feature's value. We are interested in the Yes class in this case. I am told to use 10-fold cross validation.
...
0
votes
1
answer
116
views
Spot Logistic Regression Training Error
My friend gave me this puzzle a while ago and I've never figured it out.
...
0
votes
1
answer
46
views
Predict data using Pre-Trained Classification Model
I have a pre-trained classification model (saved as a pickle file) to predict employee attrition.
My question is: when I use the pickle file to predict on a new dataset, do I need to do all the preprocessing steps (...