All Questions

0 votes
0 answers
23 views

How to choose thresholds to discretize target for binary classification

My group is using logistic regression to investigate the most predictive features in a dataset. Our target variable is actually a continuous variable that we discretized using two cutoff thresholds (...
OstensiblyPutative's user avatar
0 votes
0 answers
16 views

Can I train a logistic regression model for combining ML models to form an ensemble?

I have 3 ML models trained to perform classification on a dataset. I want to combine them into an ensemble model. I understand that there are multiple ways to do this - voting classifier, stacking, ...
Sowmya Krishnan's user avatar
0 votes
1 answer
33 views

Why does precision decrease with increasing threshold?

I've trained a Logistic Regression model using scikit-learn's LogisticRegression class. I'm dealing with stock data so it's quite noisy and difficult to predict ...
Bryan Carty's user avatar
0 votes
1 answer
71 views

ROC curve manual calculation vs. pROC package R

I want to recreate the ROC curve manually on my dataset and compare it to the roc function from the pROC package in R. I'm using the customer churn dataset telco.csv from Kaggle....
Nikola's user avatar
  • 1
0 votes
0 answers
34 views

How to Determine the Minimum Value of a Continuous Variable for Predicting Categorical variable using Logistic Regression?

I am using logistic regression to predict df['MortSubiteCardiaque'], which contains 0 and 1, based on my continuous variable df['NTProBNP']. I would like to determine the threshold for df['NTProBNP'], ...
Mohamed kenani's user avatar
0 votes
0 answers
9 views

Error while using saved logistic regression model on scoring vector data - The columns of A don't match the number of elements of x. A: 6011, x: 232964

I'm getting an error while using a saved logistic regression model on scoring vector data. SparkException: [FAILED_EXECUTE_UDF] Failed to execute user defined function (ProbabilisticClassificationModel$$...
Kunal Sinha's user avatar
0 votes
0 answers
89 views

SMOTE-NC not working, Error: Pandas output does not support sparse data

I want to get my SMOTENC to work, but I've been failing successfully ...
user155410's user avatar
0 votes
0 answers
29 views

If my logistic regression model is performing well, does it matter if my features don't pass the Box Tidwell Test?

I've built a logistic regression model for binary classification with a high F1 score, but when I run Box-Tidwell tests on continuous independent features/predictive variables, I find non-linearities ...
systems_engineer25's user avatar
1 vote
1 answer
61 views

What should I improve in my Neural Network Model (Logistic Regression)?

Initial Information: I built a Neural Network Model (Logistic Regression) to classify lung cancer based on the patient's (user's) symptoms. My dataset is fairly small (only about 276 data points). Here is the ...
Jonathan's user avatar
1 vote
2 answers
947 views

Learning from aggregated data

Online and in the literature there seems to be a general consensus that training a machine learning model using aggregated data is harder and/or fundamentally different from training on raw event data....
dendog's user avatar
  • 120
0 votes
0 answers
23 views

How to estimate this variable in an MILP formulation

This is my first question being asked here. I've thought about different methods to do it, but to no avail. I want to estimate a variable that is either 0 or a positive number. Then I want to use this ...
Mohammad Rajabdorri's user avatar
1 vote
1 answer
391 views

Why does Logistic Regression perform better than machine learning models in clinical prediction studies

I am developing binary classification models to predict a medical condition in my dataset. My results show that both Logistic Regression and Linear SVM consistently outperformed other ML algorithms (...
sums22's user avatar
  • 437
0 votes
1 answer
71 views

Machine learning / statistical model of a deterministic process: how large must my training set be to ensure almost perfect accuracy?

This may be a silly question, but if I have a deterministic process, for instance, a function (in the mathematical sense) that happens to be computationally expensive to evaluate, and I decided to ...
Koto's user avatar
  • 103
0 votes
2 answers
190 views

Which intrinsically explainable model has the highest performance?

Explainable AI can be achieved through intrinsically explainable models, like logistic and linear regression, or post-hoc explanations, like SHAP. I want to use an intrinsically explainable model on ...
Connor's user avatar
  • 661
0 votes
2 answers
29 views

How will a model handle real-life values in real-life applications without scaling?

I am learning ML and facing confusion about data scaling. For example, I have the following data: Weight(KG) Balance($) 75 3401542 99 4214514 Now, if I use StandardScaler, I may get something like ...
Ishrat Hossain's user avatar
