All Questions tagged with model-selection, machine-learning
172 questions
0 votes · 0 answers · 14 views
How to split data when training and tuning the meta learner in stacking?
I have a simple yet tricky conceptual question about data splitting in a meta-learning (stacking) process.
Assume I have a simple X_train, ...
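A common answer in sketch form (the dataset and base models below are illustrative assumptions, not the asker's): generate out-of-fold predictions from the base learners on the training set, so the meta-learner is fit only on predictions made for rows the base models never saw.

```python
# Sketch: out-of-fold stacking, so the meta-learner is never fit on
# predictions that a base model made for its own training rows.
# The data and base models are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier

X_train, y_train = make_classification(n_samples=300, random_state=0)

base_models = [LogisticRegression(max_iter=1000),
               DecisionTreeClassifier(random_state=0)]

# Out-of-fold class-1 probabilities: each row is predicted by a
# model trained on the other folds only.
meta_features = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])

# Refit the base models on all of X_train for test-time use, and
# fit the meta-learner on the out-of-fold features.
for m in base_models:
    m.fit(X_train, y_train)
meta_learner = LogisticRegression().fit(meta_features, y_train)
print(meta_features.shape)  # (300, 2): one column per base model
```

At prediction time, the refit base models score the test rows and the meta-learner combines those scores.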
0 votes · 0 answers · 16 views
Select the most general machine learning model
For example, let's say that model A had an average train auc of 0.82 and a test auc of 0.79 through cross-validation. The difference between the two scores is 0.03.
Let's say that model B had a train ...
0 votes · 0 answers · 33 views
I screwed up model selection but ended up with a very good model; am I OK?
In a recent experiment, I made an oversight: I divided my data into training and testing sets and conducted cross-validation for model selection and hyperparameter tuning after having applied Boruta (...
1 vote · 0 answers · 13 views
Model choice based on test/train/validation split [duplicate]
My question is very simple, but no matter where I look, I seem to get a different answer.
Take a simple classification task. Let's say I trained a kNN, LDA and logistic regression on it for ...
0 votes · 0 answers · 26 views
How to fit a dataset like this, and what are the recommended evaluation metrics for it?
The dataset seems non-linear.
Is there a recommended way to fit the dataset? Since it's a non-linear regression problem, what's the correct way to evaluate the model's predictions? Is the MSE ...
11 votes · 7 answers · 3k views
Why do we use linear models when tree-based models often work better?
In supervised machine learning, and specifically on Kaggle, it is commonly observed that tree-based models outperform linear models. And even among the tree-based models, it is usually XGBoost that ...
1 vote · 0 answers · 37 views
Hyperparameter selection after nested cross-validation and making comparisons with DeLong's test
I have already read all the associated questions on the topic but couldn't find a clear answer. I initially split my data into training (80%) and hold-out testing (20%). Then, I am performing nested ...
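For readers landing here, the nested part can be sketched in a few lines (toy data; the asker's 80/20 hold-out split is separate from this): the inner loop tunes the hyperparameters, and the outer loop estimates the performance of the whole tune-then-fit procedure.

```python
# Sketch: nested cross-validation on toy data. The inner CV picks C;
# the outer CV scores the tune-then-fit procedure as a whole.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

inner = GridSearchCV(SVC(), param_grid={"C": [0.1, 1, 10]}, cv=3)
outer_scores = cross_val_score(inner, X, y, cv=5)  # 5 outer folds
print(outer_scores.mean())
```

Note that the outer score estimates the procedure's generalization, not that of any single hyperparameter setting.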
1 vote · 1 answer · 201 views
Which regression model would you choose?
Which regression model would you choose to model the following flood damage data? The variables are x1=water height, x2=dike height and x3=flood damage. The following plot shows how the flood damages ...
1 vote · 1 answer · 37 views
Train/validation/test split problem
Suppose that I have created train/validation/test splits for model building.
I optimized the hyperparameters using the validation set and chose the parameter values which gave the highest accuracy. To ...
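The workflow described can be sketched as follows (the split sizes and candidate k values are assumptions): tune on the validation set, then touch the test set exactly once.

```python
# Sketch of the described workflow; split sizes and candidate k
# values are assumptions. The test set is used exactly once.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2,
                                                random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25,
                                            random_state=0)  # 60/20/20 overall

# Pick the hyperparameter by validation accuracy.
best_k = max([1, 3, 5, 7],
             key=lambda k: KNeighborsClassifier(k).fit(X_tr, y_tr)
                                                 .score(X_val, y_val))

# Single, final evaluation on the held-out test set.
final = KNeighborsClassifier(best_k).fit(X_tr, y_tr)
print(best_k, final.score(X_test, y_test))
```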
1 vote · 0 answers · 30 views
Best Strategy for Model Training & Selection (Spoiler: Should I Re-Train?)
After a discussion with some colleagues, I've realized we have different views on the go-to strategy for model training.
Strategy A: Train-Validation-Test Split and Final Model Selection
...
0 votes · 0 answers · 41 views
How to test for significance of differences between metrics for two models? (Machine learning model selection)
Problem - I want to test whether the difference in a metric (say AUC) between two models is significant. I have one vector of binary class predictions from a custom function and one from sklearn....
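One common approach to this is a paired bootstrap over the test cases; the sketch below uses simulated score vectors as stand-ins for the asker's two prediction vectors.

```python
# Sketch: paired bootstrap CI for the AUC difference between two
# models on the same test cases. score_a / score_b are simulated
# stand-ins for the asker's two prediction vectors.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 500
y = rng.integers(0, 2, n)
score_a = y + rng.normal(0, 1.0, n)  # simulated stronger model
score_b = y + rng.normal(0, 2.0, n)  # simulated weaker model

diffs = []
for _ in range(1000):
    idx = rng.integers(0, n, n)          # resample cases with replacement
    if len(np.unique(y[idx])) < 2:       # AUC needs both classes present
        continue
    diffs.append(roc_auc_score(y[idx], score_a[idx])
                 - roc_auc_score(y[idx], score_b[idx]))

lo, hi = np.percentile(diffs, [2.5, 97.5])
print(lo, hi)  # 95% bootstrap CI for AUC_a - AUC_b
```

If the interval excludes 0, the difference is significant at roughly the 5% level; resampling cases (not each model separately) keeps the pairing between the two prediction vectors intact.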
4 votes · 2 answers · 195 views
When does model selection begin to overfit?
Suppose you have a small dataset (perhaps 1000 labels), and you are using cross-validation to train different models and to choose the best one (according to their cross-validation scores).
It seems ...
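The effect the asker is worried about can be demonstrated directly: on pure-noise data, where every model's true accuracy is 0.5, the best cross-validation score among many candidates sits above the pack, so the selected score is optimistically biased (a toy sketch with an assumed candidate set).

```python
# Sketch: on pure-noise data every model's true accuracy is 0.5,
# yet the best CV score among many candidates sits above the pack:
# the selected model's score is optimistically biased.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))     # features carry no signal
y = rng.integers(0, 2, 100)        # labels are coin flips

scores = [cross_val_score(KNeighborsClassifier(k), X, y, cv=5).mean()
          for k in range(1, 30, 2)]
print(round(max(scores), 3), round(float(np.mean(scores)), 3))
```

The more candidates compared, the larger this selection bias on a small dataset, which is why a separate hold-out set (or nested CV) is needed to score the winner honestly.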
2 votes · 0 answers · 85 views
Is AIC scale invariant for problems concerning the number of data points in regression?
I am trying to use Akaike Information Criterion with the small sample correction (AICc) as method for determining how many data points to use in a linear approximation of a non-linear function; the ...
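For reference, AICc for a Gaussian-error least-squares fit can be computed as below (a sketch using the usual AIC = n ln(RSS/n) + 2k form; note that AIC values are only directly comparable between models fit to the same data, which is part of what makes a comparison across different n delicate).

```python
# Sketch: AICc for an ordinary least-squares fit, with
# AIC = n*ln(RSS/n) + 2k and correction 2k(k+1)/(n-k-1);
# k = 3 counts slope, intercept, and error variance.
import numpy as np

def aicc(rss, n, k):
    aic = n * np.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

# Toy version of the setup: fit a line to the first m points
# of a non-linear curve (the curve here is an assumption).
x = np.linspace(0.0, 1.0, 50)
y = np.exp(x)                       # a non-linear target
for m in (10, 30, 50):
    coef = np.polyfit(x[:m], y[:m], 1)
    rss = float(np.sum((np.polyval(coef, x[:m]) - y[:m]) ** 2))
    print(m, aicc(rss, m, 3))
```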
0 votes · 0 answers · 53 views
ISLR Chapter 6: Choosing the Optimal Model
I had a question regarding the "choosing the optimal model" section of chapter 6 of ISLR (pg. 232). The book states that
"In order to select the best model with respect to test error, ...
0 votes · 1 answer · 383 views
How to select a model based on ROC AUC, sensitivity and specificity?
I'm running several machine learning algorithms on a dataset with 80% negatives and 20% positive cases (classification). Below I attach the results of comparing performance on 500 bootstrap resamples ...