All Questions Tagged with model-selection · cross-validation
240 questions
0 votes · 0 answers · 14 views
How to split data when training and tuning the meta learner in stacking?
I have a simple yet tricky conceptual question about data splitting in a meta-learning (stacking) process.
Assume I have a simple X_train, ...
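A minimal sketch of the usual answer, assuming scikit-learn and toy data in place of the asker's X_train: base learners produce out-of-fold predictions via cross_val_predict, and the meta learner is trained only on those, so it never sees a base-model prediction made on data that base model was fitted on.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.svm import SVC

# Illustrative data; X_train/X_test stand in for the asker's split.
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_learners = [RandomForestClassifier(random_state=0),
                 SVC(probability=True, random_state=0)]

# Out-of-fold predictions: each row of meta_features comes from a model
# that never saw that row during fitting, so the meta learner trains on
# honest base-model outputs.
meta_features = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_learners
])
meta_learner = LogisticRegression().fit(meta_features, y_train)

# At test time, the base learners are refit on all of X_train.
test_features = np.column_stack([
    m.fit(X_train, y_train).predict_proba(X_test)[:, 1] for m in base_learners
])
print("stacked accuracy:", meta_learner.score(test_features, y_test))
```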
10 votes · 1 answer · 400 views
Bayesian Justification of Cross-validation
If I understand correctly, K-fold cross-validation is supposed to approximate expected log predictive density (ELPD), which is defined as $\mathop{\mathbb{E}}_{D_{new}\sim P(.|M_{true})}\log P(D_{new}|...
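The excerpt is cut off mid-formula; for context, one common formulation (in the spirit of the ELPD literature the question draws on, e.g. Vehtari, Gelman & Gabry) pairs the definition with its K-fold estimator; the exact form past the cutoff may differ:

```latex
\[
\mathrm{elpd} \;=\; \mathop{\mathbb{E}}_{D_{new}\sim P(\cdot\mid M_{true})}\bigl[\log P(D_{new}\mid D, M)\bigr],
\qquad
\widehat{\mathrm{elpd}}_{K\text{-fold}} \;=\; \sum_{k=1}^{K}\,\sum_{i\in\text{fold }k}\log P(y_i\mid D_{-k}, M).
\]
```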
0 votes · 0 answers · 11 views
Selecting a classification model using nested CV and a bootstrap AUC confidence interval
My goal is to find the single best model out of 55 classification models.
I first ran nested CV on all 55 models to see which generalized best; AUC was used as the evaluation metric.
...
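A minimal sketch of a percentile-bootstrap confidence interval for AUC, assuming scikit-learn; the helper bootstrap_auc_ci and its defaults are illustrative, not from the question:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC, resampling cases with replacement."""
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:  # need both classes present
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.quantile(aucs, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```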
0 votes · 0 answers · 33 views
I screwed up model selection but ended up with a very good model; am I OK?
In a recent experiment I made an oversight: I divided my data into training and test sets and conducted cross-validation for model selection and hyperparameter tuning after having applied Boruta (...
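The fix usually suggested for this kind of leakage is to move feature selection inside the cross-validation loop. A hedged sketch with scikit-learn, using SelectFromModel as a stand-in for Boruta (a BorutaPy selector would occupy the same pipeline slot):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=300, n_features=50,
                           n_informative=5, random_state=0)

# The selector is refit inside every CV fold, so the held-out fold never
# influences which features are kept (the mistake in the question).
pipe = Pipeline([
    ("select", SelectFromModel(RandomForestClassifier(random_state=0))),
    ("clf", RandomForestClassifier(random_state=0)),
])
print(cross_val_score(pipe, X, y, cv=5).mean())
```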
1 vote · 0 answers · 24 views
Finding optimal combination of covariates using cross validation
I have a logistic mixed model (lme4 package in R). I want to assess whether participants' scores on the measures 'sumspq', 'sumpdi', and 'sumcaps' significantly affect the difference in performance ...
0 votes · 0 answers · 24 views
What exactly is the right approach for estimating OOS MSE when using lasso linear regression?
This isn't a question where I have a code example to provide. It is more of an informal question about choosing between two options.
Assume I have some data and my goal is to fit a model using the ...
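A sketch of one of the two options typically debated here, assuming scikit-learn: tune the lasso penalty by CV on the training set only, then report MSE on a held-out test set that the tuning never touched.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=200, n_features=30, noise=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pick lambda by CV inside the training set only...
model = LassoCV(cv=5, random_state=0).fit(X_train, y_train)
# ...then estimate OOS MSE on data the tuning never saw.
print("OOS MSE:", mean_squared_error(y_test, model.predict(X_test)))
```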
1 vote · 0 answers · 37 views
Hyperparameter selection after nested cross-validation and making comparisons with DeLong's test
I have already read all the associated questions on the topic but couldn't find a clear answer. I initially split my data into training (80%) and hold-out testing (20%). Then, I am performing nested ...
4 votes · 0 answers · 33 views
Can cross-validation be involved in model-building rather than validation?
I have a general idea in mind that would go like this:
randomly split the data into training/testing
build a model on the training data by choosing from among candidate predictors
evaluate it on the ...
0 votes · 0 answers · 5 views
Comparing two algorithms when one is parameter-free and the other is not
I wish to compare two algorithms for subspace approximation (similar to PCA). One algorithm is parameter-free, while the other is not. I use cross-validation to set the value of this parameter, and then ...
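One fair-comparison recipe for this situation is to wrap the tunable method's parameter search inside each outer fold, so both methods are scored by the same outer CV. A sketch with scikit-learn, using off-the-shelf classifiers as stand-ins for the two subspace-approximation algorithms:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=400, random_state=0)

# Parameter-free method: scored directly.
free_scores = cross_val_score(GaussianNB(), X, y, cv=5)

# Tunable method: tuning happens inside each outer fold, so the
# parameter is chosen without ever seeing that fold's test data.
tuned = GridSearchCV(LogisticRegression(max_iter=1000),
                     {"C": [0.01, 0.1, 1, 10]}, cv=3)
tuned_scores = cross_val_score(tuned, X, y, cv=5)

print(free_scores.mean(), tuned_scores.mean())
```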
1 vote · 1 answer · 146 views
Cross-validation and model selection for an ANN
I have a neural network that I use to classify data into a number of classes; in my particular case, the classes are imbalanced, but I am trying to understand this for the general case. I am using F1 ...
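For the imbalanced case the question mentions, a common baseline is stratified folds with an F1 scorer. A minimal scikit-learn sketch; the MLP stands in for the asker's network:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier

# Imbalanced toy data; stratified folds keep class ratios stable.
X, y = make_classification(n_samples=600, weights=[0.9, 0.1], random_state=0)

scores = cross_val_score(
    MLPClassifier(max_iter=500, random_state=0),
    X, y,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="f1",
)
print("mean F1:", scores.mean())
```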
0 votes · 1 answer · 147 views
How to avoid bias/avoid overfitting when choosing a machine learning model? [closed]
My typical workflow in the past, when creating machine learning models, has been the following:
Decide on some candidate model families for the task at hand.
Divide the dataset into train and test ...
4 votes · 2 answers · 195 views
When does model selection begin to overfit?
Suppose you have a small dataset (perhaps 1000 labels), and you are using cross-validation to train different models and to choose the best one (according to their cross-validation scores).
It seems ...
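A small simulation makes the effect concrete: with about 1000 labels, the best of many candidates chosen by CV score tends to look better in CV than on fresh data. A sketch assuming scikit-learn, with k-NN models as arbitrary candidates:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_informative=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Many candidate models; the maximum of their CV scores is an
# optimistically biased estimate of the winner's true performance.
candidates = [KNeighborsClassifier(n_neighbors=k) for k in range(1, 40)]
cv_scores = [cross_val_score(m, X_tr, y_tr, cv=5).mean() for m in candidates]
best = candidates[int(np.argmax(cv_scores))]

print("best CV score: ", max(cv_scores))
print("held-out score:", best.fit(X_tr, y_tr).score(X_te, y_te))
```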
5 votes · 3 answers · 1k views
How does cross-validation work for feature selection (using stepwise regression)?
I have used MATLAB's Regression Learner app to do some stepwise regression with 10-fold cross-validation for feature selection. But now I want to code it myself, and I'm confused about the ...
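For coding it outside MATLAB, scikit-learn's SequentialFeatureSelector does CV-scored forward selection; a sketch where the toy data and feature count are illustrative, and which mirrors rather than reproduces MATLAB's stepwise routine:

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=150, n_features=20,
                       n_informative=5, random_state=0)

# Forward selection: at each step, add the feature whose inclusion gives
# the best mean 10-fold CV score.
sfs = SequentialFeatureSelector(
    LinearRegression(), direction="forward", n_features_to_select=5, cv=10
)
sfs.fit(X, y)
print("selected features:", sfs.get_support(indices=True))
```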
4 votes · 1 answer · 218 views
How to select the best performing model when using nested cross-validation?
I have some doubts about my understanding of nested cross-validation. I'm conducting research on a small dataset and would like to get the nested cross-validation design right. My dataset is $50\...
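For reference, the canonical nested design in scikit-learn terms: an inner search tunes hyperparameters, and an outer loop scores the whole procedure. The estimator and grid below are illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Inner loop: tune hyperparameters; outer loop: estimate generalization.
inner = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5)
outer_scores = cross_val_score(inner, X, y, cv=5)

# The outer scores estimate the performance of the *procedure*, not of
# any single fitted model; the final model is refit on all data afterwards.
print(outer_scores.mean(), outer_scores.std())
```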
1 vote · 0 answers · 54 views
In search of parsimony... Can/should information criteria be used as cost functions in the hyperparameter tuning of regularized models?
When tuning regularized models, two techniques appear to be especially popular at the moment:
Cross-validation performance on train & validation splits (the third, test/holdout set is not used in ...
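On the information-criterion side, a concrete instance exists in scikit-learn: LassoLarsIC tunes the lasso penalty by AIC or BIC with no validation split, which can be set against the CV route. A sketch on toy data:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, LassoLarsIC

X, y = make_regression(n_samples=100, n_features=20, noise=5, random_state=0)

# Information-criterion tuning: no validation split needed.
aic = LassoLarsIC(criterion="aic").fit(X, y)
bic = LassoLarsIC(criterion="bic").fit(X, y)
# Cross-validation tuning, for comparison.
cv = LassoCV(cv=5, random_state=0).fit(X, y)

print("alpha by AIC:", aic.alpha_, "by BIC:", bic.alpha_, "by CV:", cv.alpha_)
```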