
All Questions

0 votes
0 answers
14 views

How to split data when training and tuning the meta learner in stacking?

I have a simple yet tricky conceptual question about data splitting in a meta-learning process. Assume I have a simple X_train, ...
Yann • 43
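A minimal sketch of the standard answer to this question, assuming scikit-learn: the meta learner is trained on out-of-fold predictions from the base learners, so no row's meta-feature comes from a model that saw that row. The dataset and learners below are illustrative, not from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_learners = [RandomForestClassifier(random_state=0),
                 SVC(probability=True, random_state=0)]

# Out-of-fold predictions: each training row's meta-feature comes from a
# model fit on the other folds, so nothing leaks into the meta learner.
meta_features = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_learners
])
meta_learner = LogisticRegression().fit(meta_features, y_train)

# At test time, base learners are refit on the full training set.
test_meta = np.column_stack([
    m.fit(X_train, y_train).predict_proba(X_test)[:, 1] for m in base_learners
])
print(meta_learner.score(test_meta, y_test))
```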
10 votes
1 answer
400 views

Bayesian Justification of Cross-validation

If I understand correctly, K-fold cross-validation is supposed to approximate expected log predictive density (ELPD), which is defined as $\mathop{\mathbb{E}}_{D_{new}\sim P(.|M_{true})}\log P(D_{new}|...
Feri • 197
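For reference, one common way to write out the ELPD the excerpt refers to, together with its K-fold CV estimate (normalization conventions vary across sources):

```latex
% ELPD of a model M under the true data-generating process, and the
% K-fold CV estimate of it:
\[
\mathrm{elpd}(M) \;=\; \mathbb{E}_{D_{new}\sim P(\,\cdot\,\mid M_{true})}
\log P(D_{new}\mid M),
\qquad
\widehat{\mathrm{elpd}}_{\mathrm{CV}} \;=\; \sum_{k=1}^{K}
\sum_{i \in \text{fold } k} \log P(y_i \mid D_{-k}, M),
\]
% where D_{-k} denotes the data with fold k held out.
```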
0 votes
0 answers
11 views

Select classification model using nested cv and bootstrap auc confidence interval

My goal is to find the single best model out of 55 classification models. I first ran nested CV on all 55 models to see which generalized best, using AUC as the evaluation metric. ...
JAE • 89
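A minimal sketch of how the two pieces of this question fit together for one model, assuming scikit-learn: an inner GridSearchCV tunes hyperparameters, the outer folds produce out-of-fold predictions, and a case-resampling bootstrap of those predictions gives a percentile CI for AUC. The estimator and grid are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_predict, StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=300, random_state=0)

# Inner loop tunes hyperparameters; the outer loop's held-out predictions
# estimate generalization and feed the bootstrap CI.
inner = GridSearchCV(LogisticRegression(max_iter=1000),
                     {"C": [0.01, 0.1, 1, 10]}, cv=3, scoring="roc_auc")
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
oof_proba = cross_val_predict(inner, X, y, cv=outer, method="predict_proba")[:, 1]

rng = np.random.default_rng(0)
boot_aucs = []
for _ in range(1000):
    idx = rng.integers(0, len(y), len(y))  # resample cases with replacement
    if len(np.unique(y[idx])) == 2:        # AUC needs both classes present
        boot_aucs.append(roc_auc_score(y[idx], oof_proba[idx]))
print(np.percentile(boot_aucs, [2.5, 97.5]))  # 95% percentile CI
```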
0 votes
0 answers
33 views

I screwed up model selection but ended up with a very good model; am I OK?

In a recent experiment, I made an oversight: I divided my data into training and testing sets and conducted cross-validation for model selection and hyperparameter tuning after having applied Boruta (...
Alek Fröhlich
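For contrast with the oversight described above, a minimal sketch of the leakage-free ordering, assuming scikit-learn and with SelectFromModel standing in for Boruta: the selector sits inside the pipeline, so every CV training fold re-runs feature selection before its held-out fold is scored.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectFromModel
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=50,
                           n_informative=5, random_state=0)

# Because the selector is a pipeline step, it is refit on each CV
# training fold; the held-out fold never influences feature selection.
pipe = Pipeline([
    ("select", SelectFromModel(RandomForestClassifier(n_estimators=100,
                                                      random_state=0))),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
print(cross_val_score(pipe, X, y, cv=5).mean())
```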
1 vote
0 answers
24 views

Finding optimal combination of covariates using cross validation

I have a logistic mixed model (lme4 package in R). I want to assess whether participants' scores on the measures 'sumspq', 'sumpdi', and 'sumcaps' significantly affect the difference in performance ...
SilvaC • 512
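A minimal sketch of exhaustive covariate-subset search scored by CV, using plain logistic regression in scikit-learn as a stand-in (it ignores the random effects an lme4 mixed model would include). The variable names come from the question; the data here are simulated.

```python
from itertools import combinations
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
X = {name: rng.normal(size=n) for name in ["sumspq", "sumpdi", "sumcaps"]}
y = (X["sumspq"] + 0.5 * X["sumpdi"] + rng.normal(size=n) > 0).astype(int)

# Score every non-empty combination of covariates by cross-validated
# log loss and keep the best-scoring subset.
best = None
for k in range(1, 4):
    for combo in combinations(X, k):
        Xc = np.column_stack([X[c] for c in combo])
        score = cross_val_score(LogisticRegression(), Xc, y, cv=5,
                                scoring="neg_log_loss").mean()
        if best is None or score > best[0]:
            best = (score, combo)
print(best)
```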
0 votes
0 answers
24 views

What exactly is the right approach for finding OOS MSE when using lasso linear regression?

This isn't a question where I have a code example to provide. It is more of an informal question about choosing between two options. Assume I have some data and my goal is to fit a model using the ...
Donk • 1
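A minimal sketch of the option usually recommended for this, assuming scikit-learn: tune the lasso penalty by CV on the training set only, then report MSE on an untouched test set.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LassoCV
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=200, n_features=30, noise=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The penalty is chosen by CV within the training set; the test set is
# touched exactly once, giving an honest out-of-sample MSE.
model = LassoCV(cv=5, random_state=0).fit(X_train, y_train)
print(mean_squared_error(y_test, model.predict(X_test)))
```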
1 vote
0 answers
37 views

Hyperparameter selection after nested cross-validation and making comparisons with DeLong's test

I have already read all the associated questions on the topic but couldn't find a clear answer. I initially split my data into training (80%) and hold-out testing (20%). Then, I am performing nested ...
user22409235
4 votes
0 answers
33 views

Can cross-validation be involved in model-building rather than validation?

I have a general idea in mind that would go like this: (1) randomly split the data into training/testing; (2) build a model on the training data by choosing from among candidate predictors; (3) evaluate it on the ...
Dave • 2,651
0 votes
0 answers
5 views

Comparing two algorithms, one is parameter free while the other is not

I wish to compare two algorithms for subspace approximation (similar to PCA). One algorithm is parameter-free, while the other is not. I use cross-validation to set a value for this parameter, and then ...
Roy • 839
1 vote
1 answer
146 views

Cross-validation and model selection of ANN

I have a neural network that I use to classify data into a number of classes; in my particular case, the classes are imbalanced, but I am trying to understand this for the general case. I am using F1 ...
nico • 4,601
0 votes
1 answer
147 views

How to avoid bias/avoid overfitting when choosing a machine learning model? [closed]

My typical workflow in the past, when creating machine learning models, has been to do the following: (1) decide on some candidate model families for the task at hand; (2) divide the dataset into train and test ...
4 votes
2 answers
195 views

When does model selection begin to overfit?

Suppose you have a small dataset (perhaps 1000 labels), and you are using cross-validation to train different models and to choose the best one (according to their cross-validation scores). It seems ...
MWB • 1,337
5 votes
3 answers
1k views

How does cross-validation work for feature selection (using stepwise regression)?

I have used the MATLAB Regression Learner app to do stepwise regression with 10-fold cross-validation for feature selection. But now I want to code it myself and I'm confused about the ...
Azarang • 59
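A minimal sketch in Python rather than MATLAB, with scikit-learn's SequentialFeatureSelector standing in for stepwise regression: the selector's internal 10-fold CV picks the features, and an outer CV scores the whole select-then-fit procedure, so the selection step itself is validated.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=20,
                       n_informative=5, random_state=0)

# Forward selection uses its own 10-fold CV to add features one at a
# time; the outer 5-fold CV evaluates the complete pipeline.
selector = SequentialFeatureSelector(LinearRegression(),
                                     direction="forward", cv=10)
pipe = Pipeline([("select", selector), ("reg", LinearRegression())])
print(cross_val_score(pipe, X, y, cv=5).mean())
```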
4 votes
1 answer
218 views

How to select the best performing model when using nested cross-validation?

I am having some doubts about my understanding of nested cross-validation. I'm conducting research with a small dataset and would like to get the nested cross-validation design right. My dataset is $50\...
masto12 • 41
1 vote
0 answers
54 views

In search of parsimony... Can/should information criteria be used as cost functions in the hyperparameter tuning of regularized models?

When tuning regularized models, two techniques appear to be especially popular at the moment: Cross-validation performance on train & validation splits (the third, test/holdout set is not used in ...
FiddleBat
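On this question's premise, scikit-learn does ship one estimator that tunes a regularization penalty by an information criterion instead of CV; a minimal sketch:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLarsIC

X, y = make_regression(n_samples=200, n_features=30, noise=5, random_state=0)

# LassoLarsIC picks the lasso penalty by minimizing an information
# criterion along the LARS path; no validation split is needed.
for criterion in ("aic", "bic"):
    model = LassoLarsIC(criterion=criterion).fit(X, y)
    print(criterion, model.alpha_)
```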
