
All Questions

1 vote
1 answer
49 views

Estimate number of covariates in Cox regression model

My question about overfitting is fairly general, but in this particular case it is all about survival models. I am working on a case-cohort study, estimating the HR in a cohort where heart attack correspond ...
Javier Hernando
1 vote
0 answers
90 views

Model calibration in overfitted models

Why, in shrinkage due to an overfitted prediction model, do we tend to overestimate risk for "high risk" subjects and underestimate risk for "low risk" subjects? Intuitively I ...
vixxovs
4 votes
2 answers
195 views

When does model selection begin to overfit?

Suppose you have a small dataset (perhaps 1000 labels), and you are using cross-validation to train different models and to choose the best one (according to their cross-validation scores). It seems ...
MWB
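The selection effect this question asks about can be demonstrated with a toy simulation (a sketch, not taken from the question — the labels and the random-guessing "models" are entirely made up):

```python
import numpy as np

rng = np.random.default_rng(0)
n_val, n_models = 1000, 200

# Labels are pure noise: no model can genuinely beat 50% accuracy.
y = rng.integers(0, 2, size=n_val)

# Each "model" just guesses at random on the same validation set.
scores = np.array([(rng.integers(0, 2, size=n_val) == y).mean()
                   for _ in range(n_models)])

best = float(scores.max())      # the score reported after model selection
typical = float(scores.mean())  # the honest expected accuracy (~0.5)
```

The more models the cross-validation scores are compared across, the further the winner's score drifts above its true value — which is the point at which the selection itself starts to overfit.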
0 votes
2 answers
98 views

How to estimate the probability that the LOOCV error of one model is better than the LOOCV error of the correct model?

Let's consider a simple regression problem in which we have only one real-valued feature and one real-valued target. We try to fit the data using a polynomial function. We also try to use the given ...
Roman
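The setup in this question is easy to reproduce: fit polynomials of several degrees and compare their LOOCV errors. A minimal sketch with synthetic data whose true model is linear (the data and the `loocv_mse` helper are illustrative assumptions, not from the question):

```python
import numpy as np

def loocv_mse(x, y, degree):
    """Leave-one-out CV error of a polynomial fit of the given degree."""
    errs = []
    for i in range(len(x)):
        mask = np.arange(len(x)) != i
        coefs = np.polyfit(x[mask], y[mask], degree)
        pred = np.polyval(coefs, x[i])
        errs.append((y[i] - pred) ** 2)
    return float(np.mean(errs))

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 30)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=x.size)  # true model: degree 1

scores = {d: loocv_mse(x, y, d) for d in range(1, 8)}
best_degree = min(scores, key=scores.get)
```

Because the LOOCV estimate is itself a noisy random variable, `best_degree` will sometimes differ from the true degree 1; the probability of that event is exactly what the question asks how to estimate.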
2 votes
1 answer
34 views

Model selection in presence of overfitting - better test or closer train

Suppose I have a tree-based model (Random Forest for the sake of the example) and I play with a regularization parameter (tree depth) to fight overfitting. Eventually I can come up with two models - ...
Dimgold
0 votes
1 answer
193 views

Which of these two models works better?

I have this time series I want to perform polynomial regression on, to estimate the trend. To start, I tried using only a second-order polynomial; these are the results (AIC = 30.37105). We can see how ...
Marco Rudelli
3 votes
1 answer
1k views

Selecting p,q,d for ARIMA and overfitting. Shouldn't the parameters be tuned on a training set?

I have seen multiple tutorials [example link] for ARIMA where they select the p,q,d parameters for it based on the whole time series. Then, after deciding on the model parameters they want to use, ...
MattSt
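The train/test concern in this question can be illustrated without a full ARIMA fit. The sketch below uses a plain least-squares AR(p) model as a stand-in and selects p on a held-out tail only — all data are simulated, and `fit_ar`/`val_mse` are hypothetical helpers, not from any tutorial:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate an AR(2) process so the "right" order is known in advance.
n = 500
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

train = y[:400]  # order selection uses only the first 400 points

def fit_ar(series, p):
    """Least-squares AR(p) fit (a simple stand-in for a full ARIMA fit)."""
    m = len(series)
    # Column k holds lag-(k+1) values aligned with targets series[p:].
    X = np.column_stack([series[p - 1 - k : m - 1 - k] for k in range(p)])
    coefs, *_ = np.linalg.lstsq(X, series[p:], rcond=None)
    return coefs

def val_mse(coefs, p):
    """One-step-ahead MSE on the held-out tail (t = 400..n-1)."""
    errs = [(y[t] - coefs @ y[t - p : t][::-1]) ** 2 for t in range(400, n)]
    return float(np.mean(errs))

scores = {p: val_mse(fit_ar(train, p), p) for p in range(1, 6)}
best_p = min(scores, key=scores.get)
```

Selecting the order on the whole series — as the tutorials the asker saw do — would let the test portion influence the choice of p, which is the leakage the question worries about.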
20 votes
4 answers
4k views

Why does the Akaike Information Criterion (AIC) sometimes favor an overfitted model?

As an exercise to develop practical experience working with model selection criteria, I computed fits of the highway mpg vs. engine displacement data from the tidyverse mpg example data set using ...
stachyra
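One way to see the effect the asker describes is to compute Gaussian AIC, n·log(RSS/n) + 2k, for polynomial fits of increasing degree — here on synthetic linear data rather than the tidyverse mpg data used in the question:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20
x = np.linspace(0, 1, n)
y = 2 * x + rng.normal(scale=0.5, size=n)  # true relationship is linear

def aic(degree):
    """Gaussian AIC up to a shared constant; k counts fitted coefficients."""
    coefs = np.polyfit(x, y, degree)
    rss = np.sum((y - np.polyval(coefs, x)) ** 2)
    k = degree + 1
    return float(n * np.log(rss / n) + 2 * k)

aics = {d: aic(d) for d in range(1, 7)}
best = min(aics, key=aics.get)
```

AIC's 2k penalty is mild, so with small n and noisy data the minimizer can land on a degree above 1 even though the data were generated from a straight line — the occasional preference for an overfitted model that the question observes.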
3 votes
2 answers
345 views

What is accepted practice for avoiding optimistic bias when selecting a model family after hyperparameter tuning?

This is an extension of a previous question: How to avoid overfitting bias when both hyperparameter tuning and model selecting? ...which provided some options for the question at hand, but now I would ...
Josh
10 votes
2 answers
4k views

How to avoid overfitting bias when both hyperparameter tuning and model selecting?

Say I have 4 or more algorithm types (logistic, random forest, neural net, svm, etc) each of which I want to try out on my dataset, and each of which I need to tune hyperparameters on. I would ...
Josh
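The standard answer to this question is nested cross-validation: tune each family with an inner search, then score the whole tune-and-fit procedure with an outer loop so the reported score is not contaminated by the tuning. A minimal scikit-learn sketch (two families instead of four, synthetic data, and arbitrary parameter grids chosen only for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score

X, y = make_classification(n_samples=300, random_state=0)

# Each family wraps its own inner grid search; the outer CV then scores
# the complete "tune, then refit" procedure for that family.
candidates = {
    "logistic": GridSearchCV(LogisticRegression(max_iter=1000),
                             {"C": [0.1, 1.0, 10.0]}, cv=3),
    "forest": GridSearchCV(RandomForestClassifier(random_state=0),
                           {"max_depth": [2, 5, None]}, cv=3),
}

outer_scores = {name: cross_val_score(est, X, y, cv=5).mean()
                for name, est in candidates.items()}
best_family = max(outer_scores, key=outer_scores.get)
```

Note that picking the winning family by its outer score still introduces a small selection bias into that score — the point pursued in the follow-up question above.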
5 votes
3 answers
722 views

How to choose between an overfit model and a non-overfit model?

I often encounter this situation in modeling. Suppose I build two classification models. Below is their performance: Model 1: training accuracy: 0.80, test accuracy: 0.50 Model 2: training accuracy: 0....
etang
1 vote
0 answers
279 views

How can one use Grid Search without overfitting the model?

I checked several questions, like Overfitting during model selection - AutoML vs Grid search and Hyperparameter tuning using grid search/randomised search, but I don't think any of them answer my ...
dmmmmd
2 votes
1 answer
247 views

Can I still use an overfitted model with high test accuracy?

Below is the training-statistics output from training a Keras/TF model. You can see val_accuracy peaks at Epoch 4 with 0.6633. After that, training accuracy continues to go up but val_accuracy becomes ...
etang
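Whatever one decides about using the overfitted final model, the usual practical move is to keep the checkpoint from the best validation epoch rather than the last one. A sketch in plain Python — the per-epoch numbers below are invented, loosely echoing the 0.6633 peak mentioned in the question:

```python
# Validation accuracy per epoch, as recorded in a Keras History object
# (these values are made up for illustration).
val_accuracy = [0.55, 0.60, 0.64, 0.6633, 0.62, 0.58, 0.57]

# Restore the checkpoint from the best validation epoch: later epochs
# only improve training accuracy, i.e. they overfit.
best_epoch = max(range(len(val_accuracy)), key=val_accuracy.__getitem__)
print(best_epoch + 1, val_accuracy[best_epoch])  # prints: 4 0.6633
```

In Keras this is what the EarlyStopping and ModelCheckpoint callbacks automate when monitoring val_accuracy.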
1 vote
2 answers
260 views

When is it okay to make changes to your model after validating?

Let’s say I’m building a model to predict cancer relapse for a scientific paper. I use my training set to build many models and validate the best one on my test set to get an AUC of 0.65. I then go ...
Daniel Freeman
3 votes
1 answer
95 views

Overfitting through model selection

I'm asking this question as I found little explanation of this phenomenon elsewhere. I am wondering how best to deal with overfitting that comes from the model selection itself. Say I want to ...
Adrian Constantin Penz
