
Questions tagged [cross-validation]

Refers to general procedures that attempt to determine the generalizability of a statistical result. Cross-validation arises frequently in the context of assessing how well a particular model fit predicts future observations. Methods for cross-validation usually involve withholding a random subset of the data during model fitting, quantifying how accurately the withheld data are predicted, and repeating this process to get a measure of prediction accuracy.

0 votes · 1 answer · 20 views

Sklearn EstimatorCV vs GridSearchCV

sklearn has the following description for EstimatorCV estimators (https://scikit-learn.org/stable/glossary.html#term-cross-validation-estimator): an estimator that has built-in cross-validation ...
asked by wannabedatascientist
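
For readers comparing the two APIs, here is a minimal sketch contrasting LogisticRegressionCV, an EstimatorCV that tunes its own regularization strength while fitting, with an equivalent explicit GridSearchCV. The synthetic dataset and the grid of C values are illustrative assumptions, not taken from the question.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
    from sklearn.model_selection import GridSearchCV

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    # EstimatorCV variant: cross-validates its own regularization strength
    # internally, then refits on the full data with the selected C.
    clf_cv = LogisticRegressionCV(Cs=10, cv=5, max_iter=1000).fit(X, y)
    print("LogisticRegressionCV chose C =", clf_cv.C_)

    # Equivalent explicit search: works for any estimator and parameter,
    # but fits a separate model per candidate and per fold.
    grid = GridSearchCV(
        LogisticRegression(max_iter=1000),
        param_grid={"C": np.logspace(-4, 4, 10)},
        cv=5,
    )
    grid.fit(X, y)
    print("GridSearchCV chose C =", grid.best_params_["C"])
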
0 votes · 0 answers · 23 views

How to choose thresholds to discretize target for binary classification

My group is using logistic regression to investigate the most predictive features in a dataset. Our target variable is actually a continuous variable that we discretized using two cutoff thresholds (...
asked by OstensiblyPutative
0 votes · 0 answers · 15 views

How to Combine Cross-Validation Error and Ensemble Prediction Variance in Machine Learning?

I am working on a machine learning project where I use an ensemble model (Random Forest) and I want to accurately represent the prediction uncertainty. Specifically, I want to combine the cross-...
asked by x H
0 votes · 1 answer · 15 views

Averaging model performance across n-fold cross validation: MSE or R^2?

I'm comparing the performance of several models on the same data using cross-validation (holding out 1/n of the data as a test set, fitting the model on the remaining data, testing on the test set). I ...
asked by Leo Selker
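
One way to frame the question is to collect both metrics per fold and average each; a minimal sketch, assuming synthetic regression data and a Ridge model rather than the asker's models:

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_validate

    X, y = make_regression(n_samples=300, n_features=10, noise=10.0, random_state=0)

    # Per-fold MSE and R^2 in one pass; averaging either across folds is then explicit.
    scores = cross_validate(
        Ridge(),
        X, y,
        cv=5,
        scoring={"mse": "neg_mean_squared_error", "r2": "r2"},
    )
    print("mean fold MSE:", -scores["test_mse"].mean())
    print("mean fold R^2:", scores["test_r2"].mean())
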
2 votes · 1 answer · 29 views

Does it make sense that the performance of XGBoost varies dramatically between two machines with all hyperparameters held fixed?

I am hyperparameter tuning an XGBoost model and I am finding that, depending on whether I train the model locally on my machine or on AWS SageMaker, I get quite different results. Running cross-validation ...
asked by Luca Guarro
0 votes · 1 answer · 82 views

Test error is much higher than training error after grid search and cross-validation

I'm currently working on a machine learning project. It's a supervised learning problem. My goal is to predict, from given data about an animal (keeping, size, weight, ...), its ingredients (energy, vitamins, etc.). ...
asked by Marco Cotrotzo
1 vote · 1 answer · 19 views

Scoring function in cross-validation often left at the default

I'm a PhD student applying ML in microbiology. In research papers, the usual performance measure reported on classification models is ROC-AUC. But when I look at implementations, the scoring function ...
asked by alepfu
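
The default .score of a scikit-learn classifier is accuracy, so ROC-AUC has to be requested explicitly; a minimal sketch on an assumed synthetic, imbalanced dataset:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)

    # Default scoring for a classifier is accuracy.
    acc = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
    # ROC-AUC must be asked for via the scoring argument.
    auc = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5,
                          scoring="roc_auc")
    print("accuracy:", acc.mean(), "ROC-AUC:", auc.mean())
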
0 votes · 1 answer · 34 views

How do I identify overfitting when using GridSearchCV?

For context, I'm using scikit-learn's GridSearchCV to find the best hyperparameters of a decision tree. I believe I understand train, validation, and test sets and overfitting concepts when applied ...
asked by Lisana Daniel
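
One common diagnostic is to compare per-candidate training and validation scores; a minimal sketch, assuming a synthetic dataset and an illustrative max_depth grid:

    import pandas as pd
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    # return_train_score=True exposes training scores in cv_results_, so a large gap
    # between mean_train_score and mean_test_score flags candidates that overfit.
    grid = GridSearchCV(
        DecisionTreeClassifier(random_state=0),
        param_grid={"max_depth": [2, 4, 8, None]},
        cv=5,
        return_train_score=True,
    )
    grid.fit(X, y)

    results = pd.DataFrame(grid.cv_results_)
    print(results[["param_max_depth", "mean_train_score", "mean_test_score"]])
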
0 votes · 0 answers · 18 views

How to use cross-validation to select/evaluate a model with a probability score as the output?

Initially I was evaluating my models using cross_val with out-of-the-box metrics such as precision, recall, F1 score, etc., or with my own metrics defined in ...
asked by szheng
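
Built-in proper scoring rules for probabilistic outputs can be passed to cross-validation directly; a minimal sketch, assuming a synthetic dataset and a logistic regression stand-in for the asker's models:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_validate

    X, y = make_classification(n_samples=500, random_state=0)

    # Log loss and the Brier score evaluate predicted probabilities rather than labels.
    scores = cross_validate(
        LogisticRegression(max_iter=1000),
        X, y,
        cv=5,
        scoring={"log_loss": "neg_log_loss", "brier": "neg_brier_score"},
    )
    print("mean log loss:", -scores["test_log_loss"].mean())
    print("mean Brier score:", -scores["test_brier"].mean())
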
0 votes · 0 answers · 14 views

Correct cross-validation implementation (regression)

I am very new to machine learning and I am starting to work my way up. I have made an implementation of cross-validation which will be used with ensemble models later. I have made a pipeline in ...
asked by Guhan
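
One common layout is to wrap the preprocessing and the model in a single Pipeline and cross-validate the whole pipeline; a minimal sketch, assuming synthetic regression data and an illustrative scaler-plus-boosting pipeline rather than the asker's ensemble:

    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_regression(n_samples=300, n_features=10, noise=5.0, random_state=0)

    # Preprocessing inside the pipeline is refit on each fold's training part only.
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("model", GradientBoostingRegressor(random_state=0)),
    ])
    cv = KFold(n_splits=5, shuffle=True, random_state=0)
    rmse = -cross_val_score(pipe, X, y, cv=cv, scoring="neg_root_mean_squared_error")
    print("per-fold RMSE:", rmse)
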
1 vote · 1 answer · 14 views

Model evaluation approach allowing manual experimentation without data leakage

In supervised machine learning, are there any evaluation approaches besides using a fixed holdout test dataset which allow me, as a scientist, to manually compare preprocessing approaches without ...
asked by thomas8wp
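
One answer-style sketch, under the assumption that the preprocessing can be expressed as Pipeline steps: compare variants by cross-validation on a development split only, and touch the held-out test set once at the very end:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_classification(n_samples=600, random_state=0)
    X_dev, X_test, y_dev, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    # Manual experimentation happens on the development split via CV,
    # with preprocessing inside the pipeline so each fold fits it on its own training part.
    for name, pipe in [
        ("no scaling", make_pipeline(LogisticRegression(max_iter=1000))),
        ("with scaling", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ]:
        print(name, cross_val_score(pipe, X_dev, y_dev, cv=5).mean())

    # X_test, y_test are evaluated only once, after the comparison is settled.
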
0 votes · 1 answer · 21 views

Cross validation

I do not get why, in "For cross validation should I use training set, or whole dataset?", the responses say that cross-validation must be done exclusively on the training set. Don't the methods (for example ...
asked by Curious student
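
A minimal sketch of the usual recommendation, on an assumed synthetic dataset: split off the test set first, cross-validate only on the training portion, and use the test set for a single final check:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score, train_test_split

    X, y = make_classification(n_samples=500, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    model = LogisticRegression(max_iter=1000)
    # Cross-validation sees only the training split.
    print("CV accuracy (training set only):", cross_val_score(model, X_train, y_train, cv=5).mean())

    # The untouched test set gives one final, unbiased estimate.
    model.fit(X_train, y_train)
    print("final test accuracy:", model.score(X_test, y_test))
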
1 vote · 0 answers · 7 views

Is GroupKFold needed if some samples have some of their feature values equal?

I am given a dataset $D$ of 10k enzyme-substrate complexes having a lock-key relationship, with each sample (complex) being characterized by enzyme features $x_e$ and substrate features $x_s$. That is,...
asked by ado sar
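
If each complex carries an enzyme identifier, GroupKFold can keep all complexes of the same enzyme in one fold; a minimal sketch with randomly generated stand-in features, targets, and group ids, none of which come from the dataset $D$:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GroupKFold, cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 8))           # stand-in for concatenated (x_e, x_s) features
    y = rng.normal(size=200)                # stand-in target
    groups = rng.integers(0, 40, size=200)  # hypothetical enzyme id per complex

    # No enzyme appears in both the training and validation part of any split.
    cv = GroupKFold(n_splits=5)
    scores = cross_val_score(RandomForestRegressor(random_state=0), X, y,
                             groups=groups, cv=cv)
    print("grouped CV R^2 per fold:", scores)
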
0 votes · 0 answers · 13 views

How does hyperparameter tuning work for constructing/choosing a final model using nested cross-validation?

I want to determine if XGBoost is better than random forest or logistic regression for building a binary classification model. The model will be a composite model, with a feature selection model to ...
asked by reuben george
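
A minimal sketch of nested cross-validation, assuming a synthetic dataset and an illustrative random-forest grid rather than the asker's composite model: the inner GridSearchCV tunes, the outer loop scores the whole tuning procedure, and the final model is the tuned search refit on all the data:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV, cross_val_score

    X, y = make_classification(n_samples=400, n_features=20, random_state=0)

    # Inner loop: hyperparameter search.
    inner_search = GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid={"n_estimators": [100, 300], "max_depth": [3, None]},
        cv=3,
    )
    # Outer loop: unbiased estimate of the tuned procedure's performance.
    outer_scores = cross_val_score(inner_search, X, y, cv=5)
    print("nested CV accuracy:", outer_scores.mean())

    # Final model: run the tuning once more on all the data and keep the result.
    final_model = inner_search.fit(X, y)
    print("chosen hyperparameters:", final_model.best_params_)
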
0 votes · 0 answers · 23 views

If I do cross-validation, do I need to refit the model?

I am building a dual process. I have an initial dataset on which I train (fit) a model, then I do cross-validation to get results. So far everything is normal, but in addition to that, I create a new ...
asked by Curious student
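
A minimal sketch, on an assumed synthetic dataset, of the usual pattern: cross-validation only estimates performance on throwaway copies of the model, and the model that is actually kept is refit once afterwards:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_validate

    X, y = make_classification(n_samples=500, random_state=0)

    model = LogisticRegression(max_iter=1000)
    # Fits five throwaway copies and returns their scores; `model` itself stays unfitted.
    cv_results = cross_validate(model, X, y, cv=5)
    print("estimated accuracy:", cv_results["test_score"].mean())

    # The model to keep is refit on the full training data.
    model.fit(X, y)
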
