All Questions

0 votes
0 answers
11 views

Select classification model using nested CV and bootstrap AUC confidence interval

My goal is to find the single best model among 55 classification models. I first ran nested CV on all 55 models to see which generalized best. The AUC score was used as the evaluation metric. ...
JAE
  • 89
1 vote
1 answer
146 views

Cross-validation and model selection of ANN

I have a neural network that I use to classify data into a number of classes; in my particular case, the classes are imbalanced, but I am trying to understand this for the general case. I am using F1 ...
nico
  • 4,601
1 vote
0 answers
25 views

What statistical test to use when comparing classifier performance on the original dataset and on a new dataset with the original variables plus one

I would like to compare the classifier performance when there is a newly added variable to the dataset. Say the original prediction was with 10 input variables. The new one is with 11 inputs. My ...
mezbaha
  • 11
0 votes
0 answers
25 views

Outputs of model evaluation function

I am working on a 3-class ML problem and I am in the phase where I have to write an "evaluation" function for my training dataset (i.e. a function that performs a cross-validation method to ...
vggls
  • 113
0 votes
1 answer
29 views

Dimension selection based on test accuracy

Consider that I have a dataset with train, validation and test sets, and I want to train a PCA + logistic regression classifier pipeline. So far, for a specific k (that is the reduced data dimension ...
Leo Dust
1 vote
0 answers
202 views

Split one model into two models for a classification problem

In the classification problem I am working on right now, I have to classify transactions mainly with text data. The classes of the training set can easily be divided into class sets with a negative ...
Jul'i
  • 11
5 votes
2 answers
1k views

MultiClass Classification - Training OvO and OvA

I would like to know how OvO (One vs One) and OvA (One vs All) models are trained in a multiclass classification problem. To keep it simple, we have 4 classes, each of which has 1000 datapoints. What are the ...
Habib Karbasian
0 votes
2 answers
298 views

Comparing classifier performance when using slightly different datasets

Let's say I'm trying to predict whether tomorrow's temperature is higher than today's based on historical data (2 time series A and B). I've chosen XGBoost for the task. For model selection (...
J.Dow
  • 11
2 votes
0 answers
73 views

Error metric to compare ratios derived from a binary prediction task

I'm working on a research problem where a binary classification task ultimately produces a ratio downstream. I would like to understand the best way to quantitatively compare the resultant ratio to ...
Andrew Brown
3 votes
2 answers
400 views

Models with low variance but high bias

If we have a classification/regression problem, when would we generally prefer to use families of models with high bias and low variance like multiple regression (logistic regression for ...
Math_cat
1 vote
1 answer
180 views

Knowing which predictors are significant in a logistic regression model

I am trying to make a logistic regression model based on 5 predictors: 2 of these are categorical and 3 are numerical. The output is simply 1 or 0, and upon performing the Matlab function glmfit(x,y,'...
cgo
  • 9,217
1 vote
0 answers
73 views

Which ML model for selecting one candidate among many

I have an entity matching task and I am struggling to decide which ML model I should use for it. Let me break it down. I have a complex search query, and for each query I can have from 0 to ~50 ...
ivanibash
  • 175
1 vote
1 answer
35 views

What are the important factors for feature selection in classification problems? [closed]

While doing classification, I have to choose from an ocean of choices at every step, such as model selection and performance criteria selection. But the two important things I get confused most of ...
Deshwal
  • 244
0 votes
0 answers
47 views

Getting the distribution of a model accuracy metric using the Posterior of the Model Parameters

Say I compute the posterior via MCMC of a classification model's hyper-parameters $\theta$ given my observed data: $\pi(\theta|D)$. Would it be at all useful to take a look at, for example, the ...
tuner
  • 1
1 vote
2 answers
162 views

Multiple hypothesis tests for feature selection (classification)

I am wondering whether to run multiple hypothesis tests (t-test / Mann-Whitney) as a first step in a classification problem. Specifically: given a data set with k features (k=3 in the example below),...
Arnold Klein
