Frequent 'classification' Questions - Data Science Stack Exchange

265 votes

10 answers

435k views

How to set class weights for imbalanced classes in Keras?

I know that there is a possibility in Keras with the class_weights parameter dictionary at fitting, but I couldn't find any example. Would somebody so kind to ...

Hendrik

8,677

asked Aug 17, 2016 at 9:35

42 votes

6 answers

54k views

Unbalanced multiclass data with XGBoost

I have 3 classes with this distribution: Class 0: 0.1169, Class 1: 0.7668, Class 2: 0.1163. And I am using xgboost for ...

shda

585

asked Jan 16, 2017 at 12:53

35 votes

4 answers

16k views

Quick guide into training highly imbalanced data sets

I have a classification problem with approximately 1000 positive and 10000 negative samples in the training set. So this data set is quite unbalanced. Plain random forest is just trying to mark all test ...

IgorS

5,474

asked Sep 12, 2014 at 15:20

31 votes

1 answer

33k views

How is a splitting point chosen for continuous variables in decision trees?

I have two questions related to decision trees: If we have a continuous attribute, how do we choose the splitting value? Example: Age=(20,29,50,40....) Imagine that we have a continuous attribute $f$...

WALID BELRHALMIA

421

asked Nov 3, 2017 at 21:45

15 votes

2 answers

711 views

Why does data science see class imbalance as a problem for supervised learning when statistics does not?

Why does data science see class imbalance as a problem in supervised learning when statistics says it is not? Data science seems to see class imbalance as problematic and needing special techniques ...

Dave

3,979

asked Jan 9 at 18:34

2 votes

1 answer

741 views

Can a decision in a node of a decision tree be based on comparison between 2 columns of the dataset?

Assume the features in the dataframe are columns A, B, C and my target is Y. Can my decision tree have a decision node which looks for, say, ...

Jerry

43

asked Sep 26, 2019 at 17:55

12 votes

1 answer

5k views

Using a pre trained CNN classifier and apply it on a different image dataset

How would you optimize a pre-trained neural network to apply it to a separate problem? Would you just add more layers to the pre-trained model and test it on your ...

Sid

677

asked Feb 27, 2018 at 23:10

10 votes

1 answer

4k views

Can The linearly non-separable data be learned using polynomial features with logistic regression?

I know that Polynomial Logistic Regression can easily learn typical data like that in the following image: I was wondering whether the following two datasets can also be ...

Green Falcon

14.1k

asked Aug 2, 2017 at 10:47

4 votes

2 answers

6k views

Imbalanced Dataset: Train/test split before and after SMOTE

This question is similar but different from my previous one. I have a binary classification task related to customer churn for a bank. The dataset contains 10,000 instances and 11 features. The target ...

KK_o7

67

asked Nov 24, 2021 at 9:06

16 votes

2 answers

37k views

How to calculate VC-dimension?

I'm studying machine learning, and I would like to know how to calculate VC-dimension. For example: $h(x)=\begin{cases} 1 &\mbox{if } a\leq x \leq b \\ 0 & \mbox{else } \end{cases} $, with ...

铭声孙

173

asked Jan 6, 2017 at 10:23

7 votes

2 answers

2k views

Doesn't over(/under)sampling an imbalanced dataset cause issues?

I'm reading a lot about how to use different metrics specifically for imbalanced datasets (e.g. two classes present, but 80% of the data is one class) and how to tackle the issue of imbalanced ...

lte__

1,350

asked Apr 29, 2021 at 13:59

3 votes

1 answer

132 views

Class imbalance strategies

When dealing with the class imbalance problem in a binary classifier, there are three ways I know of to address it: over-sampling, under-sampling and using cost-sensitive methods. Are there any ...

David Masip

6,101

asked Jun 15, 2018 at 7:29

3 votes

4 answers

808 views

What is the difference between classification and regression?

I understand classification....a discrete response or category, like animal is dog or cat. The author says..."Regression techniques predict continuous changes such as the change in temperature, power ...

Martin Muldoon

159

asked Nov 27, 2018 at 13:00

3 votes

1 answer

589 views

When should I oversample data?

I am dealing with multi-class classifiers. My data is unbalanced. Hence, I need to apply sampling techniques before training (undersampling or oversampling). When I apply undersampling, ...

Kyv

151

asked Sep 7, 2021 at 18:51

2 votes

2 answers

2k views

Explain Binary Classification with output 0.5 (True)

What is the interpretation of output 0.5 of a typical classifier? I made a prediction and the probability of that data point being from the True class is 0.5.

Abhishek Sharma

339

asked Dec 17, 2017 at 13:49
