Questions tagged [classification]

An instance of supervised learning that identifies the category or categories to which a new instance of a dataset belongs.

265 votes
10 answers
435k views

How to set class weights for imbalanced classes in Keras?

I know that Keras accepts a class_weight dictionary parameter at fitting, but I couldn't find any example. Would somebody be so kind to ...
Hendrik's user avatar
  • 8,677
42 votes
6 answers
54k views

Unbalanced multiclass data with XGBoost

I have 3 classes with this distribution: Class 0: 0.1169, Class 1: 0.7668, Class 2: 0.1163. And I am using XGBoost for ...
shda's user avatar
  • 585
35 votes
4 answers
16k views

Quick guide into training highly imbalanced data sets

I have a classification problem with approximately 1000 positive and 10000 negative samples in the training set, so this data set is quite unbalanced. A plain random forest is just trying to mark all test ...
IgorS's user avatar
  • 5,474
31 votes
1 answer
33k views

How is a splitting point chosen for continuous variables in decision trees?

I have two questions related to decision trees: If we have a continuous attribute, how do we choose the splitting value? Example: Age=(20,29,50,40....) Imagine that we have a continuous attribute $f$...
WALID BELRHALMIA's user avatar
15 votes
2 answers
711 views

Why does data science see class imbalance as a problem for supervised learning when statistics does not?

Why does data science see class imbalance as a problem in supervised learning when statistics says it is not? Data science seems to see class imbalance as problematic and needing special techniques ...
Dave's user avatar
  • 3,979
2 votes
1 answer
741 views

Can a decision in a node of a decision tree be based on comparison between 2 columns of the dataset?

Assume the features in the dataframe are columns A, B, C and my target is Y. Can my decision tree have a decision node which looks for, say, ...
Jerry's user avatar
  • 43
12 votes
1 answer
5k views

Using a pre-trained CNN classifier and applying it to a different image dataset

How would you optimize a pre-trained neural network to apply it to a separate problem? Would you just add more layers to the pre-trained model and test it on your ...
Sid's user avatar
  • 677
10 votes
1 answer
4k views

Can linearly non-separable data be learned using polynomial features with logistic regression?

I know that polynomial logistic regression can easily learn typical data like that in the following image: I was wondering whether the following two datasets can also be ...
Green Falcon's user avatar
  • 14.1k
4 votes
2 answers
6k views

Imbalanced Dataset: Train/test split before and after SMOTE

This question is similar to, but different from, my previous one. I have a binary classification task related to customer churn for a bank. The dataset contains 10,000 instances and 11 features. The target ...
KK_o7's user avatar
  • 67
16 votes
2 answers
37k views

How to calculate VC-dimension?

I'm studying machine learning, and I would like to know how to calculate VC-dimension. For example: $h(x)=\begin{cases} 1 & \mbox{if } a\leq x \leq b \\ 0 & \mbox{else} \end{cases}$, with ...
铭声孙's user avatar
  • 173
7 votes
2 answers
2k views

Doesn't over(/under)sampling an imbalanced dataset cause issues?

I'm reading a lot about how to use different metrics specifically for imbalanced datasets (e.g. two classes present, but 80% of the data is one class) and how to tackle the issue of imbalanced ...
lte__'s user avatar
  • 1,350
3 votes
1 answer
132 views

Class imbalance strategies

When dealing with the class imbalance problem in a binary classifier, there are three ways I know of to address it: over-sampling, under-sampling and using cost-sensitive methods. Are there any ...
David Masip's user avatar
  • 6,101
3 votes
4 answers
808 views

What is the difference between classification and regression?

I understand classification: a discrete response or category, like whether an animal is a dog or a cat. The author says: "Regression techniques predict continuous changes such as the change in temperature, power ...
Martin Muldoon's user avatar
3 votes
1 answer
589 views

When should I oversample data?

I am dealing with multi-class classifiers. My data is unbalanced. Hence, I need to apply sampling techniques before training (undersampling or oversampling). When I apply undersampling, ...
Kyv's user avatar
  • 151
2 votes
2 answers
2k views

Explain Binary Classification with output 0.5 (True)

What is the interpretation of output 0.5 of a typical classifier? I made a prediction and the probability of that data point being from the True class is 0.5.
Abhishek Sharma's user avatar
