Newest 'one-hot-encoding' Questions - Data Science Stack Exchange

0 votes

1 answer

23 views

How to create consistent dummy variables in Inference code?

I am using pd.get_dummies on a categorical column to create dummy variables. The training pipeline is something like this: Normalization, Dummy variable creation ...

Sociopath

1,253

asked Jun 28 at 13:27
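
One common way to keep dummy columns consistent between training and inference, which is what the question above asks about, is to store the training column list and reindex the inference frame against it. The following is a minimal sketch and not the asker's pipeline: the `color` column and both data frames are hypothetical.

```python
import pandas as pd

# Hypothetical training and inference data; "color" is an illustrative column name.
train = pd.DataFrame({"color": ["red", "green", "blue"]})
infer = pd.DataFrame({"color": ["green", "purple"]})  # "purple" was never seen in training

# Encode the training data and remember the resulting column layout.
train_dummies = pd.get_dummies(train, columns=["color"], dtype=int)
train_columns = train_dummies.columns  # persist this alongside the model

# Encode the inference data, then align it to the training layout:
# columns missing at inference are filled with 0, unseen categories are dropped.
infer_dummies = pd.get_dummies(infer, columns=["color"], dtype=int)
infer_aligned = infer_dummies.reindex(columns=train_columns, fill_value=0)

print(infer_aligned)
```

An alternative that avoids manual alignment is scikit-learn's `OneHotEncoder`, fitted once on the training data with `handle_unknown="ignore"` and reused at inference time.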

0 votes

1 answer

30 views

SMOTE Oversampling for Text Classification with Multiple Input Features

I have a text classification problem where the input has 2 features: a text and a language: the text is a string variable. the ...

Sandra Sukarieh

1

asked Jun 5 at 21:41

0 votes

0 answers

61 views

What feature selection method is best for a multi class classification problem with one-hot-encoded columns?

I am trying to solve a multi-class classification problem that involves predicting the outcome of a football match (target variable = Win, Lose or Draw). With a dataset of 2280 rows, which is 6 seasons of ...

pastybake2002

1

asked Feb 1 at 20:26

0 votes

1 answer

78 views

Use prediction after using get_dummies in pandas?

I found similar questions on this topic, but no answer was helpful. I had a data frame with a categorical column in it with 5 different values. I used get_dummies and used linear regression for ...

Ali.A

73

asked Dec 8, 2023 at 5:39

1 vote

1 answer

77 views

Beginner basic clustering model and one-hot encoding?

I have a dataframe of natural disaster incidents in Afghanistan from 2016 to 2023. Column names: REGION (Northern, Eastern etc), PROV_CODE (province), PROV_NAME, DIST_CODE (district), DIST_NAME, INC_DATE (...

Mas

55

asked Nov 20, 2023 at 6:13

1 vote

0 answers

42 views

sklearn - OneHotEncoding and SelectPercentile

In an sklearn example there is this code ...

Maciej778

11

asked Nov 12, 2023 at 15:40

0 votes

0 answers

24 views

Numerical issue with softmax regression implementation on MNIST

I'm having numpy numerical issues with my implementation of softmax regression/multiclass logistic regression on the MNIST dataset. The numpy exp and log numerical issue goes away when I divide the x ...

KaizerBox

1

asked Oct 29, 2023 at 3:30
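
The excerpt above is truncated, so the exact cause is not visible. Assuming the issue is overflow in `exp` or `log(0)` in the loss, a standard remedy is to subtract the row-wise maximum before exponentiating and to compute the log-softmax directly. The sketch below illustrates that technique with made-up logits; the function names are illustrative and not taken from the question.

```python
import numpy as np

def log_softmax(logits):
    # Subtracting the row maximum leaves softmax unchanged but keeps exp() from overflowing.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_entropy(logits, labels_one_hot):
    # Mean negative log-likelihood of the true class, using one-hot targets.
    return -(labels_one_hot * log_softmax(logits)).sum(axis=1).mean()

# Made-up logits; the first row would overflow with a naive exp().
logits = np.array([[1000.0, 0.0, -1000.0],
                   [2.0, 1.0, 0.1]])
labels = np.eye(3)[[0, 1]]  # one-hot targets for classes 0 and 1

loss = cross_entropy(logits, labels)
print(loss)
```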

0 votes

0 answers

9 views

Error while using saved logistic regression model on scoring vector data - The columns of A don't match the number of elements of x. A: 6011, x: 232964

I'm getting an error while using a saved logistic regression model on scoring vector data. SparkException: [FAILED_EXECUTE_UDF] Failed to execute user defined function (ProbabilisticClassificationModel$$...

Kunal Sinha

1

asked Oct 26, 2023 at 16:28

0 votes

1 answer

198 views

Best practices on encoding on an increasing number of categorical variables

I'm currently using Gradient Boosting Regressor as my model to predict production risk based off a set number of features as a side-project. One of these features, ...

Andrew Narvaez

3

asked Oct 22, 2023 at 20:16

0 votes

0 answers

71 views

Hot-encoding warning when using gridsearch

I ran an experiment with the classical holdout method to predict price and one-hot encoded the categorical data. However, when optimising, I got the warning below even though I ignored the unknown ...

Aze

1

asked Sep 27, 2023 at 20:21

1 vote

0 answers

47 views

One-Hot encoded variables dominate importance among other variables

I am currently training some machine learning models to predict the 28-day compressive strength of cement, a continuous real-valued variable. The available dataset comprises samples from three ...

Felipe

21

asked Sep 13, 2023 at 5:07

1 vote

1 answer

38 views

How to prepare data if each item has multiple categories (like tags)

I'm working on a recommender system that will recommend movies to users.

Movie ratings:
Movie | User | Rating
100 | 201 | 5
105 | 256 | 8
... | ... | ...

Movie tags:
Movie | Tag
100 | 1
100 | 2
100 | 8
105 | 2
105 | 5
...

Silver Light

131

asked Jun 5, 2023 at 14:17
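
For data shaped like the tag table in the question above, one possible preparation step is to pivot the long (movie, tag) rows into one multi-hot row per movie. The sketch below uses only the rows visible in the excerpt; whether this representation suits the asker's recommender is an assumption.

```python
import pandas as pd

# Tag rows visible in the excerpt above (the original table continues beyond these).
tags = pd.DataFrame({
    "Movie": [100, 100, 100, 105, 105],
    "Tag":   [1, 2, 8, 2, 5],
})

# Count (movie, tag) pairs, then clip to 0/1 in case a pair is listed more than once.
tag_matrix = pd.crosstab(tags["Movie"], tags["Tag"]).clip(upper=1)

print(tag_matrix)
```

Each row of `tag_matrix` can then be joined to the ratings table on the movie identifier.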

0 votes

2 answers

272 views

How is PCA applied to (one-hot encoded) DNA sequence data?

I realize some questions have been asked already about one-hot encoding for PCA. The answer seems to be along the lines of 'The PCA will run, but does not necessarily make sense.' However, I have a ...

Chris_abc

51

asked May 5, 2023 at 22:02

1 vote

2 answers

1k views

Can decision trees handle Nominal Categorical variables?

I have read that decision trees can handle categorical columns without encoding them. However, as decision trees make splits on the data, how do they handle Nominal Categorical variables? Surely a ...

Connor

661

asked Apr 4, 2023 at 14:09

2 votes

1 answer

356 views

Multiple classes present in one-hot encoding

When dealing with classification for multiple classes present in the same sample, can the output layer have the form of one-hot encoding, but instead of only one hot, have multiple? That is, in case ...

smone

23

asked Mar 10, 2023 at 19:29
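
The setup described in the question above is usually called multi-label classification: the target is a multi-hot vector, and each output unit is typically passed through an independent sigmoid rather than a shared softmax. The sketch below shows the target encoding and a binary cross-entropy loss; the class names and logits are hypothetical.

```python
import numpy as np

classes = ["cat", "dog", "bird"]  # hypothetical label set
class_index = {name: i for i, name in enumerate(classes)}

def multi_hot(labels):
    # Set a 1 for every class present in the sample; several entries may be hot.
    vec = np.zeros(len(classes), dtype=np.float32)
    for label in labels:
        vec[class_index[label]] = 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(logits, targets, eps=1e-12):
    # Each class is scored independently, so the probabilities need not sum to 1.
    p = sigmoid(logits)
    return float(-(targets * np.log(p + eps) + (1 - targets) * np.log(1 - p + eps)).mean())

target = multi_hot(["cat", "bird"])
logits = np.array([2.0, -1.5, 0.5])  # made-up network outputs
loss = binary_cross_entropy(logits, target)
print(target, loss)
```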
