Questions tagged [categorical-data]

Categorical data can take on a limited (usually fixed) number of possible values, called categories. Categorical values "label"; they do not "measure". Nominal and dichotomous (binary) scale types are categorical. Some people also consider the ordinal scale categorical.

0 votes
0 answers
6 views

What's the right machine learning approach to mark rubrics based on sequences of data?

I'm a teacher and I'm working on a pet project to help streamline some of my assessment workflows for my students. One of those workflows is gathering data on student progress in the form of a rubric ...
Kevin
  • 1
0 votes
1 answer
15 views

How does encoding categorical features with Target Encoding variants work when the Target feature is continuous?

I have been reading about target encoding and its variants, such as Leave-One-Out and James-Stein, and in the examples I have seen the target variable is usually binary (or can be divided into categories). How ...
Ahmed Farooqui
1 vote
1 answer
29 views

Chi-square test results interpretation

I am using a chi-square test to compare the distributions of two categorical variables. Both have the same number of classes. After counting each class per variable, I obtain very similar counts but the p-...
crbl
  • 111
3 votes
1 answer
808 views

How does a Neural Net handle an unseen class for a Categorical Feature?

Let's say I train a Neural Net, and I have a Categorical Feature X. During training, only 3 classes are seen in feature X: A, B, and C. Now, let's say I want to make predictions from this trained ...
the man
  • 139
0 votes
0 answers
18 views

Model Architecture for Time-Series Forecasting with Categorical and Multivariate Data

Context: I was looking at using an LSTM model to forecast the amount of gold gained for each of 10 heroes in a game of Dota 2, a MOBA game, as a base model in some type of model architecture. The game ...
DCRA
  • 1
0 votes
1 answer
26 views

Dealing with only categorical features dataset

I'm trying to do multi-class classification on a labeled dataset with purely categorical features. There are around 30 features in total. 3 of the features in particular have around 100 unique values (...
Shaurya Uniyal
0 votes
0 answers
3 views

Options for representing the following "conditional rules" in some data, and whether categorical data science would be helpful?

I am new to data science, so I am hoping the following question is a reasonably elementary exercise for someone more experienced. Let us say I have $n$ categories of data. Each category is a ...
Julius Hamilton
0 votes
0 answers
28 views

Visualise intersections of group membership (several low-cardinality variables)

I need to visualise joint and marginal frequencies of several low-cardinality categorical variables. Equivalently, I want to visualise sizes of groups and their intersections, where membership in some ...
paperskilltrees
3 votes
0 answers
39 views

How to predict multi-variate time-series from different samples [closed]

I'm having issues seeing the best way to predict a time-series when training on a dataset with different samples. I have a dataset that shows the weight of 10 rabbits from their first day to their ...
scootjow
1 vote
1 answer
38 views

How to deal with categorical misalignment between train and test in binary classification problems

I have train and test datasets (600k observations) that have different categories for the same categorical variable. For example, train has the categorical variable Letters with unique categories ...
kyara
  • 13
0 votes
0 answers
19 views

Feature selection on datasets with both categorical and numerical features

I'm proposing a novel methodology for feature selection in the context of tabular datasets that contain both numerical and categorical features. In order to prove the efficacy of my methodology, I ...
Francesco De Santis
4 votes
1 answer
853 views

Decision Tree only splits to the left

I can’t really understand why my decision tree only splits to the left. I originally have 2 categorical features (referred to below as feature 0 and 1), which I concatenate into one feature since feature 1 is ...
Taitex
  • 41
0 votes
0 answers
13 views

Classification using simple relationships between time series data

I am looking to predict which courses are taught by which university professors at my school. More specifically, for each semester and professor I want to know the probability breakdown of which ...
retep
  • 101
0 votes
0 answers
67 views

Combining Textual, Categorical and Numerical data for Semantic Search using SentenceTransformers model

I'm building a food semantic search model and I want to use a pre-trained SentenceTransformers model with cosine similarity. I'm using the Epicurious dataset for the corpus, which consists of textual (&...
Alex
  • 1
0 votes
0 answers
60 views

Training Biased/Uneven Categorical Data with CatBoost, Unbalanced/Unseen Categories Handling

Summary: I am training a discount eligibility model where the dataset represents historical data for products where people availed discounts based on simple features like product category, discount ...
glory9211
  • 101
