All Questions

0 votes
0 answers
11 views

Cumulative feature importance in Random Forest taking into account past days' data

I have a dataframe with past days' data and current day data. Example columns: [cases, mobility, temp, rh, cases_1, mobility_1, temp_1, rh_1, cases_2, mobility_2, temp_2, rh_2, and so on]. My ...
SHARVARI WANJARI
0 votes
0 answers
21 views

Dummy features have almost the same effect as actual features

4 dummy random features (generated with np.random.randn) and 4 new real features (derived from some ideas) show almost the same improvement. In cross-validation, 4 dummies ...
Crispy13
  • 133
0 votes
0 answers
13 views

Estimating the Increase in Rademacher Complexity after Feature Selection

I'm trying to estimate how much the Rademacher complexity (or empirical Rademacher complexity) increases when performing feature selection using methods like Sequential Forward Selection or Genetic ...
x H
  • 1
0 votes
0 answers
15 views

What's a suitable feature selection method for time series data across multiple files?

My problem is basically a higher-dimensional regression, where my input is (100 levels, 300 timesteps, 23 features). My goal is to build a deep learning LSTM model that finds which level the data ...
Youssef Badr
0 votes
0 answers
10 views

Feature selection with unlabeled data

I'm new to this field and trying to learn by working with a fraud dataset. Initially, I used the dataset as is, but now I'm trying unsupervised learning without the labels. I've tried clustering ...
DrGenius
  • 101
0 votes
0 answers
28 views

Questions about the process of feature selection through feature importance

SHAP feature importance was obtained through XGBoost, and the variables with the lowest feature importance were removed one by one, starting from 50 variables, until only 1 variable remained. As a result of ...
JAE
  • 13
0 votes
0 answers
61 views

What feature selection method is best for a multi-class classification problem with one-hot-encoded columns?

I am trying to solve a multi-class classification problem that involves predicting the outcome of a football match (target variable = Win, Lose or Draw). With a dataset of 2280 rows, which is 6 seasons of ...
pastybake2002
0 votes
0 answers
9 views

How to find the minimum data point that predicts the target class in longitudinal data

I am working on medical data where a screening is done regularly for 200 days. I need to know the minimum number of screenings that can predict the outcome. I also need to know the best time/times to ...
Ghof-90
0 votes
0 answers
38 views

What feature selection method is ideal for a high-dimensional data frame resulting from one-hot encoding?

I am trying to solve a sports-related multi-class classification problem in Python. I aim to train a custom neural network and also an SVM. I have performed prior data cleaning and encoded my data ...
pastybake2002
0 votes
0 answers
97 views

How to do feature selection correctly in xgboost for time series forecasting after obtaining a good predictive model?

I have a very large dataset (~7 million rows) for which I have extracted ~500 features during the feature engineering phase. I have trained an XGBoost model which has a fairly good predictive capability (based ...
guestar
0 votes
0 answers
71 views

Feature selection: ANOVA between features vs within a feature

I am currently performing feature selection on a dataset containing continuous and categorical features. The target is a continuous variable. If I understand properly, ANOVA can be used between ...
Fred vh
1 vote
0 answers
42 views

sklearn - OneHotEncoding and SelectPercentile

In an sklearn example there is this code ...
Maciej778
0 votes
0 answers
24 views

Feature selection for siamese network

I have a regression problem for which two observations are compared by a siamese-like Multilayer Perceptron. Each observation 'O' is described by a feature vector 'X' of a certain number 'N' of ...
Febo Cardelli
0 votes
0 answers
14 views

What is it called when, instead of creating predictive models by finding patterns in observed data (ML), you try to guess the model theoretically...?

I'm a college student passionate about machine learning and I've decided to do my bachelor thesis about it. I thought that as an interesting introduction to machine learning, I could introduce it by ...
ADayWithoutRain
0 votes
1 answer
18 views

How to represent facial features from video and classify high/low personality traits from facial features?

The dataset has 3-minute 30fps video conversations (no audio) of 150 extroverted and 150 introverted individuals. The goal is to classify them as "introverts" or "extroverts" based ...
TheBiometricsGuy