Timeline for Imputation before or after splitting into train and test?
Current License: CC BY-SA 3.0
9 events
when | what | by | license | comment | |
---|---|---|---|---|---|
Jun 30, 2019 at 12:39 | vote | accept | Peter Flom | ||
Jun 29, 2019 at 20:08 | answer | added | ALEX.VAMVAS | timeline score: 2 | |
May 22, 2018 at 22:37 | answer | added | chen636489 | timeline score: 4 | |
Apr 24, 2014 at 20:15 | comment | added | Peter Flom | I don't like binning continuous variables. | |
Apr 24, 2014 at 20:10 | comment | added | RobertF | Ah. I get nervous using imputation. I wonder about the merits of having a continuous variable with 50% of its values imputed vs. converting the continuous variable to categorical with a 'Missing' category plus enough bins to capture the behavior of the non-missing values? | |
Apr 24, 2014 at 20:05 | comment | added | Peter Flom | No one variable has 50% missing, but about 50% is missing on at least one. Also, they are continuous, so "missing" would mess things up. | |
Apr 24, 2014 at 19:39 | comment | added | RobertF | 50% missing values for a crucial variable? Ugh. Rather than impute, why not create a 'Missing' category for the variable? | |
Apr 24, 2014 at 19:05 | answer | added | Henry | timeline score: 42 | |
Apr 24, 2014 at 18:55 | history | asked | Peter Flom | CC BY-SA 3.0 |