I have a data set with N ~ 9000 and about 50% missing values on at least one important variable. There are more than 50 continuous variables, and for each of them the values above the 95th percentile are drastically larger than the values in the lower percentiles. So I want to cap each variable at its own 95th percentile. Should I do this before or after the train-test split?
I am of the view that I should do it after the split, but one concern is that the extreme values might not appear in the training set to begin with.
I'm working in Python.
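
To make the "cap after the split" option concrete, here is a minimal sketch of what I mean, assuming pandas. The data, the column names (`x1` to `x3`), the 80/20 split, and the 10% missingness rate are all made up for illustration and are not my real data set. The caps are estimated on the training rows only and then applied unchanged to both splits.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 3 right-skewed continuous columns with some NaNs
X = pd.DataFrame(rng.lognormal(size=(9000, 3)), columns=["x1", "x2", "x3"])
X = X.mask(rng.random(X.shape) < 0.1)

# Simple 80/20 split (sklearn's train_test_split would work here as well)
X_train = X.sample(frac=0.8, random_state=0)
X_test = X.drop(X_train.index)

# Per-column 95th percentile, computed on the training rows only
# (DataFrame.quantile skips NaN values)
caps = X_train.quantile(0.95)

# Apply the same training-derived caps to both splits; NaNs stay NaN
X_train_capped = X_train.clip(upper=caps, axis=1)
X_test_capped = X_test.clip(upper=caps, axis=1)
```

In this version, any test value above the training cap is simply clipped to that cap, so the test rows never influence the thresholds.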