I am using Python to build a Random Forest regression model that predicts a target variable. I trained and tested it on real data and obtained satisfactory results. Now I would like to explore different scenarios to understand how the target would change when the other variables change. Can I test the RF model on synthetic data if I trained it on real data?
I have tried to generate this simulated data by multiplying some variables of the real test dataset by factors that I chose myself, for example increasing variables A and C by 10%.
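A minimal sketch of this kind of scenario test, assuming scikit-learn's `RandomForestRegressor` and a pandas DataFrame. The data below is randomly generated, and the column `B`, the target formula, and the hyperparameters are placeholders for illustration only, not my real dataset or settings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the real dataset (assumption).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.uniform(1.0, 10.0, size=(500, 3)), columns=["A", "B", "C"])
y = 2.0 * X["A"] + X["B"] - 0.5 * X["C"] + rng.normal(0.0, 0.1, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train on the "real" data only.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Build the scenario: increase A and C by 10%, leave B untouched.
X_scenario = X_test.copy()
X_scenario[["A", "C"]] = X_scenario[["A", "C"]] * 1.10

# Compare predictions on the original and the modified test rows.
baseline_pred = model.predict(X_test)
scenario_pred = model.predict(X_scenario)
mean_shift = float(np.mean(scenario_pred - baseline_pred))

# Fraction of scenario rows with at least one feature outside the
# range of values seen during training.
outside = ((X_scenario < X_train.min()) | (X_scenario > X_train.max())).any(axis=1)
frac_outside = float(outside.mean())

print(f"mean change in prediction: {mean_shift:.3f}")
print(f"scenario rows outside training range: {frac_outside:.1%}")
```

Here `mean_shift` summarizes how the predicted target moves under the scenario, and `frac_outside` shows how many modified rows leave the feature range covered by the training data.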
Is this approach of training on real data and testing on simulated data acceptable?