I have a simple yet tricky conceptual question about the data splitting of a meta learning process.
Assume I have a simple X_train / X_test split, on which I trained and tuned model_1 and model_2. Now I want to stack them using stacker_0. What I plan to do is:
Split X_train into 5 folds $\{F_i\}_{i=0}^{4}$; then, for each $j$, train model_1 and model_2 on $\{F_i\}_{i \neq j}$ and predict on $F_j$. Concatenating these out-of-fold predictions gives me a new dataset X_train' that I can use to train my meta-model without leakage.
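To make the procedure concrete, here is a minimal sketch of how X_train' could be built with scikit-learn. The synthetic data, the choice of LogisticRegression and RandomForestClassifier as stand-ins for model_1 and model_2, and the variable name X_train_prime are all assumptions for illustration, not my actual setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_predict, train_test_split

# Synthetic data stands in for the real problem (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Placeholders for the already-tuned base models (assumption).
model_1 = LogisticRegression(max_iter=1000)
model_2 = RandomForestClassifier(n_estimators=50, random_state=0)

# The same 5 folds F_0..F_4 are reused for both base models.
folds = KFold(n_splits=5, shuffle=True, random_state=0)

# For each fold F_j, a clone of the model is fit on the other four folds
# and predicts on F_j, so every row gets an out-of-fold prediction.
oof_columns = [
    cross_val_predict(m, X_train, y_train, cv=folds, method="predict_proba")[:, 1]
    for m in (model_1, model_2)
]

# X_train': one column of out-of-fold probabilities per base model.
X_train_prime = np.column_stack(oof_columns)

# The meta-model is then trained on X_train' (here with default settings).
stacker_0 = LogisticRegression().fit(X_train_prime, y_train)
```

Each row of X_train_prime is predicted by base models that never saw that row during fitting, which is the "without leakage" property described above.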
My question is whether I can use this X_train' for the usual model selection steps (e.g. hyperparameter tuning, meta-model validation, etc.) on stacker_0. Would that be fair?