$\begingroup$

I have a simple yet tricky conceptual question about data splitting in a meta-learning (stacking) process.

Assume I have a simple X_train / X_test split, on which I trained and tuned model_1 and model_2. Now I want to stack them using stacker_0. What I plan to do is:

Split X_train into 5 folds $F_0, \dots, F_4$. For each $j$, train model_1 and model_2 on the other folds $\{F_i\}_{i \neq j}$ and predict on $F_j$. Collecting these out-of-fold predictions gives me a new dataset X_train' that I can use to train my meta-model without leakage.
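To make the setup concrete, here is a minimal sketch of the out-of-fold step, assuming scikit-learn and a binary classification task. The synthetic data and the two base estimators (logistic regression and a random forest standing in for model_1 and model_2) are placeholders chosen for illustration, not the actual models.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_predict, train_test_split

# Placeholder data: a synthetic binary classification problem (assumption).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Placeholder base models standing in for model_1 and model_2 (assumption).
model_1 = LogisticRegression(max_iter=1000)
model_2 = RandomForestClassifier(n_estimators=50, random_state=0)

# The same 5 folds are reused for both base models.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# For each fold F_j, cross_val_predict fits a clone of the model on the other
# folds and predicts on F_j, so every row of X_train gets a prediction from a
# model that never saw that row during fitting.
oof_columns = [
    cross_val_predict(model, X_train, y_train, cv=cv, method="predict_proba")[:, 1]
    for model in (model_1, model_2)
]

# X_train': one column of out-of-fold probabilities per base model.
X_train_prime = np.column_stack(oof_columns)
```

Here X_train_prime has one row per training sample and one column per base model, and it is paired with the original y_train as the target for stacker_0.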

My question is whether I can use this X_train' for the usual model selection steps (e.g. hyperparameter tuning, meta-model validation) on stacker_0. Would that be fair?

$\endgroup$
