Questions tagged [ensemble-learning]
In machine learning, ensemble methods combine the predictions of multiple models to obtain better predictive performance than any single model. Bagging, boosting, and stacking are common examples.
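As a minimal illustration of the bagging idea (a toy sketch, not any particular library's implementation): fit a base learner on bootstrap resamples of the data and average the individual predictions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + noise
x = rng.uniform(0, 1, 100)
y = 2 * x + rng.normal(0, 0.1, 100)

def bagged_fit_predict(x, y, x_new, n_estimators=50):
    """Bagging: fit a base learner (here, a degree-1 polynomial)
    on bootstrap resamples and average the predictions."""
    preds = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(x), len(x))          # bootstrap sample
        slope, intercept = np.polyfit(x[idx], y[idx], 1)
        preds.append(slope * x_new + intercept)
    return np.mean(preds)

print(bagged_fit_predict(x, y, 0.5))  # should be close to 2 * 0.5 = 1.0
```

Averaging over resamples mainly reduces variance, which is why bagging helps most with unstable base learners such as deep decision trees.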
475 questions
0 votes · 0 answers · 12 views
In X-learner uplift modeling, predictions from the 1st-stage models help train the 2nd-stage models. Which data splits should these predictions be made on?
In uplift modeling with an X-learner metalearner (Künzel et al. 2019), predictions from the two first-stage models are used in training the two second-stage models. Question: What datasets/splits ...
1 vote · 1 answer · 41 views
How can different models based on different sets of predictors be combined to significantly improve model performance?
I have two machine learning models for predicting some continuous variable $y$, say $y=f_1(X_1, \theta_1)$ and $y=f_2(X_2, \theta_2)$, and these models are of the same type (ANN). $X_1$ and $X_2$ ...
0 votes · 0 answers · 14 views
How to split data when training and tuning the meta learner in stacking?
I have a simple yet tricky conceptual question about the data splitting of a meta learning process.
Assume I have a simple X_train, ...
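The standard recipe for this split is out-of-fold (OOF) predictions: each base learner predicts a sample only when that sample was held out of its training fold, so the meta-learner never sees in-fold (leaky) predictions. A plain-NumPy sketch of the idea, with a made-up least-squares base learner:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 100)

def oof_predictions(X, y, fit_predict, k=5):
    """Out-of-fold predictions: every sample is predicted by a model
    that never saw it during training."""
    oof = np.empty(len(y))
    folds = np.array_split(np.arange(len(y)), k)
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        oof[held_out] = fit_predict(X[train], y[train], X[held_out])
    return oof

def linreg(X_tr, y_tr, X_te):
    # least-squares base learner (stand-in for any model)
    w, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    return X_te @ w

z = oof_predictions(X, y, linreg)   # meta-features for the stacker
# The meta-learner is then trained on z (and y), not on in-fold predictions.
```

Hyperparameter tuning of the meta-learner is usually done with a further CV loop over the OOF features, or on a separate holdout.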
0 votes · 0 answers · 15 views
Model parameter averaging in Bagging
I wonder whether the following bagging method is used in practice, or at least whether there is any reference for it.
Assume that we sample (sub)-datasets from the original training set, and train $n$ many logistic ...
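One way to picture the distinction the question raises, as a hypothetical NumPy sketch (the coefficient vectors are made up): for logistic models, predicting with averaged parameters is not the same as averaging the predictions, because the sigmoid is nonlinear.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Hypothetical coefficient vectors from n logistic models
# trained on different bootstrap samples
W = np.array([[2.0, -1.0],
              [4.0, -3.0],
              [0.5,  0.5]])
x = np.array([1.0, 0.5])

p_avg_params = sigmoid(W.mean(axis=0) @ x)  # predict with averaged weights
p_avg_preds = sigmoid(W @ x).mean()         # average the predictions

print(p_avg_params, p_avg_preds)  # differ: the sigmoid is nonlinear
```

Both are valid ensembles, but they are different estimators; standard bagging averages predictions, while parameter averaging only coincides with it for linear models.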
1 vote · 1 answer · 24 views
How do I calculate estimated variance for an ensemble forecast?
I have several (n) different forecasts of comparable quality for a variable, based on the same data but using wildly different statistical models. For each, I have generated an estimate for m periods ...
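Assuming each of the n models supplies a point forecast and a variance estimate (hypothetical numbers below), one common pooling rule treats the ensemble as an equal-weight mixture, whose variance follows from the law of total variance: the average within-model variance plus the spread of the model means.

```python
import numpy as np

# Hypothetical per-model forecasts for one period: means and variances
mu = np.array([10.0, 12.0, 11.0])
var = np.array([4.0, 3.0, 5.0])

# Equal-weight mixture:
# Var = E[within-model variance] + Var[between-model means]
ens_mean = mu.mean()
ens_var = var.mean() + ((mu - ens_mean) ** 2).mean()

print(ens_mean, ens_var)  # 11.0 and 4.667 (= 4 + 2/3)
```

Note the second term: disagreement between models inflates the ensemble variance beyond the average of the individual variances.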
3 votes · 1 answer · 122 views
Gamma regression with XGBoost
I'll try to be brief. I have two questions about what exactly happens when I train a gradient boosted ensemble of trees using, say, XGBoost in order to perform a Gamma regression. I apologize in ...
0 votes · 1 answer · 35 views
Quantifying prediction uncertainty using deep ensembles: How to combine Laplace distributions?
For a regression problem, I want to train an ensemble of deep neural networks to predict the labeled output as well as the uncertainty, similar to the approach presented in the paper Simple and ...
1 vote · 0 answers · 44 views
Prediction vs confidence intervals using random forest / an ensemble of estimators
Given a random forest (or any other ensemble) where each of the $i=1..n$ trees/base estimators is trained by minimizing the mean squared error, then each tree/base estimator prediction $\hat{Y}_i(x) =...
1 vote · 0 answers · 33 views
ML Modelling advice where a feature is partially missing but highly informative when present
I am building a model to predict a customer purchase event on a website, specifically for those customers who have not yet purchased when the model is run overnight. Prediction is important, but ...
1 vote · 0 answers · 27 views
XGB predict_proba estimates don't match sum of leaves [closed]
When using an XGB model in the context of binary classification, I observed that the test estimates given by predict_proba were close but not equal to the results I ...
1 vote · 3 answers · 108 views
Combining regression models based on missing data patterns
I have a dataset that contains a few patterns of missingness. For this dataset, I have a training set that is complete and contains all input features. My test set has complete observations for the ...
3 votes · 1 answer · 86 views
Ensemble Methods for Probabilities
I am currently trying to build a stacked algorithm in order to determine how many people in each region of a country will be likely to buy a product versus its competitors. I have some data from an ...
0 votes · 0 answers · 29 views
Ensemble Random Forest Overfitting
I am running an ensemble random forest model (a newer method published in 2020). The model works by using a double bootstrapping step to balance imbalanced training data. Then you grow multiple ...
0 votes · 1 answer · 60 views
Bagging Ensemble Math
You are working on a binary classification problem with 3 input features and have chosen to apply a bagging algorithm (Algorithm X) on this data. You have set max_features = 2 and n_estimators = 3. ...
1 vote · 0 answers · 26 views
Cross validation + model stacking with hyperparameter tuning while sharing data?
Let's say we want to stack 2 base models: an XGBoost regressor and a deep neural network by linearly combining their predictions as ...