
This question is about Section 8.7 (Bagging) of *The Elements of Statistical Learning* (ESL).

Assume our training observations $\left(x_i, y_i\right), i=1, \ldots, N$ are independently drawn from a distribution $\mathcal{P}$, and consider the ideal aggregate estimator $f_{\text {ag }}(x)=\mathrm{E}_{\mathcal{P}} \hat{f}^*(x)$. Here $x$ is fixed and the bootstrap dataset $\mathbf{Z}^*$ consists of observations $x_i^*, y_i^*, i=1,2, \ldots, N$ sampled from $\mathcal{P}$. (Note that $f_{\text {ag }}(x)$ is a bagging estimate, drawing bootstrap samples from the actual population $\mathcal{P}$ rather than the data. It is not an estimate that we can use in practice, but is convenient for analysis.) The author writes $$ \begin{aligned} \mathrm{E}_{\mathcal{P}}\left[Y-\hat{f}^*(x)\right]^2 & =\mathrm{E}_{\mathcal{P}}\left[Y-f_{\mathrm{ag}}(x)+f_{\mathrm{ag}}(x)-\hat{f}^*(x)\right]^2 \\ & =\mathrm{E}_{\mathcal{P}}\left[Y-f_{\mathrm{ag}}(x)\right]^2+\mathrm{E}_{\mathcal{P}}\left[\hat{f}^*(x)-f_{\mathrm{ag}}(x)\right]^2 \\ & \geq \mathrm{E}_{\mathcal{P}}\left[Y-f_{\mathrm{ag}}(x)\right]^2 \end{aligned} $$

But this relies on the cross term vanishing: $$\mathrm{E}_{\mathcal{P}}\left[\left(Y-f_{\mathrm{ag}}(x)\right)\left(f_{\mathrm{ag}}(x)-\hat{f}^*(x)\right)\right] = 0.$$

The author later mentions that the main caveat is "independent", and that the bagged trees are not independent. So it seems the nice decomposition above relies on the assumption that the bagged trees are independent.

However, looking at the expression $$\mathrm{E}_{\mathcal{P}}\left[\left(Y-f_{\mathrm{ag}}(x)\right)\left(f_{\mathrm{ag}}(x)-\hat{f}^*(x)\right)\right] = 0,$$ I don't see how it is the covariance between two bagged trees. (If it were, then under the independence assumption it would clearly be zero.) To me it looks more like the covariance between $Y$ and $\hat{f}^*(x)$, if it is appropriate to think of $f_{\mathrm{ag}}(x)$ as $\mathrm{E}_{\mathcal{P}}[Y \mid X=x]$.
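To make the question concrete, here is a small Monte Carlo sketch of the setup (my own toy example, not from ESL): I pick a known population $\mathcal{P}$, use an unstable 1-nearest-neighbour base learner as $\hat{f}^*$ at a fixed query point $x$, approximate $f_{\mathrm{ag}}(x)$ by averaging over many fresh samples from $\mathcal{P}$, and then estimate the two mean squared errors and the cross term in question. All names and the choice of $\mathcal{P}$ here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy population P (an assumption for illustration):
# X ~ Uniform(0, 1), Y = sin(2*pi*X) + N(0, 0.3^2)
def sample(n):
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

x0 = 0.25          # the fixed query point x
N, B = 50, 4000    # sample size and number of independent draws from P

# Unstable base learner f^*(x0): 1-nearest-neighbour prediction at x0
def f_star(x, y):
    return y[np.argmin(np.abs(x - x0))]

# B independent fits, each on a fresh sample from P
preds = np.array([f_star(*sample(N)) for _ in range(B)])
f_ag = preds.mean()  # Monte Carlo estimate of f_ag(x0) = E_P[f^*(x0)]

# Fresh Y draws at X = x0, independent of the fits
Y = np.sin(2 * np.pi * x0) + rng.normal(0.0, 0.3, B)

mse_star = np.mean((Y - preds) ** 2)   # E_P [Y - f^*(x0)]^2
mse_ag = np.mean((Y - f_ag) ** 2)      # E_P [Y - f_ag(x0)]^2
cross = np.mean((Y - f_ag) * (f_ag - preds))  # the cross term

print(f"MSE of f^*:  {mse_star:.4f}")
print(f"MSE of f_ag: {mse_ag:.4f}")
print(f"cross term:  {cross:.4f}")
```

In this simulation the cross term comes out near zero and the aggregated estimator has the smaller mean squared error, matching the inequality in the book; the question is about what formally justifies the cross term vanishing.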
