I have a question regarding the bias–variance decomposition in The Elements of Statistical Learning. Section 7.2 gives $\operatorname{Err}\left(x_0\right)=$
$$E\left[\left(Y-\hat{f}\left(x_0\right)\right)^2 \mid X=x_0\right]=\sigma_{\varepsilon}^2+\left[\mathrm{E} \hat{f}\left(x_0\right)-f\left(x_0\right)\right]^2+E\left[\hat{f}\left(x_0\right)-\mathrm{E} \hat{f}\left(x_0\right)\right]^2 =\sigma_{\varepsilon}^2+\operatorname{Bias}^2\left(\hat{f}\left(x_0\right)\right)+\operatorname{Var}\left(\hat{f}\left(x_0\right)\right)$$
With $x_0$ held fixed, this can be viewed as a population-level expected prediction error. Later, when the textbook discusses how to choose an optimal tuning parameter analytically (in preparation for $C_p$, AIC, etc.), it talks about estimating this expected prediction error from the training sample, but no bias–variance decomposition is performed there.
I wonder: is the bias–variance decomposition an analysis that is meaningful only at the population level?
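To make concrete what I mean by "population level": the bias and variance at a fixed $x_0$ can be computed by simulation only because the true $f$ and $\sigma_\varepsilon$ are known by construction. A minimal Python sketch (the sine target, cubic fit, and all sample sizes are my own arbitrary choices, not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical truth, known only because we chose it for the simulation --
# exactly the information that is unavailable with real data.
def f(x):
    return np.sin(2 * np.pi * x)

sigma_eps = 0.3          # noise standard deviation
n, x0, n_reps = 30, 0.5, 2000

preds = np.empty(n_reps)
for r in range(n_reps):
    # Draw a fresh training sample each replication; x0 stays fixed.
    x = rng.uniform(0, 1, n)
    y = f(x) + rng.normal(0, sigma_eps, n)
    coef = np.polyfit(x, y, deg=3)       # cubic least-squares fit
    preds[r] = np.polyval(coef, x0)      # prediction at the fixed x0

bias2 = (preds.mean() - f(x0)) ** 2      # squared bias at x0
var = preds.var()                        # variance of f_hat(x0)
err = sigma_eps**2 + bias2 + var         # Err(x0) per the decomposition
print(bias2, var, err)
```

The expectation in the decomposition is over repeated draws of the training sample, which the loop mimics; without knowing $f(x_0)$ the `bias2` line cannot be evaluated.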
In the post Bias-Variance Decomposition in Ridge Linear Regression, the decomposition is discussed in detail. At the end, one can naturally minimize the sum of the squared bias and the variance to obtain an optimal penalty parameter $\lambda$. However, as the author notes, the result depends on a couple of unknown (true) quantities.
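For concreteness, here is a sketch of that ridge calculation under a fixed design, using the standard fixed-design expressions for the bias and variance of $x_0^\top \hat\beta_\lambda$; the true $\beta$ and $\sigma^2$ are set by hand, which is precisely what one cannot do in practice:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical true quantities -- known here only by construction.
p, n, sigma2 = 5, 40, 0.5
beta = rng.normal(0, 1, p)        # true coefficient vector
X = rng.normal(0, 1, (n, p))      # fixed design matrix
x0 = rng.normal(0, 1, p)          # test point

XtX = X.T @ X
I = np.eye(p)

def bias2_var(lam):
    """Squared bias and variance of the ridge prediction x0' beta_hat(lam)."""
    S = np.linalg.solve(XtX + lam * I, XtX)   # ridge shrinkage matrix
    A = np.linalg.solve(XtX + lam * I, x0)    # (X'X + lam I)^{-1} x0
    bias2 = (x0 @ (S - I) @ beta) ** 2        # needs the true beta
    var = sigma2 * A @ XtX @ A                # needs the true sigma^2
    return bias2, var

# Grid search for the lambda minimizing squared bias + variance
lams = np.logspace(-2, 3, 200)
mse = [sum(bias2_var(lam)) for lam in lams]
lam_star = lams[int(np.argmin(mse))]
print(lam_star)
```

At $\lambda=0$ the bias vanishes (OLS is unbiased here) while the variance is largest, and the optimal `lam_star` trades the two off, but only because `beta` and `sigma2` were supplied.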
So far it seems that one can find a feasible optimal $\lambda$ by minimizing an estimate of the expected prediction error, but not by minimizing the bias–variance decomposition directly, since that analysis can only be carried out at the population level. Is that correct?