
I'm under the impression that when you bootstrap, your final results are the original statistic from your sample data, and the standard errors from the bootstrapped trials. However, it seems more intuitive to take the mean statistic from all your trials, rather than just the statistic from the original trial. Is there some statistical intuition why it is one and not the other?

Also, I came across a use case where someone bootstrapped with the mean as the statistic. They did their resampling, took the mean of each trial, and used those means to calculate a confidence interval around the mean. Is this OK? It seems like you could draw confidence intervals using the original data itself, and that bootstrapping would artificially lower the standard errors. Again, is there some intuition I could use to understand why this is or is not OK?

  • Note that when relying on the bootstrap, you use your sample as the (best hypothetical) approximation of the target population, hence it makes sense to consider the observed mean, or the "original statistic" as you say; moreover, it allows you to correct for possible bias. Then you reconstruct the sampling distribution of that statistic (arithmetic mean) under sampling with replacement.
    – chl
    Commented Oct 30, 2020 at 12:29

4 Answers


The idea of the bootstrap is to estimate the sampling distribution of your estimate without making actual assumptions about the distribution of your data.

You usually go after the sampling distribution when you want estimates of the standard error and/or confidence intervals. Your point estimate itself is fine as it is: given your data set and without knowing the underlying distribution, the sample mean is still a very good guess about the central tendency of your data. Now, what about the standard error? The bootstrap is a good way of getting that estimate without imposing a probability distribution on the data.
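To make that concrete, here is a minimal sketch in Python/NumPy (the simulated data and the choice of 2000 resamples are my own illustrative assumptions, not part of the answer):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=50)   # stand-in sample; in practice its distribution is unknown

theta_hat = data.mean()                      # point estimate from the original sample

B = 2000
boot_stats = np.array([
    rng.choice(data, size=data.size, replace=True).mean()   # statistic recomputed on each resample
    for _ in range(B)
])

se_boot = boot_stats.std(ddof=1)             # bootstrap estimate of the standard error
print(theta_hat, se_boot)
```

The point estimate stays `theta_hat`; only the spread of `boot_stats` is used to describe its uncertainty.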

More technically, when building a confidence interval for a generic statistic, suppose you knew that the sampling distribution of your estimate $\hat \theta$ is $F$, and you wanted to see how far $\hat \theta$ can be from its mean $\mu$, the quantity $\hat \theta$ estimates. You could then look at the difference from $\mu$, call it $\delta$, and make that the focus of your analysis instead of $\hat \theta$:

$$ \delta = \hat \theta - \mu $$

Now, since we know that $\hat \theta \sim F$, we know that $\delta$ follows $F$ shifted by the constant $\mu$, a kind of "standardization" like the one we do with the normal distribution. With that in mind, we can build an 80% confidence interval such that

$$ P_F(\delta_{.9} \le \hat \theta - \mu \le \delta_{.1} \mid \mu) = 0.8 \iff P_F(\hat \theta - \delta_{.9} \ge \mu \ge \hat \theta - \delta_{.1} \mid \mu) = 0.8 $$

So we just build the CI as $\left[\hat \theta - \delta_{.1}, \hat \theta - \delta_{.9} \right]$. Keep in mind that we don't know $F$, so we can't know $\delta_{.1}$ or $\delta_{.9}$. And we don't want to assume it is normal and just read off the percentiles of a standard normal distribution either.

The bootstrap principle helps us estimate the sampling distribution $F$ by resampling our data. Our point estimate remains $\hat \theta$; there isn't anything wrong with it. But if I take a resample I can build $\hat \theta^*_1$. Another resample gives $\hat \theta^*_2$, and another gives $\hat \theta^*_3$. I think you get the idea.

The set of estimates $\hat \theta^*_1, \dots, \hat \theta^*_n$ has an empirical distribution $F^*$ which approximates $F$. We can then compute $$ \delta^*_i = \hat \theta^*_i - \hat \theta $$

Notice that the unknown $\mu$ is replaced by our best guess for it, $\hat \theta$. We then take the quantiles of the empirical distribution of $\delta^*$ and compute $\left[\hat \theta - \delta^*_{.1}, \hat \theta - \delta^*_{.9} \right]$.
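Continuing the sketch above (it reuses `boot_stats` and `theta_hat` from the earlier code block), one hedged way to turn the resampled $\delta^*_i$ into the 80% interval described here is:

```python
deltas = boot_stats - theta_hat        # delta*_i = theta*_i - theta_hat

# In the notation above, delta_.1 is the point with 10% of the distribution
# above it (the 90th percentile) and delta_.9 the point with 90% above it.
d_upper = np.quantile(deltas, 0.9)     # estimate of delta_.1
d_lower = np.quantile(deltas, 0.1)     # estimate of delta_.9

ci_80 = (theta_hat - d_upper, theta_hat - d_lower)
print(ci_80)
```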

Now, this explanation is heavily based on this MIT class on the bootstrap. I highly recommend you give it a read.

  • Please consider editing your answer to clarify what you mean by the statement "your point estimate is fine." One of the things you nicely demonstrate is how the bootstrap can be used to estimate (and thus correct for) the bias in the point estimate. Also, there might be some difference in terminology between what's in the MIT course to which you link and others' terminologies; see this page (linking to the same course) for details, and for extensive discussion on different ways to use bootstrapping to estimate confidence intervals.
    – EdM
    Commented Oct 30, 2020 at 16:21
  • Thanks for the response. I'd like to second the request made in the other comment: what do you mean by "the point estimate is fine"? Specifically, if I were to argue against someone who used the mean of the resampled estimates as a "new" estimate, what reasoning could I point to, to say why that doesn't make sense? Also, is it true, as another commenter said, that the mean of the resampled statistics tends towards the original point estimate? It seems intuitive.
    – Keshinko
    Commented Nov 2, 2020 at 3:04
  • Never mind, I found a very nice explanation here
    – Keshinko
    Commented Nov 2, 2020 at 3:29

That's not OK. You would need to use the double bootstrap to get a correct confidence interval from a new estimator that is a function of many bootstrap estimates. The bootstrap was not created to provide new estimators, except in rare cases such as the Harrell-Davis quantile estimator. The main function of the bootstrap is to study the performance of an existing estimator, or to tell how bad the estimator is (e.g., in terms of variance or bias). The bootstrap can also provide confidence intervals for strange quantities such as the number of modes in a continuous distribution.
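As a rough, hedged illustration of the point (my own toy example, not necessarily the exact double-bootstrap scheme intended here): if you define a "new estimator" as the mean of many bootstrap replicates, then assessing its sampling variability honestly requires a second, outer level of resampling, with the whole inner bootstrap re-run on each outer resample.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.exponential(scale=2.0, size=50)   # toy data

def mean_of_bootstrap_means(sample, B=200):
    """The 'new estimator': the average of B bootstrap replicates of the sample mean."""
    reps = [rng.choice(sample, size=sample.size, replace=True).mean() for _ in range(B)]
    return float(np.mean(reps))

# Outer level: resample the data and re-run the entire inner bootstrap each time.
outer_reps = np.array([
    mean_of_bootstrap_means(rng.choice(data, size=data.size, replace=True))
    for _ in range(200)
])

# Only this nested (double) resampling gives an approximate sampling
# distribution for the bootstrap-based estimator itself.
print(mean_of_bootstrap_means(data), outer_reps.std(ddof=1))
```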


The reason you typically take the statistic calculated from all the data as your point estimate is that (at least for a mean), as the number of bootstrap samples goes to infinity, you will get that same answer. That is, any deviation is just due to the finite number of bootstrap samples, and you might as well use the known exact answer.
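A small sketch of this in Python/NumPy (arbitrary simulated data, purely to illustrate the convergence claim):

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=10.0, scale=3.0, size=40)
print("sample mean:", data.mean())

# The average of the bootstrap means drifts toward the ordinary sample mean as B grows.
for B in (100, 1_000, 10_000):
    boot_means = [rng.choice(data, size=data.size, replace=True).mean() for _ in range(B)]
    print(B, np.mean(boot_means))
```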

In the second part of your question, what do you mean by calculating the confidence interval around the mean "using the original data"? The main reason you use bootstrapping is usually that there's no simple formula for getting a CI directly from the original data. If you mean taking the variation in the original data (e.g. 1.96 $\times$ SD of the original data), then that's not a confidence interval for the mean, but rather an interval that describes the variation in the outcome itself.
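To illustrate the distinction (made-up numbers; the normal-theory formulas are only for contrast with the bootstrap): the first interval shrinks with $\sqrt{n}$ because it targets the mean, while the second describes the spread of individual observations and does not.

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(loc=5.0, scale=2.0, size=100)

m, sd, n = x.mean(), x.std(ddof=1), x.size

ci_for_mean    = (m - 1.96 * sd / np.sqrt(n), m + 1.96 * sd / np.sqrt(n))  # ~95% CI for the mean
spread_of_data = (m - 1.96 * sd,              m + 1.96 * sd)               # spread of the outcomes themselves
print(ci_for_mean, spread_of_data)
```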

  • Another answer (and my understanding of bootstrapping) says that the bootstrap samples will approximate the distribution of the statistic. Are you saying that the original statistic (in this case the mean) is always at the mean of the bootstrapped distribution? Then how would the bias-adjusting method in the other answer work?
    – Keshinko
    Commented Oct 31, 2020 at 0:27

On the first question: if the statistic you are interested in is not the mean, then there are cases where taking the mean statistic from all the resampling trials is arguably better than taking the single statistic from the original trial.

For example, suppose you are interested in the median of a distribution. The distribution turns out to be bimodal with narrow peaks at 0 and 1. You have 99 points in your sample, of which 50 are near 0 and 49 are near 1. It's too close to call whether the population median is nearer 0 or 1. Your sample median is close to 0, but if you wanted to minimise the mean squared error of your estimate of the population median, you would want your estimate to be something close to 0.5.
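A hedged simulation of this example (the exact peak widths are my own choice; only the 50/49 split comes from the description above): the sample median sits near 0, while the average of the resampled medians lands roughly between the two peaks.

```python
import numpy as np

rng = np.random.default_rng(3)
# 50 points in a narrow peak near 0 and 49 in a narrow peak near 1
sample = np.concatenate([rng.normal(0.0, 0.01, 50), rng.normal(1.0, 0.01, 49)])

print("sample median:", np.median(sample))            # close to 0

boot_medians = [np.median(rng.choice(sample, size=sample.size, replace=True))
                for _ in range(5_000)]
print("mean of bootstrap medians:", np.mean(boot_medians))   # roughly between the peaks
```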

