$\begingroup$

I am trying to figure out how to simulate bootstrap samples from a dataset with unbalanced clusters. The approach I would like to adopt is the non-parametric pairs bootstrap, which makes it easy to preserve the dependence structure within clusters.

Suppose for a moment that data were balanced (e.g., 500 mothers, each with 2 children). The two-level simulation algorithm with B iterations would be:

For $b = 1,\dots, B$,

  • Sample 500 mothers with replacement.
  • For each sampled mother, sample her 2 children without replacement (i.e., keep both of them).

Hence, the internal composition of each cluster is unchanged with respect to the initial sample, and the final sample size equals that of the original dataset ($N = 1000$).
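The two steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the data are simulated, and the names `cluster_bootstrap` and `data` are hypothetical, not taken from any package.

```python
import random

def cluster_bootstrap(clusters, rng):
    """Draw one pairs (cluster) bootstrap sample.

    clusters: dict mapping a mother id to the list of her children's records.
    Returns a list of (mother_id, children) tuples, one per draw.
    """
    ids = list(clusters)
    # Step 1: sample mothers with replacement, as many as in the original data.
    drawn = rng.choices(ids, k=len(ids))
    # Step 2: keep every child of each drawn mother. Sampling all members of
    # a cluster without replacement returns the cluster unchanged.
    return [(m, list(clusters[m])) for m in drawn]

rng = random.Random(0)
# Hypothetical balanced data: 500 mothers, 2 children each, one outcome per child.
data = {m: [rng.gauss(0, 1) for _ in range(2)] for m in range(500)}

sample = cluster_bootstrap(data, rng)
n_obs = sum(len(children) for _, children in sample)
print(len(sample), n_obs)  # 500 clusters, 1000 observations
```

Because every cluster has the same size here, each replicate contains exactly 1000 observations regardless of which mothers are drawn.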

Now, suppose that some mothers have 3 children. With the above strategy, the simulated sample will then, in general, not consist of exactly 1000 observations.

To your knowledge, are there statistical issues in this second case? If so, how would you proceed?

After reading in Davison and Hinkley's book 1 that the unbalanced-cluster case requires more advanced techniques, I searched the literature extensively, but found little or nothing about it in terms of simulation algorithms.

UPDATE

For the actual clustered bootstrap implementation in R, see this question.

1 Davison, A. C., Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge University Press.

$\endgroup$

1 Answer

$\begingroup$

With clustered data, you have 500 degrees of freedom, anyway. It does not matter that your nominal sample size may be 1005 or 1320 or whatever the number will be. The sampling variance of your estimates will generally improve only to the extent that you increase the number of clusters. So I would not see the random sample size as an issue.
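To illustrate the point, here is a small Python sketch, assuming simulated data in which each of 500 mothers has 2 or 3 children (the function `bootstrap_means` and all data are hypothetical). Whole clusters are resampled, so the number of clusters stays at 500 in every replicate while the number of observations varies.

```python
import random
import statistics

rng = random.Random(1)
# Hypothetical unbalanced data: 500 mothers, each with 2 or 3 children.
data = {m: [rng.gauss(0, 1) for _ in range(rng.choice([2, 3]))]
        for m in range(500)}

def bootstrap_means(clusters, B, rng):
    """Resample whole clusters B times; return replicate means and sizes."""
    ids = list(clusters)
    means, sizes = [], []
    for _ in range(B):
        # Always draw as many clusters as the original sample contains.
        drawn = rng.choices(ids, k=len(ids))
        values = [y for m in drawn for y in clusters[m]]
        means.append(sum(values) / len(values))
        sizes.append(len(values))  # random, because cluster sizes differ
    return means, sizes

means, sizes = bootstrap_means(data, B=200, rng=rng)
print(min(sizes), max(sizes))   # the observation count differs across replicates
print(statistics.stdev(means))  # bootstrap standard error of the overall mean
```

The replicate sizes fluctuate around the original sample size, but each replicate is still built from exactly 500 independent draws of clusters, which is what drives the precision of the estimate.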

I have written cluster bootstrap code in Stata, see http://www.stata-journal.com/article.html?article=st0187.

$\endgroup$
  • $\begingroup$ To your knowledge, has there been any update to the code to do this in R? $\endgroup$
    – RNB
    Commented May 30, 2017 at 8:56
  • $\begingroup$ You can check this question for the implementation in R $\endgroup$ Commented Jun 1, 2017 at 23:05
