
Gelman & Hill (Section 12.2, pp. 252-259) discuss "no-pooling" (single-level) and "partial-pooling" (multilevel) regression with no predictor.

In almost all texts on mixed-effects (i.e., partial-pooling) models (e.g., pp. 4-6 of this book), one of the advertised advantages of such methods is that they produce larger standard errors ($SE$s) for regression coefficient estimates than their non-multilevel counterparts.

Question: Below, I compare a partial-pooling and a no-pooling model. However, I see that the partial-pooling model has a far smaller $SE$. Why am I seeing the opposite of what the texts claim?

set.seed(0)                            # Make the following reproducible
groups <- gl(20, 10)                   # 20 grouping indicators each of length 10 (20 classes each with 10 students)
design <- model.matrix(~groups-1)      # Design matrix
   U0j <- rnorm(20, 0, 20)             # Random intercept deviations each for a classroom
   eij <- rnorm(length(groups), 0, 30) # Common error term for observations
     y <- 1629 + design%*%U0j + eij    # Response variable

#=====Analysis:

no_pooling <- lm(y~groups-1)
(SE_no_pooling <- sqrt(diag(vcov(no_pooling))))

#> 8.864905 # for all groups

library(lme4)                          # needed for lmer()
partial_pooling <- lmer(y ~ 1 + (1|groups))
(SE_partial_pooling <- sqrt(diag(vcov(partial_pooling))))
 
#>  0.2443936 # for intercept

1 Answer


I think that you might be confusing "no pooling" with "complete pooling." The former is represented by the no_pooling model: it is an alternative way to deal with multilevel data, treating the clusters as a fixed population rather than as a random sample of similar clusters, which is what partial_pooling assumes. In a complete-pooling model, cluster membership is ignored entirely.
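Such a model can be fit as follows (a minimal sketch: I'm assuming the question's simulated y and groups have been gathered into a data frame named df, since that is the name the output below refers to; the construction step itself is my addition):

df <- data.frame(y = as.vector(y), groups = groups)  # collect the question's simulated data
complete_pooling <- lm(y ~ 1, data = df)             # intercept only: cluster membership ignored
summary(complete_pooling)

This produces: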

Call:
lm(formula = y ~ 1, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-73.903 -23.997   0.006  21.714  98.714 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 1628.976      2.383   683.6   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 33.7 on 199 degrees of freedom

The standard error for the intercept is 2.383. In contrast, the standard error for the intercept in the partial_pooling model is 4.716.
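For reference (my reconstruction: the original post shows only the output), that number comes from refitting the question's mixed model on the same data frame, with library(lme4) loaded:

partial_pooling <- lmer(y ~ 1 + (1 | groups), data = df)
summary(partial_pooling)

which gives: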

Linear mixed model fit by REML ['lmerMod']
Formula: y ~ 1 + (1 | groups)
   Data: df

REML criterion at convergence: 1929.7

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.1039 -0.7621 -0.1037  0.6983  2.8887 

Random effects:
 Groups   Name        Variance Std.Dev.
 groups   (Intercept) 366.2    19.14   
 Residual             785.9    28.03   
Number of obs: 200, groups:  groups, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept) 1628.976      4.716   345.4

Thus the complete-pooling model, by ignoring the correlation of y-values within clusters, assumes that all individuals are independent, and it estimates a standard error consistent with that assumption. The partial_pooling model is designed for exactly this problem and, as such, appropriately adjusts the standard-error estimate by weighting the effective sample size. I will try to come back and put in the standard-error calculations for the three models.

Edit: The three standard errors, as promised. These are for the balanced case, where $n_j = n$ for each of the $J$ clusters. $\hat\psi$ is the level-2 (between-cluster) variance and $\hat\theta$ is the level-1 (within-cluster) variance. The mixed-model $\widehat{SE}$ will vary slightly for unbalanced group sizes:

$\widehat{SE}(\hat{\beta}^{OLS}) \approx \sqrt{\dfrac{\hat\psi + \hat\theta}{Jn}}$

$\widehat{SE}(\hat{\beta}^{Mixed}) = \sqrt{\dfrac{\hat\psi + \dfrac{\hat\theta}{n}}{J}}$

$\widehat{SE}(\hat{\beta}^{NoPool}) = \sqrt{\dfrac{\hat\theta}{Jn}}$
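
As a quick numeric check (my addition), plugging the variance estimates from the lmer output above ($\hat\psi = 366.2$, $\hat\theta = 785.9$) together with $J = 20$ and $n = 10$ into these formulas reproduces the printed standard errors. Note that the no-pooling formula gives the $SE$ of the grand mean across the fixed cluster means; the per-cluster $SE$s reported in the question are instead $\sqrt{\hat\theta / n} \approx 8.86$:

psi   <- 366.2   # between-cluster variance (level 2), from the lmer output
theta <- 785.9   # within-cluster variance (level 1), from the lmer output
J <- 20; n <- 10

sqrt((psi + theta) / (J * n))  # complete pooling: ~2.40 (summary shows 2.383)
sqrt((psi + theta / n) / J)    # mixed model:       4.716 (matches the summary)
sqrt(theta / (J * n))          # no pooling (grand mean): ~1.98
sqrt(theta / n)                # per-cluster SE, as in the question: ~8.86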

