$\begingroup$

While fitting linear mixed models, I would like to avoid zero random-effect estimates (ranef(model)) and zero cluster-level SD estimates (as.data.frame(VarCorr(model))[1,5]). The lme4 R package can produce these in certain two-level scenarios, even though the simulated sampled populations are clustered and heterogeneous. The typical sign of these zero outputs is a singular fit. Another frustrating feature of the lme4 models is that the rate of singular fits seems to increase in some cases with the number of clusters. It is counterintuitive that with more input information the models give up and do not estimate any cluster-level variation at all, even though the simulated sampled populations are heterogeneous (e.g. ICC 0.05). As a specific background detail, I would like to obtain at least tiny ranef(model) and as.data.frame(VarCorr(model))[1,5] values even if the populations are truly homogeneous (ICC 0.00), because I apply a particular bootstrapping procedure to the fitted models which requires non-zero values. My target model is continuous_dependent ~ dichotomous_predictor + (1|cluster).
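To make the setting concrete, here is roughly the kind of simulation I am describing (the variable names, the fixed-effect size 0.5, and the sample sizes are illustrative; the residual SD is fixed at 1, so the cluster SD implied by an ICC is sqrt(ICC / (1 - ICC))):

```r
library(lme4)

set.seed(1)
n_clusters   <- 10
cluster_size <- 10
cluster_sd   <- sqrt(0.05 / 0.95)  # residual SD 1 -> ICC = 0.05

# Two-level population with a dichotomous predictor
data <- data.frame(
  cluster = rep(seq_len(n_clusters), each = cluster_size),
  dichotomous_predictor = rbinom(n_clusters * cluster_size, 1, 0.5)
)
data$continuous_dependent <- 0.5 * data$dichotomous_predictor +
  rnorm(n_clusters, 0, cluster_sd)[data$cluster] +  # cluster intercepts
  rnorm(nrow(data), 0, 1)                           # residuals

model <- lmer(continuous_dependent ~ dichotomous_predictor + (1 | cluster), data)
isSingular(model)                    # TRUE signals the zero-estimate problem
as.data.frame(VarCorr(model))[1, 5]  # cluster-level SD estimate
```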

As an alternative suggested in the lme4 documentation for managing these issues, one can use the blme R package, described as a "partially Bayesian method". After testing, I can verify that with the default prior over the covariance of the random effects (the cov.prior = wishart argument) the overall success rate of the model fitting improved significantly with the blme package; by success I mean non-zero ranef(model). However, the cluster-level SD estimates go off the chart with a small number of clusters, such as two, with both cov.prior = wishart and cov.prior = gamma (let's rule out the basic suggestion "acquire more measurements/clusters" at this point). cov.prior = invwishart and cov.prior = invgamma prominently underestimated the cluster-level SDs with a small number of sampled clusters when I simulated heterogeneous populations (ICC 0.10-0.98). On the other hand, the cluster-level SD estimates from the lme4 models that survived the different numbers of clusters, cluster sizes, and ICCs without singular fits were more balanced, matching the true cluster-level population SDs better in general across the simulations I made.
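For reference, the four priors I compared differ only in the cov.prior argument. A minimal sketch (using the sleepstudy data shipped with lme4 as a stand-in; my actual runs used my own simulated clustered populations):

```r
library(blme)
data(sleepstudy, package = "lme4")

# Same random-intercept model, the four covariance priors discussed above
form <- Reaction ~ Days + (1 | Subject)
fits <- list(
  wishart    = blmer(form, sleepstudy, cov.prior = wishart),  # blme default
  gamma      = blmer(form, sleepstudy, cov.prior = gamma),
  invwishart = blmer(form, sleepstudy, cov.prior = invwishart),
  invgamma   = blmer(form, sleepstudy, cov.prior = invgamma)
)

# Cluster-level (here: Subject-level) SD estimate under each prior
sapply(fits, function(m) as.data.frame(VarCorr(m))[1, 5])
```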

Thus, I would be happy with the non-zero cluster-level estimates of the non-singular lme4 models if their rate were much higher. That is why I would like to find an appropriate cov.prior for the blme models that enables non-zero cluster-level estimates similar to those of the non-singular lme4 models, with an improved success rate.

Here is an example from the blme documentation of how to define a custom prior:

# Custom prior
penaltyFn <- function(sigma)
  dcauchy(sigma, 0, 10, log = TRUE)
(fm5 <- blmer(Reaction ~ Days + (0 + Days|Subject), sleepstudy,
              cov.prior = custom(penaltyFn, chol = TRUE, scale = "log")))

Please correct me if I am wrong; I am just getting familiar with the concept and haven't found good introductory materials. Would it be a reasonable approach, and could this penaltyFn be used to define the probability of finding the cluster means below or above the grand mean, similar to tossing a coin? Thus, if we assume that the populations are normally distributed, could we replace the dcauchy() with dbinom() like this:

penaltyFn <- function(sigma) {
  dbinom(sigma, number_of_clusters, 0.5)
}
model <- blmer(continuous_dependent ~ dichotomous_predictor + (1|cluster), data,
               cov.prior = custom(penaltyFn))

Does the principle make sense, and how about the implementation? Or could somebody suggest a better approach for fitting continuous_dependent ~ dichotomous_predictor + (1|cluster) models, given the context above? The current results are not perfect, but according to my tests the dbinom() prior enabled both an improved success rate of obtaining non-zero ranef(model) and cluster-level SD estimates closer to the true ones across the simulations. One downside is that when the simulated sampled populations were truly homogeneous (ICC and thus cluster-level SD both zero), even with 30 clusters and a cluster size of 30 the mean of the cluster-level SD estimates settled at 0.62 over 10 simulation iterations with blme, while the mean of the lme4 estimates settled at 0.15 (only 3/10 successful iterations). Non-zero estimates are wrong with homogeneous data regarding the accuracy of the analysis, but necessary in practice for the bootstrapping procedure I use. With that many measurements of minimally varying clusters, I wonder why the dbinom() prior led to higher cluster-level SD estimates. It is also possible that there is a flaw in my logic and/or implementation.

Thanks.

$\endgroup$
  • $\begingroup$ With the caveat that I don't find your argument for why you are looking to do this analysis convincing, perhaps these prior choice recommendations (from the Stan team) might be helpful. You can also do full Bayesian analysis with Stan / brms. $\endgroup$
    – dipetkov
    Commented Apr 14 at 18:02
  • $\begingroup$ Thank you very much @dipetkov, I'll look into that. One thing on my mind was how related the full-Bayesian priors are to the blme cov.priors. Could I widen my search and perhaps look for brms tutorials, or, as you proposed, Stan sources in general? With the dbinom prior my motivation was to input minimal justifiable information to help the models. I assumed that the prior relates to the probability distribution of the cluster-level residuals around the grand mean. The dbinom itself requires minimal input, while dnorm requires an SD estimate right from the beginning, which I don't have. $\endgroup$
    – Imsa
    Commented Apr 15 at 3:41
  • $\begingroup$ I don't know {blme} and just wanted to point out some alternatives for Bayesian analysis. {brms} is designed to have a formula interface similar to {lme4}, but there will still be a lot to learn to use it and the related diagnostics tools. Also, it's easier (I think) to get help if you can provide data (real or simulated) with the question. $\endgroup$
    – dipetkov
    Commented Apr 15 at 5:49
  • $\begingroup$ @dipetkov cheers. Setting aside my other detailed explanations, it would be a good start if someone could explain in layman's terms what the cov.prior actually does in blme. I improvised the dbinom prior rather quickly and was preliminarily happy with how the model fittings and estimates turned out. However, without deeper, easy-to-digest information about what actually happens under the hood, I cannot rely on the current implementation. $\endgroup$
    – Imsa
    Commented Apr 15 at 10:02

1 Answer

$\begingroup$

Here are some experiments and observations, related also to this discussion, explaining what the sigma variable passed to the cov.prior actually represents.

The divergent idea of using dbinom() or similar priors proposed in this question should, as far as I currently understand the problem, be implemented using priors on the fixed effects. However, blmer() does not allow custom priors for the fixed effects. Thus, I believe the implementation of my preliminary idea is not possible with the current blme package. If someone is interested in developing the approach further, I would be happy to be contacted about possible cooperation.
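Concretely, one way to inspect what blme hands to a custom cov.prior is a flat prior function that merely records its argument (probe and seen are illustrative names; the chol/scale settings mirror the documented custom() example):

```r
library(blme)
data(sleepstudy, package = "lme4")

seen <- list()
probe <- function(x) {
  seen[[length(seen) + 1]] <<- x  # record whatever blme passes in
  0                               # flat prior: log penalty of 0 everywhere
}
blmer(Reaction ~ Days + (1 | Subject), sleepstudy,
      cov.prior = custom(probe, chol = TRUE, scale = "log"))
str(seen[[1]])  # inspect the recorded argument
```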

$\endgroup$
