
I am confused by the likelihood ratio test's boundary-condition limitation. A commonly stated example is that it causes problems for variance parameters because they are bounded below by 0. Can these models be compared with the likelihood ratio test:

$Y\sim \text{Normal}(a,2)$
$Y\sim \text{Normal}(a,b)$

where $a$ and $b$ are parameters? How about these:

$Y\sim \text{Normal}(a,b)$
$Y\sim \text{Normal}(a,b+cx)$

where $a$, $b$, and $c$ are parameters and $x$ is a covariate? If the likelihood ratio test cannot be used, how should they be compared using null hypothesis significance testing?

A related question: I think the likelihood ratio test is commonly used in binomial GLMs even though the probability parameter is bounded both below and above. Why is it OK to use it for the probability parameter?

  • For your second question, re the GLM, you're not testing the probability but the model parameters, which on the link scale are unbounded. (Commented May 28 at 5:52)
  • @Gavin Simpson Then if we use the log link for the variance of the normal distribution, the parameter will become unbounded and the problem is solved? – quibble (Commented May 28 at 5:56)
  • Yes; there wouldn't be a problem if you were fitting this as a distributional model / LSS model and your parameter of interest was estimated on the log scale (except you do have to be a little bit careful to ensure the estimated variance is not 0). (Commented May 28 at 13:36)

1 Answer


The problem happens only when the boundary point is close enough to the null hypothesis to be relevant to the distributions.

Asymptotically, that means the null hypothesis just needs to be off the boundary: the first example would work as long as $b>0$ and the second example would work as long as $\min_x (b+cx)$ is greater than zero (rather than equal to zero). As $n$ increases, the distributions of $\hat b$ (or $\hat b$ and $\hat c$) would concentrate closer and closer to the true value. If $b=0.1$ and $se(\hat b)=0.001$ then the fact there's a boundary ten standard errors away isn't going to mess up inference; 10 standard errors away is well over the horizon.

In finite samples you want a bit more than that. In the first example you want $b$ to be large enough that 'most' of the distribution of $\hat b$ around $b$ is away from the boundary point. In the second example you want 'most' of the distribution of $\min_x (b+cx)$ to be away from zero.
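
As a rough illustration (a minimal simulation sketch in Python; the sample size, mean, and number of replications are arbitrary): when the null value $b=2$ is well away from the boundary at 0, the likelihood ratio statistic comparing $\text{Normal}(a,2)$ with $\text{Normal}(a,b)$ behaves like the usual $\chi^2_1$ reference distribution.

```python
# Hedged sketch: simulate the LRT for H0: Var(Y) = 2 vs. a free variance,
# with the true variance at the null value and far from the boundary at 0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, n_sim, true_var = 200, 2000, 2.0          # arbitrary illustrative choices

lrt = np.empty(n_sim)
for i in range(n_sim):
    y = rng.normal(loc=1.0, scale=np.sqrt(true_var), size=n)
    a_hat = y.mean()                          # MLE of the mean, same under both models
    b_hat = y.var(ddof=0)                     # MLE of the variance under the alternative
    ll0 = stats.norm.logpdf(y, loc=a_hat, scale=np.sqrt(true_var)).sum()  # Normal(a, 2)
    ll1 = stats.norm.logpdf(y, loc=a_hat, scale=np.sqrt(b_hat)).sum()     # Normal(a, b)
    lrt[i] = 2 * (ll1 - ll0)

# Rejection rates should be close to the nominal levels of the chi2(1) reference.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, (lrt > stats.chi2.ppf(1 - alpha, df=1)).mean())
```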

If in fact $b>0$ you could also make the problem go away by reparametrising to $\gamma=\log b$. Reparametrising is a little more tricky for the second example. It's important to note that reparametrising only helps if $b=0$ is in fact impossible; if $b=0$ is possible then you genuinely have a boundary problem and there's no way to make it go away.
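
As a rough sketch of that reparametrisation for the first example (Python with SciPy; the data and starting values are illustrative): optimising over $\gamma=\log b$ puts the optimiser on an unbounded scale, while the maximised likelihood, and hence the likelihood ratio statistic, is unchanged.

```python
# Hedged sketch: fit Normal(a, b) by maximum likelihood with gamma = log(b),
# so the boundary b = 0 corresponds to gamma = -infinity and never gets hit.
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
y = rng.normal(loc=1.0, scale=np.sqrt(2.0), size=200)     # illustrative data

def negloglik(params):
    a, gamma = params                                      # gamma = log(variance), unbounded
    return -stats.norm.logpdf(y, loc=a, scale=np.exp(gamma / 2)).sum()

fit = optimize.minimize(negloglik, x0=np.array([0.0, 0.0]), method="BFGS")
a_hat, b_hat = fit.x[0], np.exp(fit.x[1])                  # back-transform to the variance

ll1 = -fit.fun                                                        # free-variance model
ll0 = stats.norm.logpdf(y, loc=y.mean(), scale=np.sqrt(2.0)).sum()    # variance fixed at 2
print("a_hat:", a_hat, "b_hat:", b_hat, "LRT statistic:", 2 * (ll1 - ll0))
```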

  • Reparametrizing a parameter does not affect the likelihood, so I do not see how it would help. Also, how is it determined whether or not $b=0$ is possible? For example, in binomial GLMs, is it somehow established that $p=0$ and $p=1$ are impossible, so that using the LRT is not a problem? – quibble (Commented May 28 at 7:10)
  • It doesn't help fundamentally. It helps in two useful ways. First, computationally: it makes sure the boundary doesn't affect the optimisation of the likelihood. Second, a scale on which the parameter is unbounded is likely to give better Normal approximations. This doesn't matter so much for the likelihood ratio test, but it does for other aspects of inference. (Commented May 28 at 7:19)
