4
$\begingroup$

I'm trying to polish my stats skills, and it seems to me that you have either parametric tests for normal data or non-parametric tests for non-normal data.

Looking at the t-test for instance, I don't really see the reason why a similar derivation could not be done for other known distributions. I guess some of them might be hard to do analytically, but since we have computers anyway, that should not be really a problem.

So I'm guessing that such tests exist, but they are not really taught and/or are impractical for one or more reasons. Can someone enlighten me?

$\endgroup$
1
  • 1
    $\begingroup$ I don't think this is a black or white type of problem. Many parametric tests are robust to minor or even moderate violations of the normality assumption of the data or model residuals and can still be used. It's a matter of quantum! $\endgroup$ Commented Sep 24, 2018 at 18:43

1 Answer

6
$\begingroup$

If you know that the distribution is in fact normal, then tests derived under the normal model will be optimal. The Z-test (with known variance) achieves just such a property for the one-parameter normal distribution.

Parametric tests can be derived for any old distribution with maximum likelihood. If data are Poisson, Exponential, etc., a likelihood ratio test with 1 degree of freedom can be done as a two-sample test. The link between T-tests and regression with adjustment for a binary group variable can be extended to generalized linear models, which give two-sample tests for data from a known but non-normal distribution.
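As a minimal sketch of such a test, assuming two independent exponential samples: fit one mean per group (alternative) and one pooled mean (null) by maximum likelihood, then compare twice the log-likelihood difference to a chi-square with 1 degree of freedom. The function name `exponential_lrt` is made up for illustration, and the chi-square reference distribution is the usual large-sample approximation, not an exact result.

```python
import math


def exponential_lrt(x, y):
    """Two-sample likelihood ratio test of equal exponential means.

    Returns (statistic, p_value). The p-value uses the asymptotic
    chi-square(1) reference distribution, so it is approximate.
    """
    n, m = len(x), len(y)
    mean_x = sum(x) / n                     # MLE of the mean in group 1
    mean_y = sum(y) / m                     # MLE of the mean in group 2
    mean_pooled = (sum(x) + sum(y)) / (n + m)  # MLE under the null

    # At the MLE the exponential log-likelihood is -k*log(mean) - k,
    # so the "- k" terms cancel between null and alternative.
    stat = 2.0 * ((n + m) * math.log(mean_pooled)
                  - n * math.log(mean_x)
                  - m * math.log(mean_y))
    stat = max(0.0, stat)  # guard against tiny negative rounding error

    # For 1 degree of freedom: P(chi2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


stat, p = exponential_lrt([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(f"LRT statistic = {stat:.3f}, approximate p-value = {p:.4f}")
```

The same recipe carries over to Poisson or other one-parameter families by swapping in the corresponding log-likelihood.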

It's way more interesting to think about the case where we don't know the distribution of the data. I mean, if you don't know the mean, what sense is there in saying, "I know this is a 3 parameter bimodal normal mixture model!"

The T-test also has the interesting property that it remains efficient for a general class of finite-variance distributions. This is because of the central limit theorem: the sampling distribution of the mean approaches a normal distribution, and is often close to it even in fairly small samples. Another way of describing the T-test is as an asymptotic test, because you are approximating the long-run sampling distribution of the mean.
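One way to check how quickly that approximation kicks in is a small simulation. The sketch below is an illustration only, assuming exponential data with true mean 1: it treats the one-sample t statistic as asymptotically normal and estimates how often a nominal 5% test rejects a true null at a given sample size. The helper names, the seed, and the number of replications are arbitrary choices.

```python
import random
import statistics


def t_statistic(sample, mu0):
    """One-sample t statistic for H0: mean == mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)
    return (mean - mu0) / (sd / n ** 0.5)


def rejection_rate(n, n_sims=2000, alpha=0.05, seed=0):
    """Estimate the type I error rate on skewed (exponential) data.

    Uses the normal critical value, i.e. the test is treated as an
    asymptotic test rather than an exact one.
    """
    rng = random.Random(seed)
    crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        # Exponential with rate 1 has true mean 1, so H0 is true.
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if abs(t_statistic(sample, mu0=1.0)) > crit:
            rejections += 1
    return rejections / n_sims


for n in (5, 30, 200):
    print(n, rejection_rate(n))
```

A rejection rate close to the nominal 0.05 means the normal approximation is working at that sample size; a rate well above it means the sample is still too small for this degree of skewness.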

Some test statistics, especially mins and maxes, do not converge to normal distributions, so tests based on their limiting distributions are actually compared to exponential (Huzurbazar), Gumbel, or extreme value distributions as $n \rightarrow \infty$.
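As a standard worked example (not specific to any one test), take $X_1, \dots, X_n$ i.i.d. exponential with rate 1 and let $M_n = \max_i X_i$. Centering by $\log n$, for any fixed $x$ and $n$ large enough that $e^{-x}/n \le 1$,

$$P(M_n - \log n \le x) = \left(1 - e^{-(x + \log n)}\right)^n = \left(1 - \frac{e^{-x}}{n}\right)^n \rightarrow \exp\left(-e^{-x}\right),$$

which is the Gumbel distribution function, not a normal one.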

In general, we would never apply a parametric test to data of the wrong parametric form; it is simply the less optimal solution.

$\endgroup$
8
  • $\begingroup$ So essentially you are saying that a) they do exist, they just don't specifically have a name; b) if the data are not normal, the distribution is usually not something I know anyway; and c) if it is roughly normal, then the t-test will be okay anyway, assuming I have enough data? $\endgroup$
    – fbence
    Commented Sep 24, 2018 at 18:54
  • 2
    $\begingroup$ @fbence a) Some are named, e.g. the Pearson chi-square test, but certainly not all. b) We rarely know any distribution in the wild, but the context/science gives us some background; for example, I know the distribution of neutrophils is well modeled by a negative binomial because it is a concentration. c) Usually yes. The normal approximation converges fast, and even faster when the sample(s) are unimodal, symmetric, concave, and normokurtic. $\endgroup$
    – AdamO
    Commented Sep 24, 2018 at 19:13
  • $\begingroup$ @AdamO I am curious as to what you mean by "efficiency" of statistical tests. There are several definitions; which one are you using to state that the t-test is "efficient even for a general class of finite-variance distributions"? $\endgroup$ Commented Sep 24, 2018 at 23:17
  • $\begingroup$ @PeterWestfall Asymptotically, the test is equivalent to the likelihood ratio test which is uniformly most powerful against alternatives. $\endgroup$
    – AdamO
    Commented Sep 25, 2018 at 14:41
  • $\begingroup$ @AdamO The LRT in some sense is powerful (but not UMP) for alternatives in the given family of distributions used to derive the LRT. But it is certainly not best in a power sense for other families of distributions. $\endgroup$ Commented Sep 26, 2018 at 12:19
