I'm in a debate with a coworker, and I'm starting to wonder if I'm wrong, but the internet is confusing me more.

We have continuous data on $[0, \infty)$ that are retrospectively selected on individuals. The selection is non-random. Our sample sizes are $\approx 1000$. Our data are heavily skewed, with most of the mass towards the left and some strong bumps out in the tail.

My strategy is to look at the distribution of the data before running statistical tests between two groups, via histograms, Q-Q plots, and the Shapiro-Wilk test. If the data are approximately normal, I use an appropriate test (t-test, ANOVA, linear regression, etc.). If not, I use an appropriate non-parametric method (Mann-Whitney test, Kruskal-Wallis, bootstrap regression model).

My coworker doesn't look at the distribution; if the sample size is >30 or >50, he automatically assumes it is normal and cites the central limit theorem to justify using the t-test or ANOVA.

They cite the paper "t-tests, non-parametric tests, and large studies—a paradox of statistical practice?" and say that I'm over-using non-parametric tests. My understanding is that my method tells me whether it is appropriate to rely on the normal distribution, because I thought that for heavily skewed data the sample size needed before the sampling distribution is approximately normal is higher. I know that given a large enough sample size it would eventually get there, but especially for the smaller sample sizes isn't it better to check? To me it makes sense that, since multiple tests show the data aren't normal, it is inappropriate to rely on the normal distribution.

Also, if a sample size of 30 were all you needed to assume normality, why is so much work done on other distributions in statistical software? Everything would be either normal or non-parametric; why bother with binomial or gamma distributions? However, they keep sending me papers about the central limit theorem, and now I'm not so sure. Maybe I am wrong and I shouldn't bother checking these assumptions.
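For what it's worth, the kind of check I have in mind could be run as a quick simulation, e.g. the sketch below (purely for illustration it assumes a lognormal with sigma = 1.5 as a stand-in for heavily skewed data on $[0, \infty)$, and uses a one-sample test just to keep the sketch short; these are not our actual numbers). It estimates how far the t-test's actual type I error rate sits from the nominal 5% at different sample sizes:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Assumed stand-in for heavily skewed data on [0, inf): lognormal(0, 1.5).
sigma = 1.5
true_mean = np.exp(sigma**2 / 2)  # exact population mean of this lognormal

def type1_error_rate(n, n_sim=20_000, alpha=0.05):
    """One-sample t-test against the TRUE population mean, so every
    rejection is a type I error; an accurate test rejects ~5% of the time."""
    x = rng.lognormal(mean=0.0, sigma=sigma, size=(n_sim, n))
    t = (x.mean(axis=1) - true_mean) / (x.std(axis=1, ddof=1) / np.sqrt(n))
    p = 2 * stats.t.sf(np.abs(t), df=n - 1)
    return np.mean(p < alpha)

for n in (30, 50, 200, 1000):
    print(f"n = {n:4d}: empirical type I error = {type1_error_rate(n):.3f}")
```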

Who is right and why?

  • Show your coworker some examples, such as the (extreme) one discussed at stats.stackexchange.com/questions/69898. It clearly controverts the paper's overly general conclusion that "For studies with a large sample size, t-tests and their corresponding confidence intervals can and should be used even for heavily skewed data." This conclusion is based on (extremely) limited simulations and one case study, and then justified with "the t-test is robust even to severely skewed data and should be used almost exclusively." That is, at best, a statistically naive statement.
    – whuber
    Commented Oct 23, 2020 at 16:15
  • You seem to be making the mistake of thinking that the central limit theorem says that your data will converge to a normal distribution. That is false: stats.stackexchange.com/questions/473455/…
    – Dave
    Commented Oct 23, 2020 at 16:25
  • @sean Could you explain why "automatically assuming [a sample] is normal" when its size exceeds 50 is in any sense "essentially correct"?? Even supposing the colleague is really referring to the sampling distribution of a statistic, very strong assumptions (which are not in evidence here) are needed to support any such conclusion.
    – whuber
    Commented Oct 23, 2020 at 19:05
  • More like NEVER. The CLT may lead to normally distributed $\bar X,$ but if data are not normal then $\bar X$ and $S$ cannot be independent, and the "t-statistic" cannot have a Student t distribution. [Note: $\bar X$ and $S$ are independent only for normal data.] // If your co-worker is your boss, maybe best to tread softly; otherwise, maybe let him know his information is from uninformed sources.
    – BruceET
    Commented Oct 24, 2020 at 6:02
  • @BruceET, you cannot have a t-distribution, but maybe you get something that is close to it.
    Commented Oct 24, 2020 at 13:59

2 Answers


My strategy is to look at the distribution of the data before running statistical tests between two groups, via histograms, Q-Q plots, and the Shapiro-Wilk test. If the data are approximately normal, I use an appropriate test (t-test, ANOVA, linear regression, etc.). If not, I use an appropriate non-parametric method (Mann-Whitney test, Kruskal-Wallis, bootstrap regression model).

What is 'approximately normal'? Do you need to pass a hypothesis test to be sufficiently approximately normal?

A problem is that those tests for normality become more powerful (more likely to reject normality) as the sample size increases, and can reject even for very small deviations. Ironically, for larger sample sizes deviations from normality matter less.
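As a rough illustration of that point (a minimal sketch; the t-distribution with 10 degrees of freedom below is only an assumed example of a small deviation from normality that is usually of little practical consequence for a t-test):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def shapiro_rejection_rate(n, n_sim=1000, alpha=0.05):
    """How often the Shapiro-Wilk test rejects normality for samples drawn
    from a t-distribution with 10 df (a mild deviation from normality)."""
    rejections = 0
    for _ in range(n_sim):
        x = rng.standard_t(df=10, size=n)
        w, p = stats.shapiro(x)
        rejections += p < alpha
    return rejections / n_sim

# The deviation stays the same, but the rejection rate grows with n.
for n in (30, 100, 300, 1000):
    print(f"n = {n:4d}: Shapiro-Wilk rejection rate ~ {shapiro_rejection_rate(n):.2f}")
```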

My coworker doesn't look at the distribution; if the sample size is >30 or >50, he automatically assumes it is normal and cites the central limit theorem to justify using the t-test or ANOVA.

Can we ALWAYS assume normal distribution if n >30?

It is a bit strong to say 'always'. Also, it is not correct to say that normality can be assumed; instead, we can say that the impact of the deviation from normality may be negligible.

The problem that the article by Morten W. Fagerland addresses is not whether the t-test works when n > 30 (it does not work so well for n = 30, which can also be seen in the article's figure; it requires large numbers, like the sample size of 1000 used in their table). The problem is that a non-parametric test like the Wilcoxon-Mann-Whitney (WMW) test is not the right solution, because the WMW test answers a different question: it is not a test for equality of means or medians.
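To make "a different question" concrete, here is a minimal sketch (the gamma distributions are only an assumed illustration, not the data in the question): the two groups have identical population means, so a test of means has nothing to detect, yet the WMW test reacts to the difference in shape because it targets P(A > B) rather than a difference in means.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 1000

# Two groups with IDENTICAL population means (both 1) but very different
# shapes: A is heavily right-skewed, B is nearly symmetric.
a = rng.gamma(shape=0.5, scale=2.0, size=n)   # mean = 0.5 * 2.0  = 1
b = rng.gamma(shape=4.0, scale=0.25, size=n)  # mean = 4.0 * 0.25 = 1

t_test = stats.ttest_ind(a, b, equal_var=False)          # Welch t-test: compares means
wmw = stats.mannwhitneyu(a, b, alternative="two-sided")   # WMW: compares P(A > B) with 1/2

# Rescaled, the U statistic estimates P(A > B) (plus half the tie probability).
p_a_greater_b = wmw.statistic / (n * n)

print(f"mean(A) = {a.mean():.3f},  mean(B) = {b.mean():.3f}")
print(f"Welch t-test p-value: {t_test.pvalue:.3f}")
print(f"WMW test p-value:     {wmw.pvalue:.3g}")
print(f"Estimated P(A > B):   {p_a_greater_b:.3f}")
```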

The article does not say to 'never' use the WMW test, or to always use a t-test.

Is the WMW test a bad test? No, but it is not always an appropriate alternative to the t-test. The WMW test is most useful for the analysis of ordinal data and may also be used in smaller studies, under certain conditions, to compare means or medians.

Depending on the situation, a person might always use a t-test without checking normality, because of experience with the distributions that occur in that setting. Sure, one can think of examples/situations where t-tests on samples of 30 or 50 are a lot less powerful (give too-high p-values), but if you never deal with those examples then you can always use a t-test.


Something else.

If you have a sample size of 1000, then you might consider that the mean is not the only thing that matters and look at more than just differences in means. In that case a WMW test is actually not a bad idea.


The data does NOT get closer to being normally distributed as the sample size grows.

Rather, the thing that gets closer to being normally distributed is the sample mean or the sample sum.

And if the population distribution is very skewed then you may need far more than $30$; if it isn't, then maybe $10$ would be enough.
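A small simulation makes the distinction concrete (a minimal sketch; the lognormal below is only an assumed example of a very skewed population):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def skewed_population(size):
    """An assumed example of a very skewed population: lognormal(0, 1.5)."""
    return rng.lognormal(mean=0.0, sigma=1.5, size=size)

# The raw data stay just as skewed no matter how many observations you collect...
print(f"skewness of the raw data (n = 100000): {stats.skew(skewed_population(100_000)):.1f}")

# ...but the sampling distribution of the MEAN loses its skewness roughly like
# (population skewness) / sqrt(n), which can still be slow for a very skewed population.
for n in (10, 30, 50, 500):
    means = skewed_population((20_000, n)).mean(axis=1)
    print(f"n = {n:3d}: skewness of the distribution of the sample mean = {stats.skew(means):.2f}")
```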

  • It also matters how far out in the tail you want to go. As an extreme case, genome-wide association studies work with nominal per-test thresholds on the order of $10^{-8}$ and need substantially larger sample sizes for $t$ statistics to be close enough to Normal (i.e., the tails of a $t$ get relatively heavier the further out you go).
    Commented Nov 17, 2021 at 23:51
