
The Wilcoxon signed-rank test is generally used for non-parametric data (i.e. not normally distributed). When the sample gets large, the data will be approximately normally distributed. Therefore there is no need to use the Wilcoxon signed-rank test, and a parametric test would be preferred.

Considering this, the Wilcoxon signed-rank test would be most appropriate where we cannot get a large sample, and the sample is not normally distributed. (Please correct me if I am wrong).

Could you suggest to me a couple of business use cases for the use of the Wilcoxon signed-rank test?

  • You have a common misconception about the central limit theorem. A question of mine has several nice answers explaining why it is false. – Dave, Sep 22, 2022 at 15:59
  • Your main point "the Wilcoxon signed-rank test would be most appropriate where we cannot get a large sample, and the sample is not normally distributed" is absolutely correct. It is also useful when we have large samples and the data is very far from normally distributed. The first use case which jumps out at me would be A/B testing. – John Madden, Sep 22, 2022 at 16:00
  • @JohnMadden The reason I said that it is most appropriate in instances where we cannot get a large sample is because of the central limit theorem, which I seem to have misunderstood according to the other comments. What is the actual reason for that? – S. Tiss, Sep 22, 2022 at 16:04
  • @S.Tiss Nah, it is indeed because of the CLT; your misunderstanding (or perhaps simply a misstatement, given your comment on Eoin's post, which my colleagues here seemed maybe just a little bit too excited to jump on?) does not affect the validity of your conclusion in this case (though it's true the reasoning you used is slightly off). – John Madden, Sep 22, 2022 at 16:18
  • 1. Data are neither parametric nor nonparametric – data don't have parameters at all, models do. Models can be parametric or not, and the term 'nonparametric' does not mean 'not normally distributed'. There's an infinite variety of models that are parametric but not normal. 2. "When the sample gets large, the data will be approximately normally distributed" ... no. The sample size doesn't change the distribution you're sampling from, and it's that distribution you need to worry about for assumptions. The empirical distribution of data (sampled at random) will eventually approach the ...ctd – Glen_b, Sep 22, 2022 at 23:18

4 Answers

Answer (score 5)

"When the sample gets large, the data will be approximately normally distributed."

This is absolutely not true, so the rest of the question is based on a false premise. There is no reason to expect a large sample to be any closer to Normal than a small one. Perhaps you've misunderstood the central limit theorem?
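
For example, a quick simulation sketch in Python (the log-normal population and the sample sizes below are arbitrary choices, just for illustration) shows that a single large sample from a skewed population stays skewed, while the means of many such samples are close to normal:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# One large sample from a skewed (log-normal) population is still skewed:
# sample size does not change the shape of the distribution being sampled.
big_sample = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)
print("skewness of one large sample:", stats.skew(big_sample))          # far from 0

# What the CLT actually describes: the distribution of *sample means*
# approaches normality as the sample size grows.
means = rng.lognormal(mean=0.0, sigma=1.0, size=(10_000, 200)).mean(axis=1)
print("skewness of 10,000 sample means (n = 200):", stats.skew(means))  # much closer to 0
```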

  • It should be "distribution of sample means approximates a normal distribution as the sample size gets larger", right? – S. Tiss, Sep 22, 2022 at 15:57
Answer (score 1)

The gist of (good) nonparametric tests is that they are almost (but not quite) as good as parametric tests when the parametric assumptions are met, but they can blow away parametric approaches when the parametric assumptions are false.

Yes, t-tests have good robustness to deviations from normality, but this has to do with the type I error rate. The power can be quite poor, especially if you badly deviate from normality; for instance, log-normal distributions can have rather slow convergence, despite meeting the assumptions of the central limit theorem.
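
As a rough sketch of the power point (the heavy-tailed alternative, shift, and sample size below are my own arbitrary choices, not anything canonical), one can compare rejection rates of the one-sample t-test and the Wilcoxon signed-rank test by simulation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, shift, reps, alpha = 30, 0.5, 2000, 0.05

reject_t = reject_w = 0
for _ in range(reps):
    # Paired differences: symmetric but heavy-tailed (Student t, 3 df), shifted away from 0.
    d = shift + rng.standard_t(df=3, size=n)
    reject_t += stats.ttest_1samp(d, 0.0).pvalue < alpha
    reject_w += stats.wilcoxon(d).pvalue < alpha

# With heavy-tailed differences the signed-rank test typically rejects more often.
print("power, one-sample t-test:        ", reject_t / reps)
print("power, Wilcoxon signed-rank test:", reject_w / reps)
```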

I’ll close with a link of possible interest.

Answer (score 1)

  1. The OP was a bit "loose" with words when writing "When the sample gets large, the data will be approximately normally distributed." What should have been said is "When the sample gets large, the sampling distribution of the sample means will be approximately normally distributed." When using e.g. a t-test, we do not care about the original distribution being normal; we care about the sampling distribution of the statistic of interest being normal... So, yes, with large samples, parametric tests of the mean are more valid/powerful than any non-parametric test. Several commenters jumped on this a bit too fast, as the basic question remains valid (is there any valid practical use for the Wilcoxon signed-rank test?).
  2. The Wilcoxon signed-rank test (WSRt from now on) does not test the mean, or the median, as is regrettably too often written. The WSRt instead tests the pseudo-median (https://en.wikipedia.org/wiki/Pseudomedian). When the sample is symmetric, the pseudo-median is equal to the median (and the mean). But samples are NEVER exactly symmetric, so... The best wording for the null of the WSRt that I could find is that "the data is symmetric around the hypothesized target".
  3. The problem with that null is that, when you get a significant result, you have no idea what it means. It could be because the data was not symmetric, because it was symmetric but around a median different enough from the hypothesized one, or (basically all the time: all sample data will have some asymmetry) because of a combination of both.
  4. Note that when the data is symmetric "enough", then a) mean = median, and b) the CLT converges very fast because your sample is not very "non-normal".
  5. If the data is "very non-normal", it is typically also "very non-symmetric" (skewed). So the WSRt is not applicable.
  6. Everywhere you could use a WSRt, you could use a sign test (which does not rely on any distributional assumption, and if the sample is large it will have power). Or even a Mann–Whitney U test (MWUt), even for a single sample (construct an artificial sample of the same size, with all values equal to your hypothesized target). And at least there is an intuitive interpretation of significance for the sign test (median) or the MWUt (stochastic dominance); see the sketch right after this list.
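
A minimal sketch of point 6 in Python (scipy assumed; the sample and the hypothesized target are made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.lognormal(size=40)   # a skewed sample, purely for illustration
target = 1.0                 # hypothesized location

# Wilcoxon signed-rank: the null is that x - target is symmetric around 0.
print(stats.wilcoxon(x - target))

# Sign test via an exact binomial test on the signs of x - target
# (no symmetry assumption); the null is that the median of x equals the target.
n_pos = int(np.sum(x > target))
n_tot = int(np.sum(x != target))
print(stats.binomtest(n_pos, n_tot, p=0.5))

# U test against an artificial sample where every value equals the target,
# interpretable in terms of stochastic dominance.
print(stats.mannwhitneyu(x, np.full_like(x, target)))
```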

So, after all this, I have reached the conclusion that the WSRt is basically useless, at least in applied statistics (it may have some use in mathematical/theoretical statistics): the pseudo-median has no intuitive interpretation, the symmetry assumption is untestable, and the test is not robust to departures from symmetry, while a t-test is robust (with respect to type I errors) to departures from normality and has an intuitive interpretation of significance. From all I can tell, it is, at best, a historical relic.

And based on the answers to your post, no one has (yet?) presented a possible business use case for the WSRt. QED.

Answer (score 0)

One example application of the Wilcoxon signed-rank test is for comparing the performance of two classifiers across multiple datasets. See e.g. Demšar, "Statistical Comparisons of Classifiers over Multiple Data Sets" JMLR 2006.

It is argued that this is better than the (often-used) paired t-test because the t-test assumes that the per-dataset differences in performance are normally distributed and commensurate across datasets.
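
For instance (the per-dataset accuracies below are made-up numbers, only to show the call), the comparison reduces to a paired Wilcoxon signed-rank test on the two score vectors:

```python
import numpy as np
from scipy import stats

# Hypothetical per-dataset accuracies of two classifiers (one entry per dataset).
acc_a = np.array([0.812, 0.741, 0.923, 0.664, 0.881, 0.790, 0.702, 0.850, 0.773, 0.901])
acc_b = np.array([0.785, 0.752, 0.905, 0.618, 0.853, 0.799, 0.683, 0.829, 0.741, 0.872])

# Paired Wilcoxon signed-rank test on the per-dataset differences: it ranks the
# absolute differences, so it does not assume they are normally distributed or
# commensurate across datasets.
print(stats.wilcoxon(acc_a, acc_b))
```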

