$\begingroup$

I am running some t-tests using data from small samples (n=25) whose distributions do not appear to be normal (they are not extremely skewed, but the skew is apparent in the plots and they fail the Shapiro-Wilk test for normality).

When I run a permutation test (to test whether the difference is zero), I notice that the p-values are all higher than when I run the two-sample t-tests. Is this always true? And if so, why?

Thanks,

Matt Brenneman

$\endgroup$
  • $\begingroup$ What statistic are you using for the permutation test? $\endgroup$
    – whuber
    Commented Feb 10, 2022 at 19:35
  • $\begingroup$ Are you using the same number of tails in both tests? t-tests are not much affected by modest non-normality of the data and it is likely that n=25 is large enough to make the t-test perform correctly. $\endgroup$ Commented Feb 10, 2022 at 20:29
  • $\begingroup$ @whuber I am doing a two-sample (unpaired) t-test with n=25 for a number of different distributions. My statistic in the t-test is the standardized sample mean under the null that the true difference in means is zero, having (n-1) df. My one concern is that the scores I am working with are themselves paired differences: they are differences in test scores in a test-retest scenario involving a treatment and control group. My concern is that because I am analyzing the differences of the scores (between the two groups), this might affect my degrees of freedom. $\endgroup$ Commented Feb 12, 2022 at 18:13
  • $\begingroup$ @MichaelLew Because I am using a one-tailed test for the permutation test, and I have to keep the comparison "apples to apples", I am using an upper tailed test with the t test. $\endgroup$ Commented Feb 12, 2022 at 18:16

1 Answer

$\begingroup$

Here are some fictitious data for illustration:

set.seed(2022)
x1 = rgamma(25, 2, 1/2)
x2 = rgamma(25, 2, 1/5)

Boxplots show some skewness and seem to show different locations.

boxplot(x1, x2, horizontal=T, col="skyblue2", pch=20)

[Figure: horizontal boxplots of x1 and x2]

Also, a Shapiro-Wilk test finds the first sample to be non-normal at the 5% level; the second sample is borderline.

shapiro.test(x1)$p.val
[1] 0.02834856
shapiro.test(x2)$p.val
[1] 0.07122469

I would not choose a t test to look for significantly different locations--especially not a pooled t test, which assumes equal variances. But it does find a highly significant difference between the two samples.

t.test(x1, x2, var.equal=TRUE)

        Two Sample t-test

data:  x1 and x2
t = -4.0511, df = 48, p-value = 0.0001854
alternative hypothesis: 
 true difference in means is not equal to 0
95 percent confidence interval:
 -7.345875 -2.472699
sample estimates:
mean of x mean of y 
 4.396683  9.305970 
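
As an aside on the equal-variance assumption: a minimal sketch of the Welch (unequal-variance) t test on the same fictitious data, for comparison with the pooled test. This is only an illustration, not a recommendation for these data. It relies on the fact that `t.test` uses `var.equal = FALSE` by default; the names `welch` and `pooled` are just labels chosen here.

```r
# Same fictitious data as above
set.seed(2022)
x1 = rgamma(25, 2, 1/2)
x2 = rgamma(25, 2, 1/5)

# Welch two-sample t test (var.equal = FALSE is the default in t.test)
welch  = t.test(x1, x2)
# Pooled two-sample t test, as in the answer
pooled = t.test(x1, x2, var.equal = TRUE)

# With equal sample sizes the two t statistics coincide;
# only the degrees of freedom (and hence the P-values) differ.
c(welch = unname(welch$statistic), pooled = unname(pooled$statistic))
c(welch = unname(welch$parameter), pooled = unname(pooled$parameter))
c(welch = welch$p.value,           pooled = pooled$p.value)
```

Because the Welch DF cannot exceed 48 here, its P-value cannot be smaller than the pooled one for these equal-sized samples.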

If we doubt that the t statistic has anything like Student's t distribution with DF = 48, we might be wary of quoting the P-value above as meaningful.

However, we can use the pooled t statistic as a 'metric'; that is, as a measure of the difference between population means based on the two samples. Then we can repeatedly scramble the 50 observations into two 'samples' of 25 and find the pooled t statistic for each scrambling. That gives us an idea of the actual permutation distribution of the pooled t statistic. [In R, sample(g) randomly permutes the group labels, which does the scrambling.]

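set.seed(1234)
x = c(x1,x2);  g = rep(1:2, each=25)   # combined data and group labels
# pooled t statistic for each of 10^5 random relabelings
t = replicate(10^5, t.test(x~sample(g), var.equal=TRUE)$stat)
mean(abs(t) >= 4.0511)   # two-sided: proportion at least as extreme as observed
[1] 0.00012  # approx. P-value of permutation test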

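The P-value of the permutation test is near $0$, so the two sample means are clearly significantly different. (Here the simulated P-value, 0.00012, is slightly smaller than the 0.000185 from the pooled t test, of which we were suspicious; but it rests on only 12 of the $10^5$ permutations, so the two are about equal within simulation error.)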
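
To get a feel for how much a simulated P-value this small can wobble, here is a rough sketch of its Monte Carlo error. It treats the number of 'extreme' permutations as binomial, which is an assumption about the simulation (independent random relabelings), not something specific to these data. The count of 12 is just 0.00012 times $10^5$; `p.hat`, `se` and `ci` are names chosen here.

```r
B    = 10^5   # number of random permutations used above
hits = 12     # permutations with |t| >= 4.0511, i.e. 0.00012 * 10^5
p.hat = hits / B

# Approximate binomial standard error of the simulated P-value
se = sqrt(p.hat * (1 - p.hat) / B)

# Exact (Clopper-Pearson) 95% interval for the underlying permutation P-value
ci = binom.test(hits, B)$conf.int

p.hat; se; ci
```

A larger number of permutations would narrow this interval if a more precise permutation P-value were needed.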

A histogram of the simulated t values is shown below, along with vertical lines at the t statistic from the pooled t test (and its negative, for a 2-sided test) and the density function of $\mathsf{T}(\nu=48),$ for comparison. [Agreement in the central part of the distributions is good, but we have seen that the tail probabilities are not exactly the same.]

hdr = "Simulated Permutation Dist'n"
hist(t, prob=T, col="skyblue2", main=hdr)
 abline(v = c(-4.05, 4.05), lwd=2, lty="dotted",col="darkgreen") 
 curve(dt(x,48), add=T, lwd=2, col="orange")

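[Figure: histogram of the simulated permutation distribution of t, with dotted vertical lines at ±4.05 and the density of T(48) overlaid]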

Note: @whuber's question in Comments is crucial. You do not say what metric you used for your permutation test.

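Different metrics will generally give different P-values, so I can't say whether your results are correct. For my fictitious data and my metric, the P-value of my permutation test (0.00012) is slightly smaller than the P-value of the pooled t test (0.000185), so, at least in this example, the permutation P-value is not the larger of the two.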
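
To illustrate the point about metrics, here is a minimal sketch that runs the same kind of permutation test on the fictitious data with two other metrics: the difference in sample means and the difference in sample medians. The helper `perm.pv` and the functions `d.mean` and `d.med` are hypothetical names made up for this illustration, and only $10^4$ permutations are used to keep it quick, so the resulting P-values are rough.

```r
# Same fictitious data as above
set.seed(2022)
x1 = rgamma(25, 2, 1/2)
x2 = rgamma(25, 2, 1/5)
x = c(x1, x2);  g = rep(1:2, each = 25)

# Hypothetical helper: two-sided permutation P-value for a given metric
perm.pv = function(metric, B = 10^4) {
  obs  = metric(x, g)                          # metric for the actual grouping
  sims = replicate(B, metric(x, sample(g)))    # metric for random relabelings
  mean(abs(sims) >= abs(obs))                  # proportion at least as extreme
}

d.mean = function(x, g) mean(x[g == 1]) - mean(x[g == 2])      # difference in means
d.med  = function(x, g) median(x[g == 1]) - median(x[g == 2])  # difference in medians

set.seed(1234)
pv = c(mean.diff = perm.pv(d.mean), median.diff = perm.pv(d.med))
pv
```

For a fixed combined sample, the absolute pooled t statistic is an increasing function of the absolute difference in means, so those two metrics should agree up to simulation error; the median metric need not agree with either.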

$\endgroup$
