
I want to run the Wilcoxon rank sum test for different combinations of variables, and using the base R code I get a lot of warnings. The reason is that there are many ties (about 30% of the whole data). What is the right approach to handling ties when using the Wilcoxon rank sum test?

1 Answer

How to handle ties depends on the implementation of the two-sample Wilcoxon test in your software and on the nature of your data. [I will use R.]

Consider the following fictitious data:

set.seed(1234)
x1 = round(rnorm(20, 60, 7))
x2 = round(rnorm(20, 65, 7))

length(unique(x1))
[1] 15  # five ties among the x1
length(unique(x2))
[1] 15  # five more among the x2

I have created a difficulty with ties by rounding the normal data to integers.

wilcox.test(x1,x2)

    Wilcoxon rank sum test with continuity correction

data:  x1 and x2
W = 138.5, p-value = 0.098
alternative hypothesis: 
 true location shift is not equal to 0

Warning message:
In wilcox.test.default(x1, x2) : 
 cannot compute exact p-value with ties

Sensitivity to ties is especially serious for small samples such as these.
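
(As a side note on the R interface, the warning itself can be silenced by asking explicitly for the normal approximation, which is what R falls back to anyway when ties are present; the statistical concern about ties in small samples remains.)

wilcox.test(x1, x2, exact = FALSE)  # normal approximation; no 'ties' warning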

For nearly normal data, there is no difficulty using a t test instead of a Wilcoxon rank sum test. While severely rounded normal data are not exactly normal, a two-sample t test can be used instead of the Wilcoxon test to see whether the two samples come from populations with the same centers. The small P-value indicates that the centers (means) are not equal.

t.test(x1, x2)

        Welch Two Sample t-test

data:  x1 and x2
t = -3.1939, df = 23.885, p-value = 0.003913
alternative hypothesis: 
 true difference in means is not equal to 0
95 percent confidence interval:
 -8.478752 -1.821248
sample estimates:
mean of x mean of y 
    60.20     65.35

The next samples are also rounded, so they also have ties, but the implementation of the two-sample Wilcoxon test in R uses a normal approximation that is not much troubled by ties in larger samples.

set.seed(1235)
y1 = round(rnorm(200, 60, 15))
y2 = round(rnorm(200, 70, 15))
wilcox.test(y1,y2)

        Wilcoxon rank sum test 
        with continuity correction

data:  y1 and y2
W = 13173, p-value = 3.501e-09
alternative hypothesis: 
 true location shift is not equal to 0

Returning to the smaller datasets x1 and x2 above, we see (boxplots below) that the two samples have similar, approximately normal shapes, so we might try a Welch two-sample t test on the ranks of the pooled data, a rough substitute for the Wilcoxon test. It finds a difference in locations at the 10% level, but not at the 5% level.

x = c(x1, x2)
g = rep(1:2, each=20)
boxplot(x~g, horizontal=T, col="skyblue2")

[Figure: horizontal boxplots of the two samples, showing similar shapes.]

t.test(rank(x)~g)

        Welch Two Sample t-test

data:  rank(x) by g
t = -1.7088, df = 37.106, p-value = 0.09585
alternative hypothesis: 
 true difference in means is not equal to 0
95 percent confidence interval:
 -13.441719   1.141719
sample estimates:
 mean in group 1 mean in group 2 
          17.425          23.575 

More generally, we can do a permutation test. Although we do not want to assume that the Welch t statistic actually has a t distribution here, it is still a sensible way to express the difference between the two sample means relative to the sample variances, so we use it as the 'metric' for the permutation test. This is a reasonable choice because the two samples have similar shapes.

We approximate the permutation distribution of $T$ by repeatedly scrambling the 40 observations at random into two groups of 20 and computing $T$ for each such permutation. From the resulting approximate permutation distribution of $T$, we can see that x1 and x2 have different locations, just barely at the 5% level.

set.seed(123)
pv = replicate(2000, t.test(x~sample(g))$p.val)
mean(pv <= 0.05)
[1] 0.0495

Notes: (a) In the R code, sample(g) permutes the group labels, randomly reassigning the 40 observations to two groups of 20.

(b) In the permutation test, I used the P-value from each of the 2000 permutations of the data for consistency because the Welch T statistic can have different degrees of freedom for each permuted sample.
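
For comparison, here is a sketch of the more conventional form of the permutation test, which uses the Welch t statistic itself as the metric and compares the observed |t| with its values under random relabelings (it sets aside the degrees-of-freedom issue just mentioned):

set.seed(123)
t.obs  = t.test(x ~ g)$statistic                          # observed Welch t
t.perm = replicate(2000, t.test(x ~ sample(g))$statistic) # t for random relabelings
mean(abs(t.perm) >= abs(t.obs))                           # approx. two-sided permutation P-value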

(c) An informal way of seeing whether it is worthwhile going beyond the Wilcoxon test with ties is to jitter (to 'un-round') the data, introducing a small amount of randomness to break the ties. Some resulting P-values will be too large and some too small, but you can get a rough idea whether the locations differ. With such unofficial snooping we could have known that the 'guessed' P-value (around 10%) is not far off.

xj1 = x1 + runif(20, -0.1, 0.1)
xj2 = x2 + runif(20, -0.1, 0.1)
wilcox.test(xj1, xj2)$p.val
[1] 0.1206997

xj1 = x1 + runif(20, -0.1, 0.1)
xj2 = x2 + runif(20, -0.1, 0.1)
wilcox.test(xj1, xj2)$p.val
[1] 0.09649955

xj1 = x1 + runif(20, -0.1, 0.1)
xj2 = x2 + runif(20, -0.1, 0.1)
wilcox.test(xj1, xj2)$p.val
[1] 0.09108615
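
To depend less on any single jitter, one can repeat the jittering many times and look at the spread of the resulting P-values; a quick sketch along the same lines:

set.seed(2021)
pj = replicate(1000, wilcox.test(x1 + runif(20, -.1, .1),
                                 x2 + runif(20, -.1, .1))$p.val)
summary(pj)  # rough range of P-values across repeated jitters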

Of course, I could have chosen 'nicer' data from the start, but there is no obvious benefit in giving the false impression that all datasets are 'nice'.
