Wilcoxon Signed rank test for heavily tied paired data

Question

We conducted an experiment to test the effect of a map annotated with the heights of landscape features on participants' estimates of height, relative to the height recorded by a UAV. We asked participants to record the height to the nearest metre before and after they had access to the map. In addition to analysing the difference between UAV and participant estimates to the nearest 1 m, we categorised the data into three height bands: 0-25 m, 25-200 m and >200 m. We anticipated that there might be a difference before and after the map when looking at data to the nearest 1 m, but not when looking at height bands.

The response variable is the difference between the participant's estimated height band and the UAV height band. So the difference could be 0, 1 or 2, i.e. participants could be out by 0 height bands, 1 height band, and so on. As you might imagine, this creates a lot of zeros and ties in the data, and I'm unsure how to handle these in the context of a Wilcoxon paired analysis.

I am aware of the different packages and methods for handling tied observations (Pratt vs Wilcoxon), and I read around as much as I could before posting this question. What I am unsure of is how many zeros or ties make the results unreliable. When I analyse my data, I get a significant difference in height band estimates before and after access to the map. The dataset is large but, as mentioned, there are a lot of ties.

To reproduce my data:

    diff_hband_abs <- c(rep(0, 1515), rep(1, 374), rep(2, 1))
    round_data <- rep(c(1, 2), each = 945)
    vp_data <- c(rep('N', 1296), rep('Y', 594))
    df <- data.frame(DiffHband_abs = diff_hband_abs, Round = round_data, VP = vp_data)

diff_hband_abs = the difference between the participant estimated height band and that of the UAV

round_data = the round of the experiment where participants did not have access to a map (round 1) and the round where they did have access to a map (round 2)

vp_data = whether participants had ever conducted surveys of bird flight height before (vp_data = 'Y' if they had; vp_data = 'N' if not).

When I run the different available functions for the Wilcoxon signed-rank test for paired data (including the version in the coin package with Pratt's method for handling ties), I get highly significant p-values, and I'm not sure I can trust them given the nature of my data. I would be interested in any clarity on this analysis, and suggestions for other analyses would also be very welcome. I have considered ordinal regression, but my data violate the proportional odds assumption.

Many many thanks for your time and consideration.

Harvey Motulsky · Accepted Answer · 2024-03-06 22:58:17Z

I don't know how to answer, but these comments might help:

You need to account for the direction of the difference. Was round 2 higher or lower than round 1? The differences need to include the sign (positive or negative).
If you are (as it seems) recording the absolute value of those differences, the p-value won't be meaningful.
It would help to graph the data somehow, to see what's going on.
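
To illustrate the first two comments, here is a minimal sketch with made-up band codes. The vectors `uav`, `est_round1` and `est_round2` are hypothetical (they are not the question's data), and the pairing by position is an assumption. It shows how the signed paired difference keeps the direction of change that the absolute value discards.

```r
# Hypothetical band codes (0 = 0-25m, 1 = 25-200m, 2 = >200m) for six paired observations
uav        <- c(0, 1, 1, 2, 2, 0)
est_round1 <- c(1, 1, 2, 1, 2, 0)  # estimates without the map
est_round2 <- c(0, 1, 1, 2, 1, 1)  # estimates with the map

# Absolute band error in each round (the quantity described in the question)
err1 <- abs(est_round1 - uav)
err2 <- abs(est_round2 - uav)

# Signed paired difference: negative means the band error shrank after the map
signed_diff <- err2 - err1

# Tabulate with and without the sign; the second table hides the direction
signed_tab <- table(factor(signed_diff, levels = -2:2))
abs_tab    <- table(factor(abs(signed_diff), levels = 0:2))
print(signed_tab)
print(abs_tab)
```

Passing `signed_tab` to `barplot()` would give the kind of graph suggested in the last comment.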


jginestet · Accepted Answer · 2024-03-07 17:48:26Z

I will try to answer, and make a few comments on the Wilcoxon signed rank test (WSRt). You are looking at the before/after data (hence pairs) for the height estimates, by bands. So your data can take 5 possible values: -2, -1, 0, 1, 2. But these are not "numbers"; they are ordinal symbols, so we cannot treat them numerically. Hence, perhaps, your idea of using a WSRt? Your null hypothesis appears to be that there is no change in the band estimate between before and after having seen the map.
I will now make some (hopefully not too) wild assumptions about the data. I would assume that most of the data is in the "0" bin (no change), and that the median of the data is in the "0" bin. (It would help a great deal if you could share a graph, e.g. a bar graph, of the data.)
Now, a WSRt is a test of the pseudo-median, which is, for all practical purposes, impossible to interpret intuitively. If the data is symmetric, it becomes a test of the median (but it is very sensitive to departures from symmetry). In fact, it is probably better used as a test of symmetry: use it to test the sample against the observed median, and the p-value will tell you how compatible the data are with symmetry around the median (which, under symmetry, is also the mean).

If the data were not symmetrical (which may explain why you get significant WSRt results), that would be a strong indication that the map had an effect: people changed their estimates in a particular direction. However, if the data look symmetric "enough", it does not mean that the map caused no change in height estimates, only that the changes were "random", or rather symmetrical, i.e. as many subjects moved up by 1 or 2 as moved down by 1 or 2. That does not prove that the map had no effect, just that the average population effect is null.

With regard to ties (and yes, there would be a lot, because all your data fall in only 5 bins), I would not worry too much. Most implementations of the WSRt deal with them nicely (i.e. employ appropriate corrections for ties). And since you are testing symmetry, a graph will tell you whether to believe the test result.
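
As a rough illustration of how a standard implementation behaves with this kind of data, here is a sketch using base R's `wilcox.test` on made-up signed differences. The vector `signed_diff` and its counts are hypothetical. With `exact = FALSE`, base R drops the zero differences (Wilcoxon's method) and uses a normal approximation with a tie correction; other zero-handling methods, such as the Pratt option mentioned in the question, need a different package.

```r
# Hypothetical signed paired differences in band error (round 2 minus round 1)
signed_diff <- c(rep(0, 700), rep(-1, 150), rep(1, 90), rep(-2, 4), rep(2, 1))

# One-sample signed rank test of the paired differences against 0
res <- wilcox.test(signed_diff, mu = 0, exact = FALSE, correct = TRUE)
print(res$p.value)

# Number of pairs that still contribute once the zeros are dropped
n_nonzero <- sum(signed_diff != 0)
print(n_nonzero)

# The bar-graph check suggested above, as a table: is it roughly symmetric around 0?
print(table(signed_diff))
```

The point of printing `n_nonzero` is that the effective sample size is the number of non-zero pairs, not the full number of pairs.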

The above approach may help, but I feel it is too "coarse". What I mean by that is, for example, that you "lump" together going from 2 to 1, and going from 1 to 0 (and same for going up): you are losing information about the changes.

So let me propose a different approach. You are trying to compare the band assignment before and after having seen the map. I would treat this as a 3x3 contingency matrix: columns are band assignments before, rows are band assignments after. For each subject, you add 1 at the intersection. Then you use a test such as Fisher's exact test or a chi-square test on the 3x3 matrix (only use chi-square if all the cells in the matrix have "large" (>10?) values). Now you can tell whether the changes affected only some cells, or only went in some direction, etc. It is similar to what @Sextus Empiricus proposed, but there he was comparing UAV to the estimates, by height band, which does not appear to be what you wanted to do in the question (comparing before and after having seen the map).
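
The table construction described above could be sketched as follows. The `before` and `after` vectors are simulated and purely hypothetical, and the simulated (Monte Carlo) p-value for Fisher's test is a choice made here only so that the example runs quickly on any table size.

```r
set.seed(1)  # only so the simulated data and p-value are reproducible
bands <- c("0-25m", "25-200m", ">200m")

# Hypothetical band assignments for 200 subjects, before and after seeing the map
before <- factor(sample(bands, 200, replace = TRUE, prob = c(0.3, 0.4, 0.3)), levels = bands)
after  <- factor(sample(bands, 200, replace = TRUE, prob = c(0.2, 0.5, 0.3)), levels = bands)

# Rows are the "after" band, columns are the "before" band; each subject adds 1 to one cell
tab <- table(after = after, before = before)
print(tab)

# Chi-square test (reasonable when the cell counts are large)
chi_res <- chisq.test(tab)

# Fisher's exact test with a Monte Carlo p-value
fisher_res <- fisher.test(tab, simulate.p.value = TRUE, B = 2000)
print(c(chi_square = chi_res$p.value, fisher = fisher_res$p.value))
```

Inspecting `tab` directly (or `chi_res$residuals`) shows which cells drive any departure.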

Sextus Empiricus · Accepted Answer · 2024-03-07 10:13:37Z

Disclaimer: I am just making up the approach below. I have no idea whether there might be some rigorous description. But for your case this seems to be an approach.

The data that you present is not the raw data. A better analysis would be based on a contingency table (but this requires your raw data and not just the differences; also, your data doesn't indicate the sign, so below I just made up some numbers).

$$\begin{array}{c|c cccccccc} &\rlap{\qquad\text{ESTIMATES}}\\ &\text{0-25m} &\text{25-200m} &\text{>200m}\\ \hline \text{TRUE UAV} \\ \text{0-25m}& \color{gray}{147} & \color{blue}{202} & \color{blue}{53} & 402\\ \text{25-200m}& \color{red}{111} & \color{gray}{255} & \color{blue}{176} & 542\\ \text{>200m}& \color{red}{17} & \color{red}{210} & \color{gray}{319} & 546 \\ &275 & 667& 548 & \\ \end{array}$$

You can also view this as a categorical distribution

$$\begin{array}{cc} \text{category}&\text{observed cases}\\ \hline \text{true} - \text{estimate} = 2 & 17\\ \text{true} - \text{estimate} = 1 & 321\\ \text{true} - \text{estimate} = 0 & 721\\ \text{true} - \text{estimate} = -1 & 378\\ \text{true} - \text{estimate} = -2 & 53\\ \end{array}$$
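
For anyone who wants to reproduce the collapse from the 3x3 table to the categorical distribution, a minimal sketch follows. The counts are the made-up numbers from the table above, and the band differences run from -2 to 2 in units of whole bands.

```r
bands <- c("0-25m", "25-200m", ">200m")

# Made-up counts from the table above: rows = true UAV band, columns = estimated band
tab <- matrix(c(147, 202,  53,
                111, 255, 176,
                 17, 210, 319),
              nrow = 3, byrow = TRUE,
              dimnames = list(true = bands, estimate = bands))

# Band index of the true value minus band index of the estimate, for each cell
band_diff <- row(tab) - col(tab)

# Sum the cell counts that share the same difference (one diagonal each)
counts <- tapply(as.vector(tab), as.vector(band_diff), sum)
print(counts)
```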

The Wilcoxon signed rank statistic is the sum of deviations from zero expressed in terms of ranks or quantiles. It tests the hypothesis that the sum of ranks of negative deviations equals the sum of ranks of positive deviations. (Typically this is used to test against the alternative that the distribution is shifted away from zero.)

An equivalent for this case could be to express such a hypothesis in terms of the parameters of the categorical distribution, $p_{-2},p_{-1},p_{0},p_{1},p_{2}$. The quantiles of the absolute differences will be $r_0=p_0$, $r_1 = p_0+p_{-1}+p_{1}$ and $r_2 = p_0+p_{-1}+p_{1}+p_{-2}+p_{2}$, and we would hypothesize that

$$r_1 p_{-1} + r_2 p_{-2} = r_1 p_{1} + r_2 p_{2}$$

or differently expressed

$$(p_{-2} - p_{2}) + (p_{-1} - p_{1}) - (p_{-2} + p_{2}) \cdot (p_{-1} - p_{1}) = 0$$
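
The step between the two forms only uses the fact that the five probabilities sum to one, so that $r_2 = 1$ and $r_1 = 1 - (p_{-2} + p_{2})$. Moving everything to one side:

$$\begin{aligned} 0 &= r_1 (p_{-1} - p_{1}) + r_2 (p_{-2} - p_{2})\\ &= \bigl(1 - (p_{-2} + p_{2})\bigr)(p_{-1} - p_{1}) + (p_{-2} - p_{2})\\ &= (p_{-2} - p_{2}) + (p_{-1} - p_{1}) - (p_{-2} + p_{2}) \cdot (p_{-1} - p_{1}) \end{aligned}$$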

One could fit this model by maximum likelihood and then use a chi-squared goodness-of-fit test to obtain a p-value.
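
A rough sketch of what such a fit might look like, using the made-up counts above. The parametrisation is an assumption made for this example: three free parameters ($p_{-2}$, $p_{2}$ and $p_{-1}+p_{1}$), with $p_{-1}-p_{1}$ forced by the constraint, which leaves one degree of freedom for the Pearson chi-squared statistic. The helper names `constrained_probs` and `neg_loglik` are hypothetical.

```r
# Made-up counts from the answer, indexed by (true - estimate)
counts <- c("-2" = 53, "-1" = 378, "0" = 721, "1" = 321, "2" = 17)
n <- sum(counts)

# Map the 3 free parameters to the 5 cell probabilities under the constraint
constrained_probs <- function(theta) {
  a <- theta[1]                 # p_{-2}
  b <- theta[2]                 # p_{2}
  s <- theta[3]                 # p_{-1} + p_{1}
  d <- -(a - b) / (1 - a - b)   # p_{-1} - p_{1}, forced by the hypothesis
  c("-2" = a, "-1" = (s + d) / 2, "0" = 1 - a - b - s, "1" = (s - d) / 2, "2" = b)
}

# Multinomial negative log-likelihood, with a large penalty outside the simplex
neg_loglik <- function(theta) {
  p <- constrained_probs(theta)
  if (any(!is.finite(p)) || any(p <= 0)) return(1e10)
  -sum(counts * log(p))
}

# Start from a symmetric point, which satisfies the constraint with d = 0
tail_start <- (counts[["-2"]] + counts[["2"]]) / (2 * n)
start <- c(tail_start, tail_start, (counts[["-1"]] + counts[["1"]]) / n)

fit <- optim(start, neg_loglik, control = list(reltol = 1e-12, maxit = 5000))
p_hat <- constrained_probs(fit$par)

# Pearson goodness of fit: 4 free cell probabilities minus 3 fitted parameters = 1 df
expected <- n * p_hat
x2 <- sum((counts - expected)^2 / expected)
p_value <- pchisq(x2, df = 1, lower.tail = FALSE)
print(round(p_hat, 4))
print(c(x2 = x2, p_value = p_value))
```

A small `p_value` would indicate that the constrained model (equal rank sums for negative and positive deviations) fits the observed counts poorly.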
