$\begingroup$

I'm trying to run a number of independent-samples t-tests. I'm using the yuen function in the WRS2 package because my data are non-normal and ordinal. My sample size is 391, but the output of the yuen function gives df values between 2 and 16.35, depending on the variable. I don't really understand what df means in this context, but it seems weirdly low. When I run Welch two-sample t-tests on the same data, I get very different results for some variables (p = 0.07 for Welch compared to p = 0.41 for Yuen).

Is there something wrong with what I'm doing? Is having such low degrees of freedom some kind of error? I've read all the information I can find about the package and can't find anything relevant.

I know I'm supposed to give sample code, but I'm pretty new to R and I haven't been able to figure out how. I've provided my output if that helps.

Call:
yuen(formula = Choose_Approach ~ Cause_B_BrainChemicals, data = data2)

Test statistic: 0.8556 (df = 12), p-value = 0.40901

Trimmed mean difference:  0.27143 
95 percent confidence interval:
-0.4198     0.9627 

Explanatory measure of effect size: 0.25
$\endgroup$

1 Answer

$\begingroup$

Without knowing what your data look like, such as how many levels Choose_Approach and Cause_B_BrainChemicals have, it is difficult to say. The two tests estimate the standard error differently; see page 35 of this document for Yuen's formula. Yuen's degrees of freedom are computed from the number of observations left in each group after trimming rather than from the full group sizes, so they will generally be smaller than those of the Welch test.
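
For reference, here is a sketch of the degrees-of-freedom formula as it is usually stated for Yuen's test. The notation is my own and may differ from the linked document: gamma is the trimming proportion, n_j the size of group j, g_j the number of observations trimmed from each tail, h_j the number remaining, and s_{wj}^2 the winsorized variance.

```latex
g_j = \lfloor \gamma n_j \rfloor, \qquad h_j = n_j - 2 g_j, \qquad
d_j = \frac{(n_j - 1)\, s_{wj}^2}{h_j (h_j - 1)}

\nu_{\text{Yuen}} = \frac{(d_1 + d_2)^2}{\dfrac{d_1^2}{h_1 - 1} + \dfrac{d_2^2}{h_2 - 1}}
\;\le\; h_1 + h_2 - 2
```

Welch's degrees of freedom have the same form with s_j^2 / n_j in place of d_j and n_j - 1 in place of h_j - 1. Because the bound depends on the sizes of the two groups being compared and not on the total sample size, very small groups produce very small df even when the overall sample is large.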

However, I don't think either test is really appropriate for your data. The main difference between Welch and Yuen is that the latter compares trimmed means and uses winsorized variances; by default it trims the top and bottom 20% of values in each group (40% combined!). I don't really agree with this practice in general, but the more critical point is that it does nothing to make your ordinal outcomes "more normal"; if anything, it will do the opposite. Trimming and winsorization only "protect" against extreme values.

I suggest you either look into parametric techniques such as ordinal regression, or use non-parametric rank-based alternatives.
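
As a rough illustration of both suggestions, here is a sketch on simulated data. The data frame `d` and its columns `group` and `likert` are made up for this example and are not taken from the question. It runs a Wilcoxon rank-sum test from base R and, if the MASS package happens to be installed, a proportional-odds ordinal regression with `polr`.

```r
set.seed(1)

# Simulated stand-in data: two groups, one 5-point Likert item (all hypothetical)
d <- data.frame(
  group = factor(rep(c("A", "B"), each = 100)),
  likert = c(
    sample(1:5, 100, replace = TRUE, prob = c(0.10, 0.20, 0.30, 0.25, 0.15)),
    sample(1:5, 100, replace = TRUE, prob = c(0.20, 0.30, 0.25, 0.15, 0.10))
  )
)

# Rank-based alternative: Wilcoxon rank-sum (Mann-Whitney) test.
# exact = FALSE because Likert data are full of ties.
wt <- wilcox.test(likert ~ group, data = d, exact = FALSE)
print(wt)

# Ordinal regression: proportional-odds model; the outcome must be an ordered factor
if (requireNamespace("MASS", quietly = TRUE)) {
  fit <- MASS::polr(factor(likert, levels = 1:5, ordered = TRUE) ~ group,
                    data = d, Hess = TRUE)
  print(summary(fit))
}
```

Neither approach trims or winsorizes the outcome; both work directly with the ordering of the response categories.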

$\endgroup$
  • $\begingroup$ Thank you so much for answering, Pieter! Choose_Approach is nominal with three levels (Approach 1, Approach 2, and no preference) and Cause_B_BrainChemicals is a single item on a 5-point Likert scale. Thank you for the information about winsorizing being potentially a problem. I'll look into alternatives. I would really appreciate any help in figuring out if I've made an error somewhere with the Yuen test! I would prefer to be able to use that if possible, just because it's consistent with what I've been doing in the rest of this paper. $\endgroup$
    – Josie K
    Commented Oct 13, 2023 at 10:29
  • $\begingroup$ From the way you describe it, shouldn't Choose_Approach be your grouping variable and Cause_B_BrainChemicals the ordinal outcome, i.e. shouldn't they be swapped in the formula? Either way, with 15 distinct combinations of levels (3 × 5) in the data you would only expect about 26 observations per combination on average, so I don't think there's anything necessarily wrong with the degrees of freedom calculation. I'm not sure exactly how this is handled in the code, though, since this should only be a 2-sample procedure (and not 3 or 5). $\endgroup$
    – PBulls
    Commented Oct 13, 2023 at 10:51
  • $\begingroup$ You're a life-saver! You're absolutely right, I was supposed to swap the order around. I now have 137.9 degrees of freedom, which seems much better. And yes, oops - Choose_Approach only has two levels, as I removed people who had no preference. So hopefully the test should at least be mostly working now. Thank you so much for your time and help! $\endgroup$
    – Josie K
    Commented Oct 13, 2023 at 11:19
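
To make the fix from the comments concrete, here is a minimal sketch on simulated data. The data frame `d` only mimics the column names from the question, and `kept` and `df_upper` are helper names invented here. It relies on the fact that Yuen's degrees of freedom cannot exceed h1 + h2 - 2, where h is the number of observations a group keeps after trimming, to show why splitting by Likert level can give far fewer degrees of freedom than splitting by the two-level choice. Which two Likert levels the swapped call actually compared is not confirmed in the thread; levels 1 and 2 are used purely as an assumption.

```r
set.seed(42)
n <- 390

# Hypothetical stand-in for data2: a two-level choice and a 5-point Likert item
d <- data.frame(
  Choose_Approach = factor(sample(c("Approach 1", "Approach 2"), n, replace = TRUE)),
  Cause_B_BrainChemicals = sample(1:5, n, replace = TRUE,
                                  prob = c(0.05, 0.10, 0.25, 0.35, 0.25))
)

# Observations a group keeps after trimming a proportion tr from each tail
kept <- function(n, tr = 0.2) n - 2 * floor(tr * n)

# Upper bound h1 + h2 - 2 on Yuen's degrees of freedom for two group sizes
df_upper <- function(n1, n2, tr = 0.2) kept(n1, tr) + kept(n2, tr) - 2

by_choice <- table(d$Choose_Approach)
by_likert <- table(factor(d$Cause_B_BrainChemicals, levels = 1:5))

# Intended orientation: the outcome is split by the two-level choice
upper_intended <- df_upper(by_choice[[1]], by_choice[[2]])

# Swapped orientation (assumption: Likert levels 1 and 2 form the two groups)
upper_swapped <- df_upper(by_likert[[1]], by_likert[[2]])

print(c(intended = upper_intended, swapped = upper_swapped))

# The corrected call: outcome on the left, grouping variable on the right
if (requireNamespace("WRS2", quietly = TRUE)) {
  print(WRS2::yuen(Cause_B_BrainChemicals ~ Choose_Approach, data = d))
}
```

The formula convention is `outcome ~ group`, so the Likert item goes on the left and the two-level choice on the right.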
