3
$\begingroup$

Initially, I wanted to run a two-factor (4x3) repeated measures ANOVA on my data. More precisely, factor1 has 4 levels (A, B, C, D) and factor2 has 3 levels (t1, t2, t3). It is a repeated measures design because all 28 subjects (s) are exposed to all 4 levels (A, B, C, D) at each of the 3 time points (t1, t2, t3). However, according to the Shapiro-Wilk test, my data are not normally distributed. I therefore want to use the aligned rank transform (ART) ANOVA as a non-parametric alternative to a two-factor rmANOVA. With this method, the data are aligned (and then ranked) before the ANOVA is run.

The question is: after alignment, do the aligned data need to be normally distributed for the ANOVA result to be valid? I am not sure about this, because ART is described as a non-parametric method but still includes an ANOVA, which assumes normally distributed data.

I am using the R package "ARTool" and have come up with the following code:

library(ARTool)
aligned_data <- art(dependentVariable ~ factor1 * factor2 + Error(s), data=data)

Does the following Shapiro-Wilk test need to indicate that the residuals are normally distributed before I can continue and trust the results?

shapiro.test(residuals(aligned_data))

anova(aligned_data)

Thanks in advance and any help is appreciated.

$\endgroup$
1
  • 1
    $\begingroup$ No. ANOVA on ranks of nonnormal data has been used as a sort of nonparametric test. Of course the ranks are not truly normally distributed, but they are smallish integers distributed roughly symmetrically and usually with no far outliers. $\endgroup$
    – BruceET
    Commented Aug 11, 2021 at 10:21

2 Answers

1
$\begingroup$

This does not answer your question, but what about using a two-way repeated measures ANOVA on trimmed means instead? It does not require normality or homogeneity of variances. In R you can do this with the function wwtrim from the WRS package.

Besides that, with a large enough sample size the Shapiro-Wilk test will often be significant even when the departure from normality is too small to matter in practice.
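
The sample-size point could be illustrated with a small simulation. This is only a sketch: the t distribution with 5 degrees of freedom is an assumed stand-in for symmetric, mildly heavy-tailed data, and reject_rate is a hypothetical helper, not a function from any package.

```r
set.seed(1)

# Hypothetical helper: share of Shapiro-Wilk rejections at the 5% level
# across 'reps' samples of size n drawn from a t distribution with 5 df.
reject_rate <- function(n, reps = 200) {
  mean(replicate(reps, shapiro.test(rt(n, df = 5))$p.value < 0.05))
}

rate_small <- reject_rate(20)     # small samples
rate_large <- reject_rate(2000)   # large samples (shapiro.test allows n <= 5000)

c(n20 = rate_small, n2000 = rate_large)
```

The same distribution is sampled in both cases, so any difference between the two rejection rates comes from the sample size alone.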

$\endgroup$
0
$\begingroup$

Suppose you have a one-factor ANOVA with five levels of the factor and 40 exponentially distributed replications per level. Because the exponential data are right-skewed, it is difficult to imagine that the F statistic would have the anticipated F distribution.

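set.seed(811)                      # for reproducibility
# 40 exponential observations per level; the rate (1/mean) differs by level
x1 = rexp(40, .10)
x2 = rexp(40, .12);  x3 = rexp(40, .12)
x4 = rexp(40, .15);  x5 = rexp(40, .15)
x = c(x1,x2,x3,x4,x5)              # pooled response
g = as.factor(rep(1:5, each=40))   # factor level labels, 1 to 5
boxplot(x ~ g, horizontal=TRUE, col="skyblue2")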

[Figure: horizontal boxplots of x for the five levels of g]

The oneway.test procedure in R does not require equal variances across the five levels of the factor. This test may tolerate the exponential data better than a standard ANOVA would. It rejects the null hypothesis that all level means are equal, with a P-value of about 3%. On account of the exponential data, I doubt whether this P-value is accurate. Multiple comparisons might be done with Welch two-sample t tests, with some protection (such as Bonferroni) against false discovery from multiple tests on the same data.

oneway.test(x ~ g)

        One-way analysis of means 
        (not assuming equal variances)

data:  x and g
F = 2.8842, num df = 4.000, denom df = 95.898, p-value = 0.0265
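
The multiple comparisons mentioned above might be sketched as follows. This is one possible approach, not necessarily the one intended: pairwise.t.test from base R with pool.sd = FALSE runs a separate two-sample t test for each pair of levels (Welch tests, since t.test does not assume equal variances by default), and the Bonferroni adjustment is applied to the resulting P-values. The data are regenerated with the same seed and rates as above so the block runs on its own.

```r
set.seed(811)
x1 = rexp(40, .10)
x2 = rexp(40, .12);  x3 = rexp(40, .12)
x4 = rexp(40, .15);  x5 = rexp(40, .15)
x = c(x1, x2, x3, x4, x5)
g = as.factor(rep(1:5, each = 40))

# Pairwise Welch t tests (no pooled SD), Bonferroni-adjusted P-values
pw = pairwise.t.test(x, g, pool.sd = FALSE, p.adjust.method = "bonferroni")
pw$p.value   # lower-triangular matrix: rows are levels 2-5, columns levels 1-4
```

The same caveat applies as for the overall test: with exponential data, the individual P-values may not be accurate.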

The rank-transformed data have more nearly equal variances and are more nearly normal. [Data at each level pass the Shapiro-Wilk normality test; not shown.] Thus a standard ANOVA might be used.

Because the usual assumptions for an ANOVA are more nearly met here, perhaps the P-value is roughly accurate. Here again, oneway.test rejects the null hypothesis. Because of the rank transformation, effect sizes in post hoc tests must be interpreted with care.

boxplot(rank(x) ~ g, horizontal=T, col="skyblue2")

[Figure: horizontal boxplots of rank(x) for the five levels of g]

oneway.test(rank(x)~g)

    One-way analysis of means 
    (not assuming equal variances)

data:  rank(x) and g
F = 2.8508, num df = 4.000, denom df = 97.323, p-value = 0.0278

Standard one-way ANOVA:

anova(lm(rank(x)~g))
Analysis of Variance Table

Response: rank(x)
            Df Sum Sq Mean Sq F value  Pr(>F)  
 g           4  36682  9170.5  2.8386 0.02554 *
 Residuals 195 629968  3230.6                  
 ---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
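
The doubts about P-value accuracy could be probed by simulation. The following is a rough sketch under stated assumptions: all five levels share the same exponential rate (0.1, chosen arbitrarily), so the null hypothesis is true, and the share of P-values below 0.05 estimates the actual type I error rate of a standard ANOVA on the raw data and on the ranks. The helper one_run is hypothetical.

```r
set.seed(2021)
g = as.factor(rep(1:5, each = 40))

# Hypothetical helper: one simulated data set under H0, returning the
# standard ANOVA P-value for the raw data and for the ranked data.
one_run = function() {
  x = rexp(200, rate = 0.1)   # same distribution at every level
  c(raw  = anova(lm(x ~ g))[1, "Pr(>F)"],
    rank = anova(lm(rank(x) ~ g))[1, "Pr(>F)"])
}

pvals = replicate(1000, one_run())   # 2 x 1000 matrix of P-values
rates = rowMeans(pvals < 0.05)       # estimated type I error rates
rates
```

Rates close to 0.05 would suggest that the nominal P-values are roughly trustworthy for data of this shape and sample size; this checks only the null case and says nothing about power.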
$\endgroup$
1
  • $\begingroup$ These methods are somewhat ad hoc. Consider using full models that make use only of ranks of Y, i.e., semiparametric ordinal models. Resources are here. $\endgroup$ Commented Aug 11, 2021 at 11:45
