Problem formulation
I have 5 strains to which I administered a treatment, and I repeated the experiment twice independently. My data look something like this:
strain <- factor(1:5)
ctrl <- c(1, 1, 1, 1, 1)
trt1 <- c(60.82812657, 6.175973947, 7.395597282, 203.6573398, 24.42014734)
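trt2 <- c(758.3219468, 26.84658479, 43.21119828, 103.2501452, 9.557721081)
# combine the vectors into one table and print it without row names
print(data.frame(strain, ctrl, trt1, trt2), row.names = FALSE)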
with the output
strain ctrl       trt1       trt2
     1    1  60.828127 758.321947
     2    1   6.175974  26.846585
     3    1   7.395597  43.211198
     4    1 203.657340 103.250145
     5    1  24.420147   9.557721
The values are normalised to the control (which is why the controls are always 1). I want to test statistically the question: "Does the treatment increase the measurement values?"
Problem: What is the most appropriate test to answer this question, given this specific structure of the experiment?
What I have tried so far
Ignoring the strain structure
The values don't seem to be normally distributed, so I thought a non-parametric test would be more appropriate. If I perform a Wilcoxon rank-sum test individually for each treatment group
wilcox.test(ctrl, trt1)
wilcox.test(ctrl, trt2)
I get a p-value of 0.007495 each (together with a warning message: “cannot compute exact p-value with ties”).
Pooling the two treatment groups lowers the p-value, as expected, i.e.
wilcox.test(ctrl, c(trt1, trt2))
yields a p-value of 0.002245 (together with a warning message: “cannot compute exact p-value with ties”).
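The warning appears because all five control values are tied at 1, so R cannot compute an exact p-value and falls back to a normal approximation with continuity correction. A minimal sketch, reusing the vectors from the question, that requests this approximation explicitly via `exact = FALSE`; the p-values should match the ones reported above and the warning should no longer be raised. The names `res_single` and `res_pooled` are illustrative only.

```r
ctrl <- c(1, 1, 1, 1, 1)
trt1 <- c(60.82812657, 6.175973947, 7.395597282, 203.6573398, 24.42014734)
trt2 <- c(758.3219468, 26.84658479, 43.21119828, 103.2501452, 9.557721081)

# exact = FALSE asks for the normal approximation directly,
# which is what R uses anyway once ties are present
res_single <- wilcox.test(ctrl, trt1, exact = FALSE)
res_pooled <- wilcox.test(ctrl, c(trt1, trt2), exact = FALSE)

res_single$p.value
res_pooled$p.value
```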
Including the strain structure
What surprised me is that the paired version of this test is no longer statistically significant, e.g. the following code
wilcox.test(ctrl, trt1, paired = TRUE)
wilcox.test(ctrl, trt2, paired = TRUE)
yields a p-value of 0.0625 each (without a warning message). Now I would like to pool the treatment groups (as in the case above) while still retaining the strain structure of the experiment. This is where I am stuck.
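With only five pairs, 0.0625 (that is, 2/2^5) is the smallest two-sided p-value the exact signed-rank test can return, so the paired test cannot drop below 0.05 here however large the effect. A minimal sketch of two possible directions, offered as assumptions rather than a recommendation: a one-sided alternative that matches the question "does the treatment increase the values?", and pooling by averaging the two treatment columns per strain before pairing. The sketch assumes `trt1` and `trt2` are the two independent repeats of the same treatment; the name `trt_mean` and the choice of a log-scale (geometric) mean are illustrative.

```r
ctrl <- c(1, 1, 1, 1, 1)
trt1 <- c(60.82812657, 6.175973947, 7.395597282, 203.6573398, 24.42014734)
trt2 <- c(758.3219468, 26.84658479, 43.21119828, 103.2501452, 9.557721081)

# One-sided paired test: the alternative is that ctrl is smaller than trt1
one_sided <- wilcox.test(ctrl, trt1, paired = TRUE, alternative = "less")
one_sided$p.value

# Pool the two repeats per strain (geometric mean), keeping one value per strain
trt_mean <- exp((log(trt1) + log(trt2)) / 2)
pooled_paired <- wilcox.test(ctrl, trt_mean, paired = TRUE, alternative = "less")
pooled_paired$p.value
```

Because there are still only five strains after averaging, the pooled paired test cannot give a one-sided p-value below 1/2^5 either.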
Does anyone know if I am on the right track, or how to include the categorical variable strain as some sort of covariate in a non-parametric statistical model?
lm(value ~ treatment + strain, data = data_df)
and then looked at the p-value for the treatment coefficient.
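The `data_df` object used in the `lm` call is not shown. A minimal sketch of how such a long-format table could be built from the vectors in the question, assuming a two-level `treatment` factor (control versus treated) in which both treatment columns count as treated; that layout, and the names `fit` and `p_treatment`, are assumptions and not taken from the original.

```r
strain <- factor(1:5)
ctrl <- c(1, 1, 1, 1, 1)
trt1 <- c(60.82812657, 6.175973947, 7.395597282, 203.6573398, 24.42014734)
trt2 <- c(758.3219468, 26.84658479, 43.21119828, 103.2501452, 9.557721081)

# Long format: one row per measurement (5 strains x 3 columns = 15 rows)
data_df <- data.frame(
  strain    = rep(strain, times = 3),
  treatment = factor(rep(c("ctrl", "trt"), times = c(5, 10)),
                     levels = c("ctrl", "trt")),
  value     = c(ctrl, trt1, trt2)
)

# Linear model with strain as an additive categorical covariate
fit <- lm(value ~ treatment + strain, data = data_df)

# p-value of the treatment coefficient (named "treatmenttrt" by R)
p_treatment <- coef(summary(fit))["treatmenttrt", "Pr(>|t|)"]
p_treatment
```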