$\begingroup$

I want to compare the means of eight groups. Some of the differences between the means are only visible when I draw box plots of the log-transformed data. Should I then also run the one-way ANOVA on the log-transformed data? My concern is that one-way ANOVA on the log-transformed data gives different significance results between groups than the same analysis on the raw data.

The data are bacterial counts from 12 animals in each group. The count data are normally distributed.

[Figure: box plots of the log-transformed bacterial counts by group]

$\endgroup$
  • 2
    $\begingroup$ Analysing the raw data versus the logged data will often give you very different results. That is why we do it, so there is no problem there. $\endgroup$
    – mdewey
    Commented Apr 4 at 12:46
  • 1
    $\begingroup$ Why do you say that the counts are normally distributed? The box plots don't seem consistent with that. Indeed, if the data were normally distributed, a logarithmic scale would not be needed. $\endgroup$
    – Nick Cox
    Commented Apr 4 at 13:04
  • 1
    $\begingroup$ There is no inherent problem in plotting on a log scale while doing the analysis on untransformed data. But I wonder why the y-axis label here is the log of a ratio. Also, 1) I would label the y axis in raw data terms, 2) I would transform the raw data if it made sense substantively, and 3) I would check the model assumptions carefully and choose something like robust regression if the assumptions are violated. $\endgroup$
    – Peter Flom
    Commented Apr 4 at 13:20
  • $\begingroup$ What is the 3-fold division within each group? Your dataset is not too large to post here. $\endgroup$
    – Nick Cox
    Commented Apr 4 at 16:25
  • 1
    $\begingroup$ This is a strange and seemingly arbitrary transformation, because it's not the log. See stats.stackexchange.com/a/576509/919 and stats.stackexchange.com/questions/30728 for commentary on this issue. Assuming "CFU" means colony-forming units, there's often an issue with right censoring, too (there's an upper limit to what can be counted). // What are all those dots and little whiskers doing sprinkled around? Although these look sort of like boxplots, they are not. $\endgroup$
    – whuber
    Commented Apr 4 at 18:25

1 Answer

$\begingroup$

Count data can't be "normally distributed," because counts (unlike a normally distributed variable) can't take values less than 0. Many of your counts are large, which might not pose a problem in practice, but it looks like many conditions have counts of exactly 0, to which you added 1 to avoid problems in taking the logarithm (and which thus have y-axis values of 0). Your data seem unlikely to meet the usual "requirements" for ANOVA: normality within each treatment group and the same variance across groups. There can be some flexibility with those "requirements," but you have 0 variance for many groups and very large variances in others.
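
To make the variance point concrete, here is a minimal sketch of a per-group check. The counts are hypothetical (they are not the data from the question), and the helper name `summarize` is made up for illustration. It reports each group's mean, sample variance, fraction of zeros, and variance-to-mean ratio.

```python
from statistics import mean, variance

# Hypothetical counts (NOT the data from the question): 3 groups, 6 animals each.
counts = {
    "control": [0, 0, 0, 0, 0, 0],
    "low": [0, 0, 3, 0, 12, 1],
    "high": [150, 900, 40, 2300, 0, 610],
}


def summarize(values):
    """Return mean, sample variance, fraction of zeros, and variance/mean ratio."""
    m = mean(values)
    v = variance(values)  # sample variance (n - 1 denominator)
    zero_frac = sum(1 for x in values if x == 0) / len(values)
    # The ratio is undefined for an all-zero group, so report NaN there.
    ratio = v / m if m > 0 else float("nan")
    return {"mean": m, "var": v, "zero_frac": zero_frac, "var_to_mean": ratio}


summary = {group: summarize(values) for group, values in counts.items()}
for group, s in summary.items():
    print(
        f"{group:8s} mean={s['mean']:.1f} var={s['var']:.1f} "
        f"zeros={s['zero_frac']:.2f} var/mean={s['var_to_mean']:.1f}"
    )
```

A group with only zeros has variance 0, while a group with large counts can have a variance many times its mean. Both patterns conflict with the equal-variance assumption of ANOVA, and a variance-to-mean ratio far above 1 also conflicts with a plain Poisson model.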

If you have count data, use a model type designed for count data. Such models typically have an underlying logarithmic "link" function (consistent with your plot, which I find to be OK) but work in ways that handle 0 values and account for changes in variance as a function of count values. Those include Poisson and negative binomial models. From your box plots it seems that you might want to consider a "zero-inflated" or a "hurdle" model; they model both having extra 0 counts and the actual number of counts. This page has links to some worked-through examples.
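
As a rough illustration of what a count model does in a one-way layout, the sketch below uses the fact that the Poisson maximum-likelihood fit has a closed form here: each group's fitted mean is its sample mean. The counts are hypothetical (not the data from the question) and the function name `poisson_one_way` is made up for this example. In practice one would fit these models with a statistics package that also supports negative binomial, zero-inflated, and hurdle models.

```python
import math

# Hypothetical counts (NOT the data from the question): 3 groups, 6 animals each.
counts = {
    "A": [2, 5, 3, 4, 6, 4],
    "B": [10, 14, 9, 30, 12, 15],
    "C": [0, 1, 0, 2, 0, 3],
}


def poisson_one_way(groups):
    """Closed-form Poisson fit for a one-way layout.

    Returns the fitted group means, the likelihood-ratio statistic against a
    single common mean, and the Pearson dispersion estimate.
    Assumes at least one nonzero count overall.
    """
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_values) / len(all_values)
    fitted = {g: sum(ys) / len(ys) for g, ys in groups.items()}

    # Likelihood-ratio statistic: 2 * sum_g S_g * log(mean_g / grand_mean),
    # where S_g is the group total; groups with S_g == 0 contribute 0.
    lr = 2.0 * sum(
        sum(ys) * math.log(fitted[g] / grand_mean)
        for g, ys in groups.items()
        if sum(ys) > 0
    )

    # Pearson chi-square divided by residual degrees of freedom.
    # All-zero groups are skipped because their fitted mean is 0.
    pearson = sum(
        (y - fitted[g]) ** 2 / fitted[g]
        for g, ys in groups.items()
        if fitted[g] > 0
        for y in ys
    )
    resid_df = len(all_values) - len(groups)
    return fitted, lr, pearson / resid_df


fitted, lr, dispersion = poisson_one_way(counts)
print("fitted means:", fitted)
print("log of fitted means:", {g: math.log(m) for g, m in fitted.items() if m > 0})
print(f"likelihood-ratio statistic: {lr:.2f}")
print(f"Pearson dispersion: {dispersion:.2f}")
```

Under the Poisson assumption the likelihood-ratio statistic is compared with a chi-squared distribution whose degrees of freedom equal the number of groups minus 1. A dispersion estimate well above 1 suggests more variability than a Poisson model allows, which is one reason to consider a negative binomial model. Note also that a group with only zeros has a fitted mean of 0, whose logarithm is undefined; this is part of why groups of all zeros need special handling.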

$\endgroup$
  • 1
    $\begingroup$ Agree. A log (y + 1) scale can be a device -- a necessary evil -- for visualization as otherwise the zeros can't be plotted with logarithms, but modelling can, and should, use devices that don't require transformation. (With your data a square root scale is also possible.) $\endgroup$
    – Nick Cox
    Commented Apr 4 at 17:52
  • $\begingroup$ Thank you so much for the useful answers, everyone! I now see that an ANOVA is not appropriate for my data, so I will look into Poisson and negative binomial models instead. $\endgroup$ Commented Apr 4 at 18:09
