$\begingroup$

I conducted an exploratory analysis of my data, computed the mean and standard deviation from the original data, and plotted the data using boxplots.

I then fitted regression models (GLM and GLMM) to the data, which allow marginal means to be estimated. Which should I report: the "raw" mean and SD, or the values estimated by the model? Or should I report both, for example the "raw" values in a table and the estimated values in a boxplot, or vice versa?

Thank you!

$\endgroup$
  • $\begingroup$ As is, the question is too broad to be answered correctly. Different fields (and different journals within those fields) have different formats. But, more importantly, what you have to report depends on what you found interesting. $\endgroup$
    – Peter Flom
    Commented Aug 19, 2013 at 10:56
  • $\begingroup$ Box plots are usually understood to show medians, quartiles, minimum and maximum, and often other details. They can be very useful, but they don't typically show means and standard deviations. $\endgroup$
    – Nick Cox
    Commented Aug 19, 2013 at 11:17
  • $\begingroup$ Thank you, Peter Flom and Nick Cox. I am comparing estimates of infection between two species. Both the "raw" and the estimated values point in the same direction, showing differences between the species. What I would like to know is which should be reported. If I report the "raw" data so that people can see what the data look like (with the Results text describing the statistics and how they were estimated by the model), then the estimated marginal means are never mentioned. Is that acceptable? $\endgroup$
    – Scientist
    Commented Aug 19, 2013 at 11:18
  • $\begingroup$ @Peter Flom's comment still applies; you don't seem to be taking it seriously. A question this unclear can't be fixed by extra comments, but needs rewriting with much more information. I can't follow what you are doing, but in general reporting raw data summaries and model fits is often a very good idea. $\endgroup$
    – Nick Cox
    Commented Aug 19, 2013 at 11:38
  • $\begingroup$ Typically at least some aspects of the raw data are included and some aspects of the model. More than that... well, it varies. $\endgroup$
    – Peter Flom
    Commented Aug 19, 2013 at 11:46

1 Answer

$\begingroup$

Summary statistics of the raw data should be reported by default. Whether they appear in tabular or graphical form depends on the complexity of the data and the cost of printing each table and illustration. In your case, if you have already reported the mean and SD, and the box plot doesn't show anything distinctive (e.g. no skewness, no visible outliers), then I'd give the box plot a lower priority for inclusion. If the box plot does reveal something that makes the mean and SD unrepresentative, then consider including both or, better yet, replacing the mean and SD with more robust summary statistics, such as the median and interquartile range.
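
To make the point about robust summaries concrete, here is a minimal sketch using only Python's standard library. The two species and their infection counts are invented purely for illustration (they are not the asker's data), and `summarize` is a hypothetical helper name. It shows how a single extreme value can pull the mean and SD away from what the median and IQR suggest.

```python
import statistics

def summarize(values):
    """Return mean/SD alongside the more robust median/IQR for one group."""
    # Quartiles by linear interpolation, treating the sample as inclusive of its extremes.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),   # sample SD (n - 1 denominator)
        "median": q2,
        "iqr": q3 - q1,
    }

# Hypothetical infection counts per host (illustrative only).
species_a = [2, 3, 3, 4, 5, 4, 3, 2]
species_b = [1, 2, 2, 3, 2, 1, 2, 40]   # contains one extreme value

for name, values in [("A", species_a), ("B", species_b)]:
    s = summarize(values)
    print(
        f"species {name}: n={s['n']}, mean={s['mean']:.2f}, sd={s['sd']:.2f}, "
        f"median={s['median']:.2f}, IQR={s['iqr']:.2f}"
    )
```

In this made-up example the mean for species B is far above its median, which is the kind of pattern a box plot would expose and which would argue for reporting the median and IQR instead of, or in addition to, the mean and SD.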

Tabulating or visualizing the regression results is a good idea, as long as the same information is not repeated verbatim in the text. It is also important to tie the table or illustration closely to the Results and Discussion sections.
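
As a rough illustration of why raw summaries and model-based estimates can both be worth showing, the sketch below compares a raw (pooled) mean with an equal-weight average of cell means in an unbalanced design. For a saturated linear model with both factors and their interaction, equally weighted marginal means reduce to this average; for a GLM or GLMM the estimates are usually formed on the link scale and back-transformed, so treat this only as arithmetic intuition. The second factor (`site`), the numbers, and the function names are all assumptions made up for the example.

```python
from statistics import mean

# Hypothetical infection intensities keyed by (species, site); illustrative only.
cells = {
    ("A", "wet"): [8, 9, 10, 9, 8, 10],  # species A sampled mostly at wet sites
    ("A", "dry"): [4, 5],
    ("B", "wet"): [7, 8],
    ("B", "dry"): [3, 4, 3, 4, 3, 4],    # species B sampled mostly at dry sites
}

def raw_mean(species):
    """Mean of all observations for a species, ignoring site."""
    pooled = [v for (sp, _), vals in cells.items() if sp == species for v in vals]
    return mean(pooled)

def marginal_mean(species):
    """Average of the per-site cell means, giving each site equal weight."""
    return mean(mean(vals) for (sp, _), vals in cells.items() if sp == species)

for sp in ("A", "B"):
    print(f"species {sp}: raw mean={raw_mean(sp):.3f}, "
          f"equal-weight marginal mean={marginal_mean(sp):.3f}")
```

With these invented numbers the two summaries agree on the direction of the species difference but not on its size, because the raw means also reflect where each species happened to be sampled. A table that lists both makes that visible to the reader.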

In a nutshell, clarity and efficiency always come first. Conventions vary from journal to journal and from reviewer to reviewer, so it is futile to look for a golden formula.

$\endgroup$
