$\begingroup$

I submitted the following table as part of a manuscript for peer review, and the reviewers' comments showed that it confused them. In our study we measured a pre-score and a post-score as DVs. Our predictors were two dummy variables; a "Theme" factor variable with 5 levels (GEF, XYZ, ABC, DEF, and MNO); two continuous variables at level one (variables 1 and 2); and two continuous variables at level two (variables 3 and 4).

The reviewers complained that our regression table mysteriously dropped a theme. This is because the way R reports factors is by using one theme as a reference level and then including the other levels in the regression output. That is what I reflected in the table with a note that says that "Theme GEF is the reference theme."
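To make the "dropped" level concrete, here is a minimal sketch in plain Python of the treatment (dummy) coding described above. It is not R's own implementation; the function name `treatment_code` and the `Theme<level>` column labels are assumptions chosen to mimic how the rows are labelled in R output.

```python
# Levels of the Theme factor, with the reference level listed first.
levels = ["GEF", "XYZ", "ABC", "DEF", "MNO"]
reference = levels[0]

def treatment_code(value, levels, reference):
    """Return one 0/1 indicator per non-reference level (hypothetical helper)."""
    if value not in levels:
        raise ValueError(f"unknown theme: {value}")
    return {f"Theme{lvl}": int(value == lvl) for lvl in levels if lvl != reference}

# The reference theme is the row where every indicator is 0, so it has
# no coefficient of its own: it is absorbed into the intercept.
for theme in levels:
    print(theme, treatment_code(theme, levels, reference))
```

Five levels yield only four indicator columns, which is why four Theme rows appear in the table.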

I'm looking for suggestions on how to better format this table so future reviewers will understand that there is a reference theme and that is why it doesn't show up in the table. I am considering converting the 5 themes into separate dummy variables, as that would cause them to all show up in the table. But, that is more work and defeats the purpose of having factors. Any suggestions are appreciated.

(I've simplified the variable names so that this question doesn't rely on knowledge of our domain.)

[Image: the regression table described above]

$\endgroup$

1 Answer

$\begingroup$

These tables are often confusing, as you can tell from many questions on this site. Unless you have some experience with the treatment/dummy coding that R uses by default, and so appreciate that the intercept is the estimated outcome when all predictors are at their reference or 0 levels and that each regression coefficient represents a difference in outcome from what the intercept and lower-level coefficients predict, it's very easy to get lost.

That's compounded here by the presentation of coefficients for 4 individual levels of a 5-level categorical Theme predictor. Yes, that's the way that R reports results. But the value and apparent "significance" of each of those coefficients in the display depend on your choice of reference level: each coefficient represents the difference between that particular level and the reference level. Your audience would be better served by a measure of whether there are any differences associated with Theme at all, for example a likelihood-ratio test comparing models that include versus exclude Theme, or a Wald test on all the associated coefficients in this model.
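As a rough illustration of that dependence, the sketch below uses made-up scores and the simplest possible case, a model with Theme as the only predictor. In that special case the treatment-coded intercept is the reference-group mean and each coefficient is a group mean minus the reference-group mean. The data and the helper name `theme_coefficients` are assumptions for illustration only, not values from the manuscript.

```python
from statistics import mean

# Made-up outcome scores per theme, purely for illustration.
scores = {
    "GEF": [10.0, 12.0],
    "XYZ": [13.0, 15.0],
    "ABC": [11.0, 11.0],
    "DEF": [16.0, 18.0],
    "MNO": [9.0, 11.0],
}

def theme_coefficients(scores, reference):
    """Intercept and treatment-coded coefficients for a Theme-only model."""
    means = {theme: mean(values) for theme, values in scores.items()}
    intercept = means[reference]
    # Each coefficient is the difference from the reference-group mean.
    coefs = {t: m - intercept for t, m in means.items() if t != reference}
    return intercept, coefs

# Same data, two different reference levels: every reported number changes.
for ref in ("GEF", "DEF"):
    intercept, coefs = theme_coefficients(scores, ref)
    print(f"reference={ref} intercept={intercept} coefficients={coefs}")
```

The difference between any two themes (say XYZ minus ABC) comes out the same under either reference level; only the individual rows of the table change. That is the reason an overall test of Theme is more informative than the row-by-row p-values.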

At the least, remind the reader exactly what the Intercept represents: here, not only the reference Theme but also 0 levels of Dummy A and Dummy B and of the continuous predictors. You could spell that out explicitly instead of just saying "Intercept." Also say in a footnote that the coefficients are the differences from the Intercept value in this case.

More intuitively, use a post-regression tool to display estimates of actual outcome values and errors for illustrative combinations of predictor values. The emmeans package is one popular choice. Focus your attention on those illustrations, with the coefficient table included in your report as necessary background.
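The arithmetic behind such estimates can be sketched in a few lines of Python. This only shows how point estimates follow from a coefficient table; it is not how emmeans works internally and it produces no standard errors. Every number and name below (`coef`, `predict`, the `Var1` slope) is hypothetical.

```python
# Hypothetical coefficient table; none of these numbers come from the manuscript.
coef = {
    "(Intercept)": 50.0,  # Theme GEF, Dummy A = 0, Dummy B = 0, Var1 = 0
    "ThemeXYZ": 2.0,
    "ThemeABC": -1.0,
    "ThemeDEF": 3.0,
    "ThemeMNO": 0.5,
    "DummyA": 1.5,
    "DummyB": -0.5,
    "Var1": 0.25,
}
themes = ["GEF", "XYZ", "ABC", "DEF", "MNO"]  # GEF is the reference level

def predict(coef, theme, dummy_a=0, dummy_b=0, var1=0.0):
    """Point estimate of the outcome for one combination of predictor values."""
    if theme not in themes:
        raise ValueError(f"unknown theme: {theme}")
    # The reference theme has no row in the table, so it contributes 0.
    theme_effect = coef.get(f"Theme{theme}", 0.0)
    return (coef["(Intercept)"] + theme_effect
            + coef["DummyA"] * dummy_a
            + coef["DummyB"] * dummy_b
            + coef["Var1"] * var1)

# One estimated outcome per theme, all other predictors held at 0.
for theme in themes:
    print(theme, predict(coef, theme))
```

A display like this has a row for every theme, including the reference theme, which may be easier for reviewers to read than a table of differences.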

One final thought: was there some reason why you did completely different models for pre and post observations? One might more typically do a combined model with pre/post as a predictor and interactions of that predictor with the other predictors to evaluate changes in associations with pre/post.

$\endgroup$
  • $\begingroup$ Thanks for the thoughtful feedback. You've given me some ideas to play with. Do you think it would be worthwhile to convert the 5 themes into dummy variables before running the regression, to make them clearer for reviewers? (R is not the stats lingua franca of my field.) Regarding interacting the pre/post scores in one regression model: that is how I originally had it set up, but it was challenging to interpret and added many more rows to the table. One of my advisors suggested breaking it into two models to make it more digestible. $\endgroup$
    – Kevin T
    Commented Jul 22, 2022 at 15:27
