$\begingroup$

I'm using multiple linear regression models and am wondering what the best way is to report a model in a scientific manuscript. I see so many versions. Should I go with the information from summary() or from anova()? If I use anova(), I get the df, sums of squares, etc., but each categorical variable gets only one row, whereas summary() gives me one estimate and p-value per level of the categorical variable. Which is preferred in the stats community?
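
To illustrate what I mean, here is a minimal sketch with simulated data. The variable names, factor levels, and effect sizes are all made up for the example and are not my real data:

```r
set.seed(1)

# Simulated data: all names and values below are hypothetical
d <- data.frame(
  pesticide   = factor(rep(c("control", "low", "high"), each = 20),
                       levels = c("control", "low", "high")),
  temperature = rep(c(15, 20, 25, 30), times = 15)
)
d$growth <- 10 - 1.5 * (d$pesticide == "high") + 0.2 * d$temperature +
  rnorm(nrow(d))

fit <- lm(growth ~ pesticide + temperature, data = d)

anova(fit)    # one row (Df = 2, one F test) for the whole pesticide factor
summary(fit)  # one estimate and p-value for each non-reference pesticide level
```

Here anova() reports pesticide as a single term, while summary() lists pesticidelow and pesticidehigh separately, each compared against the control level.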

I'm a biologist working with R.

$\endgroup$
  • $\begingroup$ What information do you want to learn from your regression? $\endgroup$
    – Dave
    Commented Jun 20, 2023 at 8:54
  • $\begingroup$ In my results, I report whether I find a positive or negative effect of a variable (positive or negative estimate) and whether the variable has a significant p-value, and I report the p-value along with the test statistic. But then I want to add a table of the full model, prior to backward selection, to show full transparency. Since I'm not a statistician, I don't fully comprehend what is needed to show full transparency. Does that make sense? $\endgroup$ Commented Jun 20, 2023 at 9:01
  • $\begingroup$ $1)$ But what do you want to learn from your model? $//$ $2)$ Stepwise selection invalidates downstream inferences. Why are you using a stepwise procedure? $\endgroup$
    – Dave
    Commented Jun 20, 2023 at 9:07
  • $\begingroup$ 1) I want to learn whether pesticide and temperature (the two variables in my experiment) have any effect on the biological endpoints I'm measuring (e.g. survival, growth, egg production) in a soil organism. I'm not using the regressions to make predictions about effects at, e.g., temperatures outside my range, but to say whether pesticide or temperature explains variance in my data, i.e. affects the response. 2) I learned this method in a statistics course, and it's the one I know. I'm using it to remove variables that do not improve the model and to find the simplest one. $\endgroup$ Commented Jun 20, 2023 at 9:16
  • $\begingroup$ The trouble with using stepwise variable selection is that the usual calculations do not account for the selection having been performed. This is why you get falsely narrow confidence intervals and the other forms of invalid downstream inference that are discussed in the link I posted (which is on a Stata website but has nothing to do with Stata software). $\endgroup$
    – Dave
    Commented Jun 20, 2023 at 15:41
