
I am working on a project with a small sample size. I have several candidate predictors measured at baseline and one continuous outcome, and I am trying to see whether any of the predictors is a good predictor of the outcome score. The idea is to fit many simple linear regression models, each with a single predictor. We just want to see whether any predictor has potential, so the analysis is quite exploratory, but based on some theory we narrowed the variables down to a few.

Since I am fitting separate models (each technically testing its own hypothesis), do I need to correct for multiple testing? Why or why not?

  • More important is that you clearly convey the exploratory nature of the study (and understand the impact of the inflated false-positive rate). The real issue begins when what set out to be an exploratory study is later presented as a confirmatory one. Commented Jul 2 at 14:55

2 Answers


That's what happens in the classic XKCD about jellybeans.

[xkcd "Significant": the jellybean comic]

The scientists test whether one color predicts acne and find nothing. Then they test another color and find nothing. Then they test another color and find nothing. Then they test green jellybeans and find a link. Then they test another color and find nothing. Then they test another color and find nothing.

Depending on how low the p-value for green jellybeans actually was, a Bonferroni correction could have kept these scientists from erroneously concluding that green jellybeans are linked to acne.

It seems that you are doing much the same thing as the scientists in this cartoon, so I would suggest caution and correcting for multiple tests. Bonferroni is not the only option, however, and other procedures can be more powerful (in the formal sense of statistical power).
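
For concreteness, here is a minimal sketch of that kind of correction, using simulated data and made-up column names rather than anything from the question: one simple regression per candidate predictor, with the slope p-values then adjusted as a family. Holm is shown alongside Bonferroni because it is never less powerful.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.multitest import multipletests

    # Simulated stand-in for the real data: one outcome, five candidate predictors
    rng = np.random.default_rng(42)
    n = 30  # small sample, as in the question
    df = pd.DataFrame(rng.normal(size=(n, 6)),
                      columns=["outcome", "x1", "x2", "x3", "x4", "x5"])

    # One simple linear regression per predictor, collecting the slope p-values
    predictors = ["x1", "x2", "x3", "x4", "x5"]
    pvals = []
    for name in predictors:
        fit = sm.OLS(df["outcome"], sm.add_constant(df[[name]])).fit()
        pvals.append(fit.pvalues[name])

    # Adjust the family of p-values; Holm dominates plain Bonferroni in power
    for method in ["bonferroni", "holm"]:
        reject, adjusted, _, _ = multipletests(pvals, alpha=0.05, method=method)
        print(method, np.round(adjusted, 3), reject)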


Regarding corrections for multiple comparisons: A long time ago, Jacob Cohen wrote that "this is a question on which reasonable people can differ."

Some will say that corrections are never needed. But once you say they are sometimes needed, you have to ask what the family of tests is: all the tests of one hypothesis? All the tests in one set of regressions? In one paper? Across multiple papers? Every test you run in your life as a scientist?

And you should also recall that lowering the Type I error rate increases the Type II error rate. Sometimes a Type I error is worse, but sometimes a Type II error is worse.
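
To make that tradeoff concrete, here is a small simulation (my own sketch, with made-up numbers) estimating the power to detect a modest true correlation in a small sample at the usual alpha of 0.05 versus a Bonferroni-adjusted alpha for five tests. The stricter threshold buys a lower Type I error rate at the cost of noticeably lower power, i.e. more Type II errors.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, true_r, reps = 40, 0.4, 5000  # small sample, modest true correlation
    alphas = {"0.05 (uncorrected)": 0.05,
              "0.01 (Bonferroni for 5 tests)": 0.05 / 5}

    hits = {label: 0 for label in alphas}
    for _ in range(reps):
        x = rng.normal(size=n)
        y = true_r * x + np.sqrt(1 - true_r**2) * rng.normal(size=n)
        r, p = stats.pearsonr(x, y)          # test of a single predictor
        for label, a in alphas.items():
            hits[label] += p < a             # count rejections under each alpha

    for label in alphas:
        print(f"power at alpha = {label}: {hits[label] / reps:.2f}")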

Furthermore, what is a "good" predictor? Whatever it is, it is not the same as a statistically significant one. For example, if you have 1,000,000 observations, a very small effect will be statistically significant but may not be good in any practical sense.
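
A quick illustration of that point, again with simulated data rather than anything from the question: with a million observations, even a negligible slope comes out overwhelmingly "significant" while explaining essentially none of the variance.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 1_000_000
    x = rng.normal(size=n)
    y = 0.005 * x + rng.normal(size=n)  # true slope is tiny relative to the noise

    fit = sm.OLS(y, sm.add_constant(x)).fit()
    print(f"slope p-value: {fit.pvalues[1]:.2g}")  # typically far below 0.05
    print(f"R-squared:     {fit.rsquared:.6f}")    # essentially zero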

Going a little bit Bayesian, you also have to consider how plausible each hypothesis is to begin with.

Finally, and more generally, I don't think there is one rule that fits all situations. You have to think about it, and you have to be prepared to justify your decision.

