15

I understand that it's only reasonable to fail to reject the null hypothesis, not to accept it. But why, then, are we allowed to accept the alternative hypothesis?

What's the difference?
  • 5
    Rejecting the null hypothesis could be written up as accepting the alternative. Many people would rather not say that. Many people would rather focus on confidence intervals! Or something Bayesian.
    – Nick Cox
    Commented Aug 31, 2022 at 18:39
  • 6
    Absence of evidence is not evidence of absence!
    – Ben
    Commented Aug 31, 2022 at 18:39
  • 4
    @Ben many people would disagree: philosophy.stackexchange.com/questions/92546/…
    – fblundun
    Commented Aug 31, 2022 at 21:57
  • 3
    I am sorry, I don't know which answer to accept because I can't understand any of them.
    – user900476
    Commented Sep 1, 2022 at 13:31
  • 3
    @user900476 You are in no way obliged to accept any answer as long as none is satisfactory: stackoverflow.com/help/someone-answers You may, however, consider clarifying what would define a better answer.
    – Bernhard
    Commented Sep 2, 2022 at 9:25

7 Answers

15

I'll start with a quote for context and to point to a helpful resource that might have an answer for the OP. It's from V. Amrhein, S. Greenland, and B. McShane. Scientists rise up against statistical significance. Nature, 567:305–307, 2019. https://doi.org/10.1038/d41586-019-00857-9

We must learn to embrace uncertainty.

I understand it to mean that there is no need to state that we reject a hypothesis, accept a hypothesis, or don't reject a hypothesis to explain what we've learned from a statistical analysis. The accept/reject language implies certainty; statistics is better at quantifying uncertainty.

Note: I assume the question refers to making a binary reject/accept choice dictated by the significance (P ≤ 0.05) or non-significance (P > 0.05) of a p-value P.

The simplest way to understand hypothesis testing (NHST) — at least for me — is to keep in mind that p-values are probabilities about the data, not about the null and alternative hypotheses: a large p-value means the data are consistent with the null hypothesis; a small p-value means the data are inconsistent with it. (For continuous data, the p-value is the probability of observing data at least as extreme as what we actually observed.) NHST doesn't tell us which hypothesis to reject and/or accept so that we have 100% certainty in our decision: hypothesis testing doesn't prove anything٭. The reason is that a p-value is computed by assuming the null hypothesis is true [3].
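
To make this concrete, here is a minimal simulation sketch (Python with NumPy/SciPy; the two-sample t-test setting and sample sizes are assumptions for illustration, not anything from the question). It shows that when the null hypothesis is true by construction, p-values are statements about the data: they are uniformly distributed, so P ≤ 0.05 still happens about 5% of the time even though there is nothing to discover.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n = 10_000, 30
pvals = np.empty(n_sims)
for i in range(n_sims):
    x = rng.normal(0, 1, n)  # both samples come from the same distribution,
    y = rng.normal(0, 1, n)  # so the null hypothesis is true by construction
    pvals[i] = stats.ttest_ind(x, y).pvalue

print(f"P(p <= 0.05 | null true) ~ {np.mean(pvals <= 0.05):.3f}")  # ~0.05
```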

So rather than wondering whether, on calculating P ≤ 0.05, it's correct to declare that you "reject the null hypothesis" (technically correct) or "accept the alternative hypothesis" (technically incorrect), don't make a reject/don't-reject determination at all. Instead, report what you've learned from the data: the p-value or, better yet, your estimate of the quantity of interest and its standard error or confidence interval.
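
As an illustration of that style of reporting, here is a sketch with simulated data (the group sizes, effect, and pooled degrees of freedom are simplifying assumptions of mine):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
treated = rng.normal(0.3, 1.0, 50)  # hypothetical treatment group
control = rng.normal(0.0, 1.0, 50)  # hypothetical control group

diff = treated.mean() - control.mean()
se = np.sqrt(treated.var(ddof=1) / len(treated)
             + control.var(ddof=1) / len(control))
df = len(treated) + len(control) - 2  # pooled df; Welch's would be more careful
half_width = stats.t.ppf(0.975, df) * se
print(f"difference = {diff:.2f}, SE = {se:.2f}, "
      f"95% CI = ({diff - half_width:.2f}, {diff + half_width:.2f})")
```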

٭ Probability ≠ proof. For illustration, see this story about a small p-value at CERN leading scientists to announce they might have discovered a brand new force of nature: New physics at the Large Hadron Collider? Scientists are excited, but it’s too soon to be sure. Includes a bonus explanation of p-values.

References

[1] S. Goodman. A dirty dozen: Twelve p-value misconceptions. Seminars in Hematology, 45(3):135–140, 2008. https://doi.org/10.1053/j.seminhematol.2008.04.003

All twelve misconceptions are important to study, understand, and avoid. But misconception #12 is particularly relevant to this question: it is not the case that "a scientific conclusion or treatment policy should be based on whether or not the P value is significant".

Steven Goodman explains: "This misconception (...) is equivalent to saying that the magnitude of effect is not relevant, that only evidence relevant to a scientific conclusion is in the experiment at hand, and that both beliefs and actions flow directly from the statistical results."

[2] Using p-values to test a hypothesis in Improving Your Statistical Inferences by Daniël Lakens.

This is my favorite explanation of p-values, their history, theory and misapplications. Has lots of examples from the social sciences.

[3] What is the meaning of p values and t values in statistical tests?

  • 1
    @whuber When did p-values and frequentist statistics start conditioning on the data? If we want to condition on the data, then we go Bayesian. But I'll edit my answer with some pointers to papers.
    – dipetkov
    Commented Aug 31, 2022 at 19:41
  • 2
    I did not mean "condition" in the sense of assuming a probability distribution for the parameters; only that we are, of course, not making decisions about the hypotheses in vacuo but are basing them on the data. What you write here appears to fly in the face of all the literature on hypothesis testing. Why, after all, would anyone even bother if it weren't for the prospect that the test could tell us something about the state of nature?
    – whuber
    Commented Aug 31, 2022 at 19:45
  • 3
    I haven't complained about any imprecision. As you are gently hinting, I have been imprecise in these comments myself. My original concern was that your statements about interpreting NHSTs looked wrong.
    – whuber
    Commented Aug 31, 2022 at 21:23
  • 3
    For continuous data the probability of observing what we observed is zero, so the p-value is the probability of observing something at least as extreme as our observed data. So the above answer needs to be more nuanced.
    Commented Sep 1, 2022 at 15:42
  • 2
    What does 'et al.' mean? Could you change that into simple English to make your answer less intimidating?
    Commented Sep 1, 2022 at 20:13
8

Say you have the hypothesis

"on Stack Exchange there is not yet an answer to my question"

If you randomly sample 1000 questions, you might find zero answers. Based on this, can you 'accept' the null hypothesis?
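
To put a number on it: with k = 0 'successes' in n = 1000 samples, an exact one-sided 95% upper confidence bound for the answer rate p comes from solving (1 − p)^n = 0.05 (the Clopper–Pearson bound, which for k = 0 reduces to the "rule of three"). A minimal sketch:

```python
# k = 0 answers in n = 1000 sampled questions: the exact one-sided 95%
# upper bound on the answer rate p solves (1 - p)**n = 0.05.
n, alpha = 1000, 0.05
upper = 1 - alpha ** (1 / n)
print(f"95% upper bound on p: {upper:.4f}")  # ~0.003, i.e. roughly 3/n
# The data are consistent with p = 0, but also with p up to ~0.3%:
# failing to find an answer is not proof that none exists.
```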


You can read about this among many older questions and answers on this site.

Also check out the questions about two one-sided tests (TOST), which are about formulating the statement behind a null hypothesis in such a way that it becomes a statement you can potentially 'accept'.
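
For a flavour of TOST, here is a minimal sketch (Python/SciPy; the data and the equivalence bounds are made-up assumptions of mine, and in practice the bounds must be justified on subject-matter grounds). Equivalence is claimed only if both one-sided nulls are rejected:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(0.05, 1.0, 200)
y = rng.normal(0.00, 1.0, 200)
low, upp = -0.3, 0.3  # hypothetical equivalence bounds

diff = x.mean() - y.mean()
se = np.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
df = len(x) + len(y) - 2

p_lower = stats.t.sf((diff - low) / se, df)   # H0: diff <= low
p_upper = stats.t.cdf((diff - upp) / se, df)  # H0: diff >= upp
p_tost = max(p_lower, p_upper)
print(f"TOST p = {p_tost:.4f}; equivalence claimed: {p_tost < 0.05}")
```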


More seriously, a problem with the question is that it is unclear. What does 'accept' actually mean?

It is also a loaded question: it asks about something that is not true, like 'why is it that the earth is flat, but the moon is round?'.

There is no 'acceptance' of an alternative theory. Or at least, when we 'accept' some alternative hypothesis then either:

  • Hypothesis testing: the alternative theory is extremely broad and reads as 'something else than the null hypothesis is true'. Whatever this 'something else' means, that is left open. There is no 'acceptance' of a particular theory. See also: https://en.m.wikipedia.org/wiki/Falsifiability
  • Expression of significance: or 'acceptance' means that we observed an effect and consider it a 'significant' effect. There is no literal 'acceptance' of some theory/hypothesis here. There is just the finding that the data show some effect, and that this effect is significantly different from what we would see if there were zero effect. Whether this means that the alternative theory should be accepted is not explicitly stated and should not be assumed implicitly either. The alternative hypothesis (related to the effect) works for the present data, but that is different from being accepted (it just has not been rejected yet).
8

In addition to the answers given by highly experienced users here, I'd like to offer a less formal and hopefully more intuitive view.

Briefly, the "null hypothesis" is considered accepted, unless there is some compelling evidence to reject it in favour of an alternative.

It helps to look at it from the decision-making perspective. Tests---not only statistical---help us make decisions. Before performing the test, we have one course of action. After performing the test, we may either keep the course or change it, depending on the test result. The null hypothesis is the default course of action, given no or not enough information.

For example, imagine you are flying an aeroplane. Without a reason to do otherwise, you'll probably fly straight towards your destination. But the whole time you'd be performing "tests", like checking your radar to see whether there is some unexpected obstacle in your path. If the radar shows no obstacle, you'll keep your course. This is the default decision, which you'd most likely make even if you had to fly without a radar. I mean, what else could you do? Wildly zigzag through the sky?

In this analogy, the null hypothesis is that there is no reason to change course. You don't "accept" it as a result of the test, because it had already been accepted before you took a look at the radar. Only if you discover an obstacle would you reject it in favour of changing course.

Or, as a more real-world example, imagine developing a new drug for a disease. The default status, before you perform any trials at all, is that the drug is not approved. You may run in vitro, in vivo, and clinical trials to prove that your drug is safe and helpful. If that fails, the drug remains "not approved". Again, there is nothing to "accept", or at least nothing with practical consequences. Only with compelling evidence of the drug's usefulness can its status change to "approved".

As you can see from the examples, which hypothesis is treated as "null" is somewhat subjective. For example, is "homeopathy works" the null, or does it need evidence to be accepted? That depends on your prior beliefs and experience. If you grew up in a homeopathic home, you are likely to consider it to work by default and wouldn't change your mind unless you saw strong evidence against it (or maybe ever). But this can get arbitrarily philosophical / psychological.

  • 1
    (+1: "answers given by highly experienced users" – you aren't exactly a spring chicken on this website yourself...)
    – usεr11852
    Commented Sep 2, 2022 at 14:52
  • Your plane example seems to have little to do with hypothesis testing and much to do with estimation and optimization. Can you explain how rejecting the null hypothesis of "flying straight" points out the direction in which to fly instead?
    – dipetkov
    Commented Sep 2, 2022 at 20:09
  • @dipetkov It doesn't, much like many statistical tests - think of a two-sided t-test. The direction then needs to be decided based on further information. But, if we failed to reject the null hypothesis, we wouldn't bother collecting further information.
    – Igor F.
    Commented Sep 8, 2022 at 7:55
  • Hm. While in a plane in the middle of the sky? I'll probably bother collecting further information. By the way, I like your example because it illustrates (it seems to me) that usually we'd like to learn more than what NHST can give us.
    – dipetkov
    Commented Sep 8, 2022 at 8:29
  • Furthermore, "wouldn't bother collecting further information" means that you've accepted the null hypothesis. Not rejecting the null hypothesis means that you acknowledge other hypotheses are still in play. That would correspond to "keep going straight for now while collecting further information". In practice the more natural formulation is to ask: "What's the optimal direction to be flying in right now?"
    – dipetkov
    Commented Sep 8, 2022 at 9:54
5

We should not accept the research/alternative hypothesis.

The main value of a null hypothesis statistical test is to help the researcher adopt a degree of self-skepticism about their research hypothesis. The null hypothesis is the hypothesis we need to nullify in order to proceed with promulgation of our research hypothesis. It doesn't mean the alternative hypothesis is right, just that it hasn't failed a test - we have managed to get over a (usually fairly low) hurdle, nothing more. I view this a little like naive falsificationism - we can't prove a theory, only disprove it†, so all we can say is that a theory has survived an attempt to refute it. IIRC Popper says that the test "corroborates" a theory, but this is a long way short of showing it is true (or accepting it).

A good example of this is the classic XKCD cartoon (see this question):

[XKCD #1132, "Frequentists vs. Bayesians": a neutrino detector reports that the sun has gone nova, but it lies whenever two thrown dice both come up six. The frequentist rejects the null hypothesis since p = 1/36 < 0.05; the Bayesian bets the sun hasn't exploded.]

Is it reasonable for the frequentist to "accept" the alternative hypothesis that the sun has gone nova? No!!! In this case, the most obvious reason is that the analysis doesn't consider the prior probabilities of the two hypotheses, which a frequentist would do by setting a much more stringent significance level. But also there may be explanations for the neutrinos that have nothing to do with the sun going nova (perhaps I have just come back from a visit to the Cretaceous to see the dinosaurs, and you've detected my return to this timeline). So rejecting the null hypothesis doesn't mean the alternative hypothesis is true.
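
Putting rough numbers on the cartoon (a sketch; the prior is an assumed value for illustration): the detector lies only when two dice both come up six, so P(detector says "nova" | no nova) = 1/36 < 0.05, and the frequentist "rejects". A Bayesian instead updates the tiny prior odds of a nova by the likelihood ratio:

```python
p_yes_given_nova = 35 / 36     # detector tells the truth unless both dice are six
p_yes_given_no_nova = 1 / 36   # detector lies: both dice came up six
prior_nova = 1e-8              # assumed: a nova of our sun right now is very improbable

posterior_odds = (prior_nova / (1 - prior_nova)) * (p_yes_given_nova / p_yes_given_no_nova)
posterior_nova = posterior_odds / (1 + posterior_odds)
print(f"P(nova | detector says yes) ~ {posterior_nova:.2e}")  # ~3.5e-07
```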

A frequentist analysis fundamentally cannot assign a probability to the truth of a hypothesis, so it doesn't give much of a basis for accepting one. "We reject the null hypothesis" is basically an incantation in a ritual. It doesn't literally mean that we are discarding the null hypothesis because we are confident that it is false; it is just a convention that we proceed with the alternative hypothesis if we can "reject" the null hypothesis. There is no mathematical requirement that the null hypothesis be wrong. This isn't necessarily a bad thing; it is just best to take it as technical jargon and not read too much into the actual words.

Unfortunately the semantics of Null Hypothesis Statistical Tests are rather subtle, and often not a direct answer to the question we actually want to pose, so I would recommend just saying "we reject the null hypothesis" or "we fail to reject the null hypothesis" and leaving it at that. Those who understand the semantics will draw the appropriate conclusion. Those who don't understand the semantics won't be misled into thinking that the alternative hypothesis has been shown to be true (by accepting it).

† Sadly, we can't really disprove them either.

  • 1
    +1 I think this is the best answer. Hopefully the OP will revisit the question and let us know whether he/she/they understood it.
    – dipetkov
    Commented Sep 7, 2022 at 7:22
4

The answer depends on whether you are using a pre-defined critical value (or p-value threshold like p<0.05) in a hypothesis test that yields a decision (a Neyman–Pearsonian hypothesis test), or whether you are using the magnitude of the actual p-value as an index of the evidence in the data (a [neo-]Fisherian significance test).

If you are doing a hypothesis test then you are working with a set of rules that grant you a pre-set confidence in the long-run performance of the test procedure. The rules provide that confidence by specifying which decision applies for any given data, and the decision relates to the acceptance or non-acceptance (yes, that is rejection as far as I am concerned) of the null hypothesis. Rejection of the statistical null hypothesis can be thought of as acceptance of another hypothesis, but that other hypothesis can be nothing more than the set of all not-the-null hypotheses that exist within the statistical model. Accepting that not-the-null hypothesis is not very informative, and so it is not unreasonable to simply say that the test rejects the null but does not accept anything else.

There is (sometimes) a specific 'alternative' hypothesis specified for a hypothesis test: the hypothetical effect size plugged into the pre-experiment power analysis used to set the sample size. That 'alternative' hypothesis IS NOT tested by the hypothesis test and has very little meaning once the data are available.
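
For instance, a hypothesised effect size plugged into a standard power calculation determines the sample size, and then plays no further role in the test itself. A normal-approximation sketch (the planning values are assumptions of mine):

```python
from scipy import stats

d, alpha, power = 0.5, 0.05, 0.80  # assumed planning values: effect in SDs, error rates
z_a = stats.norm.ppf(1 - alpha / 2)
z_b = stats.norm.ppf(power)
# approximate per-group n for a two-sample comparison of means
n_per_group = 2 * (z_a + z_b) ** 2 / d ** 2
print(f"~{n_per_group:.0f} per group")  # ~63
```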

If you are doing a significance test then a small p-value implies that the data are inconsistent with the statistical model's expectations regarding probable observations where the null hypothesis is true. The analyst can then use that evidence to make a scientific inference. The scientific inference might well include an interim rejection of the statistical null hypothesis and acceptance of a specific 'alternative' hypothesis of scientific interest. It depends on the information available and the scientific objectives, and it is a process that is very rarely considered in statistical instruction.

See this open access chapter for much more detail: https://link.springer.com/chapter/10.1007/164_2019_286

  • Thank you for the reference! Could you please clarify this: the "pre-set confidence of long-run performance" is assuming the null hypothesis is true, right? But if there is a possibility the null hypothesis is false, can we really say anything?
    – Mankka
    Commented Sep 1, 2022 at 8:27
  • @Mankka The pre-set confidence concerns the long-run false positive error rate. Can we say anything? Well, within the Neyman–Pearsonian framework you cannot say anything about the particular hypothesis of concern because that framework deals only with the global error rates. The Fisherian significance test does say things about the particular experiment. That's the difference.
    Commented Sep 1, 2022 at 20:04
1

Within the Bayesian framework you can "accept the null hypothesis" in the sense that the posterior probability of a point null hypothesis can tend to one with increasing sample size. This requires that the null hypothesis is exactly true and that you're willing to represent this in your prior by a point mass. Lindley (1957, p. 188) gives two examples where this is arguably reasonable: testing for linkage in genetics, and testing someone for telepathic powers. In addition, your prior on the parameter of interest must be proper under the alternative hypothesis. See for example this answer.
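
A minimal sketch of this mechanism (normal data with known unit variance; the prior scale tau and the 50/50 prior mass on the point null are assumptions for illustration). The posterior probability of the point null is computed from the two marginal likelihoods of the sample mean:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
tau = 1.0  # assumed prior scale for theta under the alternative H1
for n in (10, 100, 10_000, 1_000_000):
    xbar = rng.normal(0.0, 1.0, n).mean()  # data generated with theta = 0 (H0 exactly true)
    m0 = stats.norm.pdf(xbar, 0.0, np.sqrt(1.0 / n))           # marginal of xbar under H0
    m1 = stats.norm.pdf(xbar, 0.0, np.sqrt(tau**2 + 1.0 / n))  # marginal of xbar under H1
    print(n, m0 / (m0 + m1))  # posterior P(H0 | data); tends to 1 as n grows
```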

1

"Absence of evidence is not evidence of absence." Carl Sagan.

The null hypothesis specifies no effect, that is, absence of an effect. You reject the null if the results are statistically significant, that is, when you have evidence against it. If the results are not statistically significant, what you have is absence of evidence.
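
A quick simulation of that asymmetry (a sketch; the effect size and group sizes are made-up assumptions): with a true but modest effect and small samples, most replications come out non-significant, so p > 0.05 plainly isn't evidence that the effect is absent.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n_sims, n = 10_000, 15  # small, underpowered groups (assumed for illustration)
pvals = np.array([
    stats.ttest_ind(rng.normal(0.3, 1, n), rng.normal(0.0, 1, n)).pvalue
    for _ in range(n_sims)
])
print(f"share of p > 0.05 despite a real effect: {np.mean(pvals > 0.05):.2f}")  # ~0.87
```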

