$\begingroup$

I'm trying to solve a question from an introductory textbook on statistics. I am to use the Sign Test to determine whether there is significant evidence that the median $M$ of a dataset is "at least" 20 (which I understand to mean explicitly $\geq 20$, so as to include $20$), and I need to calculate the corresponding P-value.

So far, the textbook has worked with null-hypotheses in the form of equalities (as in $H_0: M = 20$ vs $H_A: M > 20$ for example). But I have understood that this equality is not exact and in the previous example would be equivalent to something along the lines of $H_0: M \leq 20$ vs $H_A: M > 20$, where (paraphrased in my own understanding) the "worst case" occurs at $20$ and that is why we compute $M=20$.

Using that, and noting that for the dataset I have $n = 10$ and $S_{obs} = 9$ (the critical value), and thinking that I must include "$=$" in the null-hypothesis, I defined my hypotheses as:

$H_0: M = 20$ (aka $H_0: M \geq 20$) vs $H_A: M < 20$.

For this I got the p-value $0.9990234$ (as in $p = P\{S \leq 9\}$ being computed using `pbinom(9, 10, 0.5)` in R). I understand this is significant evidence to accept $H_0$ ("there is no evidence against the median being 20 or greater than 20", which I took to be equivalent to something like "according to the evidence we do have, the median is $\geq$ 20").

But the solution key provides the p-value as $p = 0.0107$. The key also gives the solution for this as being given by `SIGN.test(data21a, md=20, alternative="greater")`, which I understand would entail having hypotheses $H_0: M \leq 20$ vs $H_A: M > 20$. And in that case the p-value would be $p = P\{S \geq 9\} = 1 - P\{S \leq 8\}$ or `1 - pbinom(8, 10, 0.5)`.

I'm quite confused regarding this. Can you generally ignore the equality case in a median Sign Test of $\geq$? Shouldn't the results be equal to the inverse probability of each other or is this a bad interpretation?

EDIT: I'd like to add that the original problem in the textbook is about somebody needing an internet provider that can deliver a median speed of "at least" 20 Mbps. So I understood the only undesirable case is if the sample provides evidence that "they are not able to provide a median speed of at least 20 Mbps". The median speed being exactly 20 seems perfectly acceptable.

EDIT 2: I also thought about it more and I realise that the p-value equation is basically the standard binomial probability of "x successes over n trials". I computed the critical value $S_{obs}=9$ per the textbook by counting how many sample elements are $> 20$. When I compute the right-tail probability $P\{S \geq S_{obs}\}$ I understand that I am computing the "probability of seeing $\geq S_{obs}$ values that are $>20$ over $10$ trials". So I imagine you could adapt the problem to take a critical value $S_{obs}'$ as "the number of items $\geq 20$" and compute the "probability of seeing values that are $\geq 20$ over $10$ trials", which would then be the desired p-value for this question?

$\endgroup$

1 Answer

$\begingroup$

> the original problem in the textbook is about somebody needing an internet provider that can deliver a median speed of "at least" 20 Mbps

suggests to me that $H_0$ should be $M \ge 20$ and $H_A$ should be $M < 20$, rather than $H_0: M \le 20$ and $H_A: M > 20$. But this does not really matter in the context of your question, which is about a continuous distribution and its relationship to a one-sided test based on a symmetric binomial random variable.

If you test $10$ times, the probability under $H_0$ that all $10$ tests are the "wrong" side of $20$ is $\le \frac{1}{2^{10}} = 0.0009765625$ (your $0.9990234$ is the complement of this), while the probability that exactly $9$ tests are the "wrong" side of $20$ is $\le \frac{10}{2^{10}} = 0.009765625$. Each $\le$ becomes an equality when $M=20$.
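To spell out where these two bounds come from, here is a short sketch. It assumes independent tests and a continuous distribution (so no observation equals $20$ exactly), and it introduces $p$, a symbol not used above, for the probability that a single test lands on the "wrong" side of $20$; under $H_0$ this gives $p \le \tfrac12$. Writing $S$ for the number of the $10$ tests on the "wrong" side:

$$\begin{aligned}
P(S = 10) &= p^{10} \le \frac{1}{2^{10}}, \\
P(S = 9) &= \binom{10}{9}\, p^{9}(1-p) \le \frac{10}{2^{10}}.
\end{aligned}$$

The second bound holds because $p^{9}(1-p)$ is increasing in $p$ on $[0, \tfrac{9}{10}]$, so over $p \le \tfrac12$ it is largest at $p = \tfrac12$.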

So if you test $10$ times and observe $9$ tests the "wrong" side of $20$, the $p$-value is the sum of those, i.e. $\frac{11}{2^{10}}=0.0107421875$, as given in the solution key.
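As a quick numerical check of these figures, here is a minimal R sketch. It assumes, as in the question, $n = 10$ untied observations of which $9$ fall on the "wrong" side of $20$; the variable names are illustrative only, and `binom.test` (base R's exact binomial test) is used as a stand-in for the `SIGN.test` call in the solution key, not as a claim about how that function is implemented.

```r
n     <- 10  # number of untied observations (taken from the question)
s_obs <- 9   # observed count on the "wrong" side of 20

# Upper-tail p-value P(S >= 9) for S ~ Binomial(10, 0.5): 11/1024
p_upper <- 1 - pbinom(s_obs - 1, n, 0.5)

# The same tail obtained by summing the point probabilities P(S = 9) + P(S = 10)
p_sum <- sum(dbinom(s_obs:n, n, 0.5))

# Cross-check with base R's exact binomial test (one-sided, "greater")
p_exact <- binom.test(s_obs, n, p = 0.5, alternative = "greater")$p.value

# Lower-tail probability P(S <= 9) computed in the question: 1023/1024
p_lower <- pbinom(s_obs, n, 0.5)

print(c(p_upper = p_upper, p_sum = p_sum, p_exact = p_exact, p_lower = p_lower))

# Both tails include the outcome S = 9, so together they exceed 1 by P(S = 9)
overlap <- p_upper + p_lower - 1
print(all.equal(overlap, dbinom(s_obs, n, 0.5)))
```

Because both $P\{S \le 9\}$ and $P\{S \ge 9\}$ contain the outcome $S = 9$, the two numbers are not complements of each other; the complement of $0.9990234$ is $P\{S = 10\}$, not the $p$-value in the key.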

The corresponding $p$-value for testing $10$ times and observing $8$ tests the "wrong" side of $20$ would have been $\frac{56}{2^{10}}=0.0546875 > 0.05$, which is why $9$ is the critical value for a one-sided test of $H_0$ at the $5\%$ significance level.
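The same reasoning can be tabulated for every possible count. The sketch below is illustrative: it assumes $n = 10$ and a significance level of $0.05$ (the threshold used in the comparison above), and the names `upper_tail` and `critical` are made up for this example.

```r
n     <- 10    # number of untied observations
alpha <- 0.05  # assumed significance level

k <- 0:n
# P(S >= k) for S ~ Binomial(n, 0.5); with lower.tail = FALSE, pbinom gives P(S > k - 1)
upper_tail <- pbinom(k - 1, n, 0.5, lower.tail = FALSE)

print(data.frame(k = k, p_value = upper_tail))

# Smallest count whose one-sided p-value does not exceed alpha
critical <- min(k[upper_tail <= alpha])
print(critical)
```

With these inputs the tail probability at $k = 8$ is $\frac{56}{2^{10}}$, just above $0.05$, and at $k = 9$ it is $\frac{11}{2^{10}}$, so `critical` comes out as $9$.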

$\endgroup$
