$\begingroup$

I am advising on a small medical study (two groups; treatment is a dummy variable), i.e. a 2x2 contingency table. I am comparing the results of Pearson's $\chi^2$ test and a non-parametric competitor, McNemar's $\chi^2$ test.

Edit

The answer brings up an additional question:

They have matched each case in group 1 with 4 controls in group 2 (matched on all the variables they deem important, such as sex, age, hemisf) except treatment. Setting aside the question of whether the matching is well done (i.e. whether all important variables have indeed been identified), doesn't the fact that each case corresponds to four controls (i.e. not one) artificially inflate the significance of the results? (They still have to send me part of the data, which is why the table below does not reflect this 4:1 ratio.)

I obtain the following (very different) results (n = 116; R's CrossTable() function):

     [,1] [,2]
[1,]   39    9
[2,]   49   19

Statistics for All Table Factors

Pearson's Chi-squared test with Yates' continuity correction

Chi^2 = 0.844691 d.f. = 1 p = 0.3580586

McNemar's Chi-squared test with continuity correction

Chi^2 = 26.22414 d.f. = 1 p = 3.039988e-07

Fisher's Exact Test for Count Data

Sample estimate odds ratio: 1.672924

Alternative hypothesis: true odds ratio is not equal to 1

p = 0.279528, 95% confidence interval: 0.6356085 4.692326

The McNemar is the approximate version, but the exact version gives the same conclusions (strong rejection of the null).
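(For reference, the exact version I ran can be reproduced as an exact binomial test on the discordant counts; a quick sketch, assuming the off-diagonal cells of the table above, 9 and 49, are the discordant pairs:)

```r
# Exact McNemar: under the null, each discordant pair is equally likely
# to fall in cell (1,2) or cell (2,1), so b ~ Binomial(b + c, 0.5).
b <- 9
c <- 49
exact <- binom.test(b, b + c, p = 0.5)
exact$p.value  # far below any usual threshold, like the approximate test
```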

My question is: how can I understand such a large difference between the $\chi^2$ and McNemar tests?

$\endgroup$
  • $\begingroup$ The McNemar test is designed for paired data, isn't it? So you cannot compare its output with that of Pearson or Fisher, which assume independent samples... $\endgroup$
    – chl
    Commented Sep 13, 2010 at 13:40
  • $\begingroup$ OK, can you put your comment as an answer? $\endgroup$
    – user603
    Commented Sep 13, 2010 at 14:08
  • $\begingroup$ Don't know if your update calls for a new answer; anyway, I'll continue to comment. For a given number of cases, power for detecting an effect can be increased by selecting more controls per case (which also comes into play in CI width). Power increases as one moves from 1:1 to 1:2 matching, but the benefits are smaller above that ratio. However, it is still useful to have more controls in case of exclusions during sensitivity analysis or the like. If you're interested in case-control design, look at this Lancet series, j.mp/aZWRrg $\endgroup$
    – chl
    Commented Sep 13, 2010 at 18:29
  • $\begingroup$ This is a good example of not properly defining your hypothesis test, i.e. what question you want to answer: just computing the chi-square statistic and whether or not the null was rejected, without actually saying what the null is. Computer packages are incredibly good at making this easy. $\endgroup$ Commented Jul 2, 2011 at 12:28

1 Answer

$\begingroup$

Because the null hypothesis tested by the McNemar test is not the same as the one tested by the $\chi^2$ test. The McNemar test asks whether the probability of landing in cell (1,2) equals the probability of landing in cell (2,1) (first number the row, second the column). If you switch the columns, you therefore get a completely different outcome. The $\chi^2$ test asks whether the cell frequencies can be reconstructed from the marginal frequencies, which is what it means for the two categorical variables to be independent.
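The McNemar statistic can be computed by hand from the two discordant cells alone, which makes it obvious why only their ordering matters (a quick sketch using your table's off-diagonal counts):

```r
# Continuity-corrected McNemar statistic: X^2 = (|b - c| - 1)^2 / (b + c),
# where b and c are the off-diagonal cells; the diagonal never enters.
b <- 9   # cell (1,2)
c <- 49  # cell (2,1)
(abs(b - c) - 1)^2 / (b + c)  # 26.22414, matching mcnemar.test() on x
```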

To illustrate in R:

> x <- matrix(c(39,49,9,19),ncol=2)

> y <-x[,2:1]

> x
     [,1] [,2]
[1,]   39    9
[2,]   49   19

> y
     [,1] [,2]
[1,]    9   39
[2,]   19   49

> mcnemar.test(x)

        McNemar's Chi-squared test with continuity correction

data:  x 
McNemar's chi-squared = 26.2241, df = 1, p-value = 3.04e-07


> mcnemar.test(y)

        McNemar's Chi-squared test with continuity correction

data:  y 
McNemar's chi-squared = 6.2241, df = 1, p-value = 0.01260


> chisq.test(x)

        Pearson's Chi-squared test with Yates' continuity correction

data:  x 
X-squared = 0.8447, df = 1, p-value = 0.3581


> chisq.test(y)

        Pearson's Chi-squared test with Yates' continuity correction

data:  y 
X-squared = 0.8447, df = 1, p-value = 0.3581

It is obvious that the result of the McNemar test is completely different depending on which column comes first, whereas the $\chi^2$ test gives exactly the same outcome. Now why is the $\chi^2$ test non-significant? Take a look at the expected values:

> m1 <- margin.table(x,1)/116

> m2 <- margin.table(x,2)/116

> outer(m1,m2)*116
         [,1]     [,2]
[1,] 36.41379 11.58621
[2,] 51.58621 16.41379

Pretty close to the table you have.
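Because the observed table sits so close to the expected one, the Yates-corrected statistic is tiny; it can be recomputed by hand from those expected counts (a sketch):

```r
x <- matrix(c(39, 49, 9, 19), ncol = 2)
# Expected counts under independence: outer product of the margins / n
E <- outer(margin.table(x, 1), margin.table(x, 2)) / sum(x)
# Yates-corrected Pearson statistic, as reported by chisq.test(x)
sum((abs(x - E) - 0.5)^2 / E)  # 0.844691
```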

So the two tests are not disagreeing at all. The $\chi^2$ test rightfully concludes that the two variables are independent, i.e. the counts in one variable are not influenced by the other and vice versa, and the McNemar test rightfully concludes that the probability of being in the first row-second column cell (≈0.08) is not the same as that of the second row-first column cell (≈0.42).
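Those two proportions come straight from the off-diagonal cells of the table:

```r
x <- matrix(c(39, 49, 9, 19), ncol = 2)
x[1, 2] / sum(x)  # 9/116  ≈ 0.078
x[2, 1] / sum(x)  # 49/116 ≈ 0.422
```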

$\endgroup$
