McNemar's Test



This non-parametric test uses matched pairs of labels (A, B). It determines whether the proportion of A and B labels is the same for both members of the pairs. It is a very good test when only nominal data are available, e.g., correct versus incorrect identification of stimuli. Essentially, McNemar's Test is a Sign-Test in disguise. All (A, A) and (B, B) pairs are ignored, and the test asks whether (A, B) is as likely as (B, A) by labelling one as + and the other as - and performing a Sign-Test on the numbers of + and - labels.
McNemar's Test is generally used when the data consist of paired observations of labels. An example is an identification experiment in which each subject has to identify two different "versions" of each stimulus. The labels are correct and error. What is tested is whether a correct identification of the first version and an error in the identification of the second version is more or less likely than the reverse. These data cannot be analyzed with a test on binomial proportions because the two samples are not independent.

Null hypothesis:
AB pairs are as likely as BA pairs.

Assumptions:
Only that the pairs are matched.


Procedure:
Ignore the pairs with identical labels and count the AB pairs (n+) and the BA pairs (n-).
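The counting step can be sketched in Python as follows. This is only an illustration: the function name `count_discordant` and the sample data are hypothetical, and the labels are assumed to be the strings "A" and "B".

```python
from collections import Counter


def count_discordant(pairs):
    """Return (n_plus, n_minus) for a sequence of matched label pairs.

    n_plus counts the (A, B) pairs and n_minus counts the (B, A) pairs.
    Pairs with identical labels, (A, A) and (B, B), are ignored.
    """
    counts = Counter(pairs)
    return counts[("A", "B")], counts[("B", "A")]


# Hypothetical data: one tuple per matched pair (first member, second member).
pairs = [("A", "A"), ("A", "B"), ("B", "A"), ("A", "B"), ("B", "B"), ("A", "B")]
n_plus, n_minus = count_discordant(pairs)
print(n_plus, n_minus)
```

Only the two discordant counts are carried forward; the concordant pairs play no further role in the test.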

Level of significance:
n+ and n- are binomially distributed with p = q = 1/2 and N = (n+) + (n-).
If k is the smaller of (n+) and (n-), then:
p <= 2 * Sum (i=0 to k) {N!/(i!*(N-i)!)} / 2^N
(where k! = k*(k-1)*(k-2)*...*1 is the factorial of k, and 0! = 1)
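A minimal Python sketch of the exact two-sided probability follows. It sums the binomial coefficients for i = 0 to k and divides by 2^N, the number of equally likely sequences of + and - labels when p = q = 1/2. The function name `mcnemar_exact_p` is hypothetical, and capping the result at 1 (needed when n+ and n- are nearly equal) is an assumption of this sketch, not something stated above.

```python
from math import comb


def mcnemar_exact_p(n_plus, n_minus):
    """Two-sided exact binomial probability for the discordant pair counts."""
    n = n_plus + n_minus
    if n == 0:
        # No discordant pairs: there is no evidence against the null hypothesis.
        return 1.0
    k = min(n_plus, n_minus)
    # Probability of k or fewer of the rarer label under p = q = 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double for a two-sided test; cap at 1 (assumption of this sketch).
    return min(1.0, 2 * tail)


print(mcnemar_exact_p(10, 2))
```

`math.comb` requires Python 3.8 or later. It works with exact integers, so the sum does not lose precision for the sample sizes discussed here.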

If N = (n+) + (n-) > 25, then Z = (|(n+) - (n-)| - 1)/sqrt(N) approximately follows a Standard Normal distribution. In our example, the exact probabilities are calculated up to N = 100.
For N > 30, the Student t-test can be used.
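The Standard Normal approximation with Z = (|(n+) - (n-)| - 1)/sqrt(N) can be sketched as below. The function name `mcnemar_normal_p` is hypothetical. Two choices are assumptions of this sketch: the numerator is clamped at zero so that Z is never negative when n+ and n- are equal, and the two-sided probability is obtained from the complementary error function.

```python
from math import erfc, sqrt


def mcnemar_normal_p(n_plus, n_minus):
    """Return (z, p) for the continuity-corrected normal approximation."""
    n = n_plus + n_minus
    if n == 0:
        # No discordant pairs: nothing to test.
        return 0.0, 1.0
    # Continuity-corrected statistic, clamped at 0 (assumption of this sketch).
    z = max(0.0, abs(n_plus - n_minus) - 1) / sqrt(n)
    # Two-sided tail probability: 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)).
    p = erfc(z / sqrt(2))
    return z, p


z, p = mcnemar_normal_p(30, 10)
print(z, p)
```

For small N the exact binomial sum is preferable; this approximation is intended for N above the threshold given in the text.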

For McNemar's Test, the same remarks hold as for the Sign-Test. In many cases, it is the only test that can be applied without making many unlikely assumptions. This is especially so because, e.g., error rates in identification experiments tend to be small. As a result, there often are too few relevant observations to use parametric tests.
