Biostat Methods STAT 5820/6910
Handout #4: Chi-square, Fisher’s, and McNemar’s Tests
Example 1: 152 patients were randomly assigned to 4 dose groups in a clinical study. During the
course of the study, some patients dropped out. Is there a difference in dropout rates among dose
groups?
             Dropout
Dose       Yes     No    Total
 10          5     35       40
 20          6     29       35
 40         10     28       38
 80         12     27       39
Total       33    119      152
data a1;
input dose dropout $ count @@; cards;
10 yes 5 10 no 35
20 yes 6 20 no 29
40 yes 10 40 no 28
80 yes 12 80 no 27
;
proc freq data=a1;
tables dose*dropout / chisq nopercent norow;
weight count;
title1 'Testing equal rates in all doses';
run;
/* No evidence of a general association (chi-square P = 0.1884), but
evidence of a dose-response trend (Mantel-Haenszel P = 0.0400).
Could investigate linear trend using the Cochran-Armitage test */
Testing equal rates in all doses
The FREQ Procedure

Table of dose by dropout
Frequency
Col Pct
                dropout
dose          no      yes    Total
10            35        5       40
           29.41    15.15
20            29        6       35
           24.37    18.18
40            28       10       38
           23.53    30.30
80            27       12       39
           22.69    36.36
Total        119       33      152
Statistics for Table of dose by dropout
Statistic DF Value Prob
Chi-Square 3 4.7831 0.1884
Likelihood Ratio Chi-Square 3 4.9008 0.1792
Mantel-Haenszel Chi-Square 1 4.2171 0.0400
Phi Coefficient 0.1774
Contingency Coefficient 0.1747
Cramer's V 0.1774
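As a check on the Pearson chi-square above, the statistic can be reproduced by hand from the observed counts. The symbols O_ij (observed count), E_ij (expected count), and the row and column totals n_i. and n_.j are notation introduced here for illustration; they do not appear in the SAS output, and the intermediate values are rounded.

```latex
\[
E_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n},
\qquad
\chi^2 = \sum_{i=1}^{4} \sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
\]
% Expected number of dropouts ("yes") in each dose group:
\[
E_{10,\text{yes}} = \tfrac{40 \cdot 33}{152} \approx 8.68, \quad
E_{20,\text{yes}} = \tfrac{35 \cdot 33}{152} \approx 7.60, \quad
E_{40,\text{yes}} = \tfrac{38 \cdot 33}{152} = 8.25, \quad
E_{80,\text{yes}} = \tfrac{39 \cdot 33}{152} \approx 8.47
\]
% Cell contributions, listed as (yes, no) for doses 10, 20, 40, 80:
\[
\chi^2 \approx (1.563 + 0.433) + (0.336 + 0.093) + (0.371 + 0.103) + (1.474 + 0.409)
\approx 4.78,
\qquad
\text{df} = (4-1)(2-1) = 3
\]
```

The largest contributions come from the lowest and highest dose groups, which is consistent with a trend across doses rather than a scattered pattern.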
/* Does an exact test lead to a different inference? */
proc freq data=a1;
tables dose*dropout / fisher;
weight count;
title1 'Fishers exact test';
run;
Fishers exact test
Fisher's Exact Test
Table Probability (P) 0.0007
Pr <= P 0.1888
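The comment in Example 1 suggests following up with a Cochran-Armitage test for linear trend. A minimal sketch of how that might be requested, assuming the data set a1 from Example 1 is still in memory; the title text is made up for illustration, and the output is not shown because it was not part of the original handout.

```sas
/* Sketch: Cochran-Armitage trend test for the 4 x 2 dose-by-dropout table. */
/* The TREND option requires a table with exactly two rows or two columns.  */
/* Because dose is numeric, the default table scores are the dose values    */
/* themselves (10, 20, 40, 80).                                             */
proc freq data=a1;
tables dose*dropout / trend nopercent norow nocol;
weight count;
title1 'Cochran-Armitage trend test';
run;
```

With unequally spaced doses, the choice of scores matters; the SCORES= option on the TABLES statement could be used to try alternatives such as rank scores.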
Example 2: Bilirubin data
86 patients were treated with an experimental drug for 3 months; pre- and post-study bilirubin
levels were recorded. Many patients exhibited abnormally high bilirubin levels.
Two representations of the same data:

(a) Counts by time point
         Normal    High
Pre        74       12
Post       66       20

(b) Paired counts
                       Posttest Level
                       Normal    High
Pretest    Normal        60       14
Level      High           6        6
Is there evidence of a change in pre- to post-treatment rates of abnormalities?
Chi-square test on the Pre/Post table (treating the two rows as independent samples):
χ2 = 2.46, P-value = 0.1170
But – are treatment groups independent?
Also – what is the true sample size?
/* What if we assumed independence of treatment groups? */
data temp; input trt $ bilirubin $ count; cards;
pre normal 74
pre high 12
post normal 66
post high 20
;
proc freq data=temp;
tables trt*bilirubin / chisq;
weight count;
title1 'Assume Independence';
run;
Assume Independence

Table of trt by bilirubin
Frequency
Row Pct
Col Pct
              bilirubin
trt         high   normal    Total
post          20       66       86
           23.26    76.74
           62.50    47.14
pre           12       74       86
           13.95    86.05
           37.50    52.86
Total         32      140      172
Statistics for Table of trt by bilirubin
Statistic DF Value Prob
Chi-Square 1 2.4571 0.1170
Likelihood Ratio Chi-Square 1 2.4788 0.1154
Continuity Adj. Chi-Square 1 1.8813 0.1702
Mantel-Haenszel Chi-Square 1 2.4429 0.1181
Phi Coefficient 0.1195
Contingency Coefficient 0.1187
Cramer's V 0.1195
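The chi-square value above can be verified with the usual shortcut formula for a 2 x 2 table. The cell labels a, b, c, d (read across the rows of the table as printed: post/high, post/normal, pre/high, pre/normal) are notation introduced here for illustration.

```latex
\[
\chi^2 = \frac{n\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}
       = \frac{172\,(20 \cdot 74 - 66 \cdot 12)^2}{86 \cdot 86 \cdot 32 \cdot 140}
       = \frac{172 \cdot 688^2}{33{,}134{,}080}
       \approx 2.457
\]
```

Note that n = 172 in this calculation, which counts each of the 86 patients twice.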
/* An alternative representation of the data */
data a2w; input pretrt $ posttrt $ count; cards;
normal normal 60
normal high 14
high normal 6
high high 6
;
/* Is there evidence of a change in pre- to post-treatment
rates of abnormalities? */
proc freq data=a2w;
tables pretrt*posttrt / agree norow nocol;
weight count;
title1 'McNemars test';
run;
McNemars test
The FREQ Procedure

Table of pretrt by posttrt
Frequency
Percent
               posttrt
pretrt       high   normal    Total
high            6        6       12
             6.98     6.98    13.95
normal         14       60       74
            16.28    69.77    86.05
Total          20       66       86
            23.26    76.74   100.00

Statistics for Table of pretrt by posttrt

McNemar's Test
Statistic (S)     3.2000
DF                     1
Pr > S            0.0736

Simple Kappa Coefficient
Kappa                     0.2430
ASE                       0.1211
95% Lower Conf Limit      0.0057
95% Upper Conf Limit      0.4802

Sample Size = 86
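The McNemar statistic reported above depends only on the two discordant cells of the paired table. A short worked version, where b and c are labels introduced here for the normal-to-high count (14) and the high-to-normal count (6):

```latex
\[
S = \frac{(b - c)^2}{b + c}
  = \frac{(14 - 6)^2}{14 + 6}
  = \frac{64}{20}
  = 3.2,
\qquad
\text{df} = 1
\]
```

The 66 patients whose status did not change (60 normal both times, 6 high both times) do not enter the statistic, and the sample size is 86 patients rather than 172 observations.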
/* Get equivalent results using patient-level data:
pre = 1 iff abnormally high pre-test
post = 1 iff abnormally high post-test
*/
data a2; input pre post @@; cards;
0 0 0 0 0 0 0 0 0 0 0 1
1 1 0 0 0 0 0 0 0 0 1 0
0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 0 0 0 0 0
1 0 0 0 0 0 1 1 0 1 0 1
0 0 0 0 1 0 0 0 0 0 0 0
0 0 0 1 0 1 0 0 0 0 0 1
0 0 1 0 0 0 0 0 1 1 0 0
0 0 0 1 1 0 0 0 0 1 0 0
1 1 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 1 1 1 0 0
0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0
0 0 0 0
;
proc freq data=a2;
tables pre*post / agree norow nocol;
title1 'McNemars test, again';
run;
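Because only 20 patients changed status, the chi-square approximation behind McNemar's test rests on a fairly small number of discordant pairs. One possible follow-up, sketched here under the assumption that the patient-level data set a2 above is in use, is to request an exact version of the test with the EXACT statement. The title text is made up for illustration, and no output is shown because it was not part of the original handout.

```sas
/* Sketch: exact McNemar test on the patient-level data.               */
/* EXACT MCNEM asks PROC FREQ for an exact p-value based on the        */
/* discordant pairs, in addition to the asymptotic chi-square p-value. */
proc freq data=a2;
tables pre*post / agree norow nocol;
exact mcnem;
title1 'Exact McNemar test';
run;
```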
Example 3: Two tests are being considered (call them Method A and Method B) to check blood
samples and diagnose whether or not a patient has a particular condition. 100 patients provide
blood samples, and each sample is run through both methods. We are interested in whether there
is a difference in the two methods' diagnostic abilities.
/* Compare McNemar and Fishers */
/* Define data */
data a1; input methodA $ methodB $ count; cards;
Y Y 18
N Y 27
Y N 35
N N 20
;
/* Check McNemar's and Fishers */
proc freq data=a1;
tables methodA*methodB /
agree chisq fisher nopercent nocol norow;
weight count;
title1 'McNemar and Fishers Tests';
run;
McNemar and Fishers Tests

Fisher's Exact Test
Cell (1,1) Frequency (F)         20
Left-sided Pr <= F           0.0154
Right-sided Pr >= F          0.9949

Table Probability (P)        0.0103
Two-sided Pr <= P            0.0265

McNemar's Test
Statistic (S)     1.0323
DF                     1
Pr > S            0.3096
/* Define re-configured data based on column and row sums */
data a2; input method $ diagnosis $ count; cards;
A Y 53
A N 47
B Y 45
B N 55
;
/* Check McNemar's and Fishers */
proc freq data=a2;
tables method*diagnosis /
agree chisq fisher nopercent nocol norow;
weight count;
title1 'McNemar and Fishers Tests';
title2 'Reconfigured Data';
run;
McNemar and Fishers Tests
Reconfigured Data

Fisher's Exact Test
Cell (1,1) Frequency (F)         47
Left-sided Pr <= F           0.1611
Right-sided Pr >= F          0.8986

Table Probability (P)        0.0596
Two-sided Pr <= P            0.3221

McNemar's Test
Statistic (S)     0.0370
DF                     1
Pr > S            0.8474
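To see where the two McNemar statistics in Example 3 come from, the arithmetic can be written out. In both cases SAS uses the off-diagonal cells of whatever table it is given; the subscripts below are labels introduced here for illustration.

```latex
% Paired table (methodA by methodB): off-diagonal cells are the discordant pairs.
\[
S_{\text{paired}} = \frac{(35 - 27)^2}{35 + 27} = \frac{64}{62} \approx 1.0323
\]
% Reconfigured table (method by diagnosis): off-diagonal cells are (A, Y) = 53 and (B, N) = 55.
\[
S_{\text{reconfigured}} = \frac{(53 - 55)^2}{53 + 55} = \frac{4}{108} \approx 0.0370
\]
```

In the reconfigured table each patient is counted twice (n = 200) and the off-diagonal cells no longer count discordant pairs, so the second statistic should not be read as a paired comparison of the two methods.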