Introductory Biostatistics


coffee, tobacco, and alcohol. Study subjects consist of 188 women in the San
Francisco Bay area with epithelial ovarian cancers diagnosed in 1983–1985,
and 539 control women. Of the 539 controls, 280 were hospitalized women
without ovarian cancer and 259 were chosen from the general population by
random telephone dialing. Data for coffee consumption are summarized in
Table 6.16 (the numbers in parentheses are expected frequencies). In this
example, we want to compare the three proportions of coffee drinkers, but we still
can apply the same chi-square test:


\[
X^2 = \frac{(177 - 170.42)^2}{170.42} + \cdots + \frac{(26 - 24.23)^2}{24.23} = 3.83
\]

The result indicates that the difference between the three groups is not significant
at the 5% level (the cutpoint at the 5% level for chi-square with 2 degrees
of freedom is 5.99). In other words, there is not enough evidence to implicate coffee
consumption in this study of epithelial ovarian cancer. It is important to note
that in solving the problem above, a comparison of several proportions, one
may be tempted to compare all possible pairs of proportions and do many chi-
square tests. What is the matter with this approach to doing many chi-square
tests, one for each pair of samples? As the number of groups increases, so does
the number of tests to perform; for example, we would have to do 45 tests if we
have 10 groups to compare. Obviously, the amount of work is greater, but that
is not the critical problem, especially with technological aids such as the use of
calculators and computers. So what is the problem? The answer is that per-
forming many tests increases the probability that one or more of the compar-
isons will result in a type I error (i.e., a significant test result when the null
hypothesis is true). This statement should make sense intuitively. For example,
suppose that the null hypothesis is true and we perform 100 tests, each with a
0.05 probability of resulting in a type I error; then we would expect about 5 of
these 100 tests to be statistically significant as the results of type I errors. Of course, we usually do
not need to do that many tests; however, every time we do more than one, the
probability that at least one will result in a type I error exceeds 0.05, indicating
a falsely significant difference! What is needed is a method of comparing these
proportions simultaneously, in one step. The chi-square test for a general two-
way table, in this case a 2 × 3 table, achieves just that.
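
As a rough illustration of how quickly the problem grows, the following Python sketch counts the pairwise tests needed for a given number of groups and estimates the chance of at least one type I error. It assumes, for simplicity, that the tests are independent; pairwise tests on the same data are not, so the probabilities are indicative only. The function names are hypothetical and introduced just for this illustration.

```python
from math import comb


def number_of_pairwise_tests(groups: int) -> int:
    """Number of distinct pairs among `groups` samples, one test per pair."""
    return comb(groups, 2)


def expected_false_positives(tests: int, alpha: float = 0.05) -> float:
    """Expected number of type I errors when every null hypothesis is true."""
    return tests * alpha


def prob_at_least_one_type_i_error(tests: int, alpha: float = 0.05) -> float:
    """Chance of one or more type I errors.

    Assumption (not from the text): the tests are independent, so the
    chance that none is falsely significant is (1 - alpha) ** tests.
    """
    return 1.0 - (1.0 - alpha) ** tests


if __name__ == "__main__":
    for groups in (3, 10):
        tests = number_of_pairwise_tests(groups)
        print(groups, "groups:", tests, "tests,",
              "P(at least one type I error) =",
              round(prob_at_least_one_type_i_error(tests), 3))
    print("100 tests: expected false positives =",
          expected_false_positives(100))
```

Under the independence assumption, the three pairwise tests for three groups already carry a probability of about 0.14 of at least one type I error, and the 45 tests for 10 groups raise it to about 0.90.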


TABLE 6.16

Coffee                    Hospital       Population
Drinkers   Cases          Controls       Controls       Total

Yes        177 (170.42)   249 (253.81)   233 (234.77)   659
No          11 (17.58)     31 (26.19)     26 (24.23)     68

Total      188            280            259            727
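
The expected frequencies and the chi-square statistic can be checked with a short Python sketch. It uses only the counts in Table 6.16; the function name `chi_square_statistic` is hypothetical. The closed-form tail probability exp(-x/2) used for the p-value and the cutpoint holds only for 2 degrees of freedom, which is the case here.

```python
from math import exp, log

# Observed counts from Table 6.16.
# Columns: cases, hospital controls, population controls.
observed = [
    [177, 249, 233],  # coffee drinkers
    [11, 31, 26],     # non-drinkers
]


def chi_square_statistic(table):
    """Return (statistic, degrees of freedom, expected frequencies)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected frequency = (row total * column total) / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


statistic, df, expected = chi_square_statistic(observed)

# Valid for df = 2 only: the upper-tail probability is exp(-x / 2),
# so the 5% cutpoint is -2 * ln(0.05).
p_value = exp(-statistic / 2)
critical_value = -2 * log(0.05)

if __name__ == "__main__":
    print("expected:", [[round(e, 2) for e in row] for row in expected])
    print("X^2 =", round(statistic, 2), "with", df, "degrees of freedom")
    print("5% cutpoint =", round(critical_value, 2))
    print("p-value =", round(p_value, 3))
```

With unrounded expected frequencies the statistic comes out at about 3.84 rather than the 3.83 obtained from the rounded values in the table. Either way it falls well below the cutpoint of 5.99, so the conclusion is unchanged.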


228 COMPARISON OF POPULATION PROPORTIONS
