Introductory Biostatistics

$$e_{ij} = n\hat{\pi}_{ij} = \frac{x_{i+}\,x_{+j}}{n} = \frac{(\text{row total})(\text{column total})}{\text{sample size}}$$
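As a minimal sketch of the formula above, the following Python function computes the estimated expected frequency of each cell as (row total)(column total)/(sample size). The function name `expected_frequencies` and the 2x2 counts are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the text.

```python
def expected_frequencies(table):
    """Return e_ij = (row total_i)(column total_j) / n for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical counts, used only as an illustration.
observed = [[10, 20], [30, 40]]
expected = expected_frequencies(observed)

for row in expected:
    print(row)
```

Summing the rows and columns of `expected` gives back the row and column totals of `observed`, which is one quick way to check the computation.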

The $\{e_{ij}\}$ are called estimated expected frequencies, the frequencies we expect to
have under the null hypothesis of independence. They have the same marginal
totals as those of the data observed. In this problem we do not compare
proportions, because we have only one sample; what we really want to see is whether the
two factors or variables $X_1$ and $X_2$ are related. The task we perform is a test for
independence. We achieve that by comparing the observed frequencies, the $x$'s,
with those expected under the null hypothesis of independence, the expected
frequencies, the $e$'s. This comparison is done through Pearson's chi-square
statistic:


$$X^2 = \sum_{i,j} \frac{(x_{ij} - e_{ij})^2}{e_{ij}}$$

For large samples, $X^2$ has approximately a chi-square distribution under the
null hypothesis of independence, with degrees of freedom

$$\mathrm{df} = (I - 1)(J - 1)$$

where $I$ and $J$ are the numbers of rows and columns of the table. Greater values of $X^2$ lead to a rejection of $H_0$.
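The statistic and its degrees of freedom can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pearson_chi_square` and the counts are hypothetical, and the 5% significance level (critical value 3.841 for 1 degree of freedom, from standard chi-square tables) is an assumed choice not specified in the text.

```python
def pearson_chi_square(table):
    """Return (X^2, df) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    x2 = 0.0
    for i, row in enumerate(table):
        for j, x in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency e_ij
            x2 += (x - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, df


# Hypothetical counts, used only as an illustration.
observed = [[10, 20], [30, 40]]
x2, df = pearson_chi_square(observed)

# Assumed 5% level; 3.841 is the tabulated critical value for df = 1.
CRITICAL_05_DF1 = 3.841
reject_h0 = df == 1 and x2 > CRITICAL_05_DF1

print(f"X^2 = {x2:.4f}, df = {df}, reject H0: {reject_h0}")
```

For tables larger than 2x2 the critical value changes with the degrees of freedom, so the comparison against 3.841 applies only to the df = 1 case shown here.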


Example 6.10 In 1979 the U.S. Veterans Administration conducted a health
survey of 11,230 veterans. The advantages of this survey are that it includes a
large random sample with a high interview response rate, and it was done
before the public controversy surrounding the issue of the health effects of
possible exposure to Agent Orange. The data shown in Table 6.13 relate
Vietnam service to having sleep problems among the 1787 veterans who entered the
military service between 1965 and 1975. We have


TABLE 6.13

                      Service in Vietnam
Sleep Problems        Yes       No     Total

Yes                   173      160       333
No                    599      851      1450

Total                 772     1011      1783
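As a rough check on the arithmetic for Table 6.13, the test for independence can be sketched in Python using the observed counts from the table. The variable names are hypothetical, and the p-value line is an assumption that holds only for 1 degree of freedom, where a chi-square variable is the square of a standard normal variable.

```python
import math

# Observed counts from Table 6.13
# (rows: sleep problems yes/no; columns: service in Vietnam yes/no).
observed = [[173, 160], [599, 851]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Estimated expected frequencies under independence.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson's chi-square statistic.
x2 = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(row_totals))
    for j in range(len(col_totals))
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Upper-tail probability; this shortcut is valid for df = 1 only.
p_value = math.erfc(math.sqrt(x2 / 2))

print(f"X^2 = {x2:.2f}, df = {df}, p-value = {p_value:.5f}")
```

With these counts the statistic comes out near 12.5 on 1 degree of freedom, well above the tabulated 5% critical value of 3.841, so this sketch would point toward rejecting independence between Vietnam service and sleep problems.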


224 COMPARISON OF POPULATION PROPORTIONS
