Introduction to Probability and Statistics for Engineers and Scientists


11.4 Tests of Independence in Contingency Tables


denote the number of sampled members having Y-characteristic $j$, the estimator for $q_j$ is

\[ \hat{q}_j = \frac{M_j}{n}, \qquad j = 1, \ldots, s \]

At first glance, it may seem that we have had to use the data to estimate $r + s$ parameters.
However, since the $p_i$'s and $q_j$'s have to sum to 1, that is,

\[ \sum_{i=1}^{r} p_i = \sum_{j=1}^{s} q_j = 1, \]

we need estimate only $r - 1$ of the $p$'s and $s - 1$ of the $q$'s. (For instance, if $r$ were
equal to 2, then an estimate of $p_1$ would automatically provide an estimate of $p_2$ since
$p_2 = 1 - p_1$.) Hence, we actually need estimate $r - 1 + s - 1 = r + s - 2$ parameters,
and since each population member has $k = rs$ different possible values, it follows that the
resulting test statistic will, for large $n$, have approximately a chi-square distribution with
$rs - 1 - (r + s - 2) = (r - 1)(s - 1)$ degrees of freedom.
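The parameter count above can be checked numerically. The following Python sketch is illustrative only: the function names and the small table of counts are hypothetical, and the row estimator (row total divided by n) is assumed by analogy with the column estimator M_j / n.

```python
def marginal_estimates(table):
    """Estimate the row and column probabilities of an r x s table of counts.

    table[i][j] plays the role of N_ij. The column estimate is M_j / n as in
    the text; the row estimate (row total / n) is assumed by analogy.
    """
    n = sum(sum(row) for row in table)
    p_hat = [sum(row) / n for row in table]
    q_hat = [sum(col) / n for col in zip(*table)]
    return p_hat, q_hat, n


def degrees_of_freedom(r, s):
    """Number of cells minus one, minus the freely estimated parameters."""
    free_parameters = (r - 1) + (s - 1)
    return r * s - 1 - free_parameters


# Hypothetical 2 x 3 table of counts, used only to exercise the functions.
counts = [[10, 20, 30], [20, 10, 10]]
p_hat, q_hat, n = marginal_estimates(counts)

# Each set of estimates sums to 1, so the last estimate in each set is fixed
# by the others and does not count as a separately estimated parameter.
assert abs(sum(p_hat) - 1) < 1e-12 and abs(sum(q_hat) - 1) < 1e-12

r, s = len(counts), len(counts[0])
assert degrees_of_freedom(r, s) == (r - 1) * (s - 1)
print(n, p_hat, q_hat, degrees_of_freedom(r, s))
```

Because the estimates are built from the marginal totals, the sum-to-1 constraints hold automatically, which is why only r + s - 2 of them are free.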
Finally, since

\[ E[N_{ij}] = nP_{ij} = np_iq_j \quad \text{when } H_0 \text{ is true} \]

it follows that the test statistic is given by


\[ T = \sum_{j=1}^{s} \sum_{i=1}^{r} \frac{(N_{ij} - n\hat{p}_i\hat{q}_j)^2}{n\hat{p}_i\hat{q}_j} = \sum_{j=1}^{s} \sum_{i=1}^{r} \frac{N_{ij}^2}{n\hat{p}_i\hat{q}_j} - n \]

and the approximate significance level $\alpha$ test is to

\[
\begin{array}{ll}
\text{reject } H_0 & \text{if } T \ge \chi^2_{\alpha,(r-1)(s-1)} \\
\text{not reject } H_0 & \text{otherwise}
\end{array}
\]
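The two forms of the test statistic and the decision rule can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names and the sample counts are hypothetical, every row and column total is assumed to be positive, and the critical value is assumed to be looked up by the caller in a chi-square table. The two forms agree because the estimated expected counts sum to n.

```python
def independence_statistic(table):
    """Return the test statistic T computed in both algebraic forms.

    table[i][j] plays the role of N_ij. Assumes every row and column
    total is positive, so no estimated expected count is zero.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    t_direct = 0.0           # sum of (N_ij - e_ij)^2 / e_ij
    t_shortcut = -float(n)   # sum of N_ij^2 / e_ij, minus n
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            # e_ij = n * p_hat_i * q_hat_j
            expected = n * (row_totals[i] / n) * (col_totals[j] / n)
            t_direct += (n_ij - expected) ** 2 / expected
            t_shortcut += n_ij ** 2 / expected
    return t_direct, t_shortcut


def reject_h0(t, critical_value):
    """Reject independence when T is at least the chi-square critical value."""
    return t >= critical_value


# Hypothetical 2 x 3 table of counts, used only to exercise the functions.
counts = [[10, 20, 30], [20, 10, 10]]
t_direct, t_shortcut = independence_statistic(counts)
assert abs(t_direct - t_shortcut) < 1e-9  # the two forms agree

# 5.991 is the tabulated chi-square value for alpha = 0.05 with 2 degrees
# of freedom; other levels or table sizes need a different lookup.
print(t_direct, reject_h0(t_direct, 5.991))
```

The shortcut form avoids the subtraction inside the square, which can be convenient for hand calculation; in code the direct form is just as easy.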

EXAMPLE 11.4a A sample of 300 people was randomly chosen, and the sampled individuals
were classified as to their gender and political affiliation: Democrat, Republican, or
Independent. The following table, called a contingency table, displays the resulting data.


                           j
i        Democrat   Republican   Independent   Total
Women       68          56           32         156
Men         52          72           20         144
Total      120         128           52         300
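As a rough illustration, the counts in this table can be run through the independence test with a few lines of Python. The variable names and the 5 percent level are assumptions made for this sketch. The closed-form critical value used here, -2 ln(alpha), is valid only because this table has 2 degrees of freedom.

```python
import math

# Observed counts N_ij from the contingency table of the example.
observed = {"Women": [68, 56, 32], "Men": [52, 72, 20]}
parties = ["Democrat", "Republican", "Independent"]

n = sum(sum(counts) for counts in observed.values())
row_totals = {group: sum(counts) for group, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]

# Estimated expected counts under independence: n * p_hat_i * q_hat_j,
# which simplifies to (row total) * (column total) / n.
expected = {
    group: [row_totals[group] * col / n for col in col_totals]
    for group in observed
}

T = sum(
    (obs - exp) ** 2 / exp
    for group in observed
    for obs, exp in zip(observed[group], expected[group])
)

df = (len(observed) - 1) * (len(parties) - 1)

alpha = 0.05  # assumed significance level, chosen only for illustration
# For 2 degrees of freedom, P(chi-square > x) = exp(-x / 2), so the
# critical value is -2 * ln(alpha). This shortcut does not hold for other df.
assert df == 2
critical_value = -2 * math.log(alpha)

print(f"T = {T:.3f}, df = {df}, critical value = {critical_value:.3f}")
print("reject H0" if T >= critical_value else "do not reject H0")
```

With these counts T comes out near 6.43, which exceeds the critical value of about 5.99, so at the assumed 5 percent level the hypothesis that gender and political affiliation are independent would be rejected.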

Thus, for instance, the contingency table indicates that the sample of size 300 contained
68 women who classified themselves as Democrats, 56 women who classified themselves
