together the totals for the row and column in which the cell is located and dividing by the
total sample size (N). (These totals are known as marginal totals, because they sit at the
margins of the table.) If $E_{ij}$ is the expected frequency for the cell in row i and column j,
$R_i$ and $C_j$ are the corresponding row and column totals, and N is the total number of
observations, we have the following formula:^2

$$E_{ij} = \frac{R_i C_j}{N}$$

For our example

$$E_{11} = \frac{284 \times 66}{825} = 22.72$$

$$E_{12} = \frac{284 \times 759}{825} = 261.28$$

$$E_{21} = \frac{541 \times 66}{825} = 43.28$$

$$E_{22} = \frac{541 \times 759}{825} = 497.72$$

These are the values shown in parentheses in Table 6.4.

Calculation of Chi-Square

Now that we have the observed and expected frequencies in each cell, the calculation of
$\chi^2$ is straightforward. We simply use the same formula that we have been using all along,
although we sum our calculations over all cells in the table.

$$
\begin{aligned}
\chi^2 &= \sum \frac{(O - E)^2}{E} \\
&= \frac{(33 - 22.72)^2}{22.72} + \frac{(251 - 261.28)^2}{261.28} + \frac{(33 - 43.28)^2}{43.28} + \frac{(508 - 497.72)^2}{497.72} \\
&= 7.71
\end{aligned}
$$
146 Chapter 6 Categorical Data and Chi-Square
Table 6.4 Sentencing as a function of the race of the defendant—the victim was white
                          Death Sentence
Defendant’s Race      Yes            No              Total
Nonwhite              33 (22.72)     251 (261.28)    284
White                 33 (43.28)     508 (497.72)    541
Total                 66             759             825
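
The arithmetic behind Table 6.4 can be checked with a few lines of Python. This is only a minimal sketch using the standard library: the variable names (`observed`, `row_totals`, `col_totals`, `expected`, `chi_square`) are illustrative choices and not part of the text, and the calculation applies no continuity correction, which matches the hand computation of chi-square for this table.

```python
# Observed frequencies from Table 6.4.
# Rows: Nonwhite, White defendants; columns: death sentence Yes, No.
observed = [[33, 251],
            [33, 508]]

# Marginal totals: R_i (rows), C_j (columns), and the sample size N.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell: E_ij = R_i * C_j / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (O - E)^2 / E over all cells of the table.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print([[round(e, 2) for e in row] for row in expected])
# [[22.72, 261.28], [43.28, 497.72]]
print(round(chi_square, 2))
# 7.71
```

Working from the raw counts rather than the rounded expected values avoids accumulating rounding error, although here both routes agree to two decimal places.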
(^2) This formula for the expected values is derived directly from the formula for the probability of the joint
occurrence of two independent events given in Chapter 5 on probability. For this reason the expected values that
result are those that would be expected if $H_0$ were true and the variables were independent. A large discrepancy
in the fit between expected and observed would reflect a large departure from independence, which is what we
want to test.
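
The derivation this footnote refers to can be sketched as follows. The notation $P(\text{row } i)$ and $P(\text{column } j)$ is introduced here only for illustration and does not appear in the text; the marginal proportions $R_i/N$ and $C_j/N$ are assumed to serve as estimates of those probabilities. If the two variables are independent, the probability of falling in a given cell is the product of the two marginal probabilities, so the expected count in that cell is

$$
E_{ij} = N \cdot P(\text{row } i) \cdot P(\text{column } j) = N \cdot \frac{R_i}{N} \cdot \frac{C_j}{N} = \frac{R_i C_j}{N}
$$

For the upper-left cell of Table 6.4 this gives $825 \times (284/825) \times (66/825) = 22.72$.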