634 Chapter 17 Log-Linear Analysis
^3 To emphasize this point, assume that I wanted to look at Gender and Age in studying Empathy, and suppose that
I had 150 women and 10 men in my sample. Then Gender will almost certainly have to be in the model even if it
has absolutely nothing to do with empathy. How else would we explain the radical difference in the number of
males and females in the study?
Table 17.2 Observed and expected frequencies for the first conditional
equiprobability model (expected frequencies in parentheses)

                        Verdict
Fault        Guilty     Not Guilty     Total
Low          153        24             177
             (129)      (50)
High         105        76             181
             (129)      (50)
Total        258        100            358
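$$\chi^2 = 2\sum f_{ij}\ln\left(\frac{f_{ij}}{F_{ij}}\right)$$

$$= 2\left[153\ln\left(\frac{153}{129}\right) + 24\ln\left(\frac{24}{50}\right) + 105\ln\left(\frac{105}{129}\right) + 76\ln\left(\frac{76}{50}\right)\right]$$

$$= 37.3960$$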
represent differences due to assignment of Verdict, because noticeably more people were found
guilty than were found innocent. Notice that Verdict is likely to be an important variable not
because it has any theoretical significance, but because there were more Guilty verdicts than
Not Guilty ones, and we have to take that into account.^3 By this model, 258/358 = 72.1% of
the observations fall in column 1 and 27.9% fall in column 2. Beyond that, however, observations
are assumed to be equally likely to fall in rows 1 and 2. In other words, the null hypothesis
states that once we have conditioned on the judgment of guilt or innocence (i.e., adjusted for
the fact that more people were judged guilty than not guilty), assignment to Fault levels is
equally probable. By this model we would have the expected frequencies (shown in parentheses)
contained in Table 17.2. (The expected frequencies in this model came from assuming that
half of the column 1 total would fall in row 1 and half in row 2; similarly for column 2.)
This model has 4 - 2 = 2 degrees of freedom because we have imposed two
restrictions: the cell frequencies in each column must sum to the expected frequency
for that column. Because $\chi^2 = 37.3960 > \chi^2_{.05}(2) = 5.99$, we will again reject
$H_0$ and conclude that the model does not fit the observed data either.
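The arithmetic for this model can be checked with a short script. What follows is a minimal sketch, not the author's own procedure: the helper name `likelihood_ratio_chi2` and the variable names are illustrative assumptions, and the only inputs are the observed frequencies from Table 17.2. The critical value is computed from the closed form that holds for 2 degrees of freedom, where the upper-tail probability of chi-square is exp(-x/2).

```python
import math

# Observed frequencies from Table 17.2.
# Rows are Fault (Low, High); columns are Verdict (Guilty, Not Guilty).
observed = [[153, 24],
            [105, 76]]


def likelihood_ratio_chi2(obs, exp):
    """Return 2 * sum(f * ln(f / F)) over all cells (assumes no zero cells)."""
    return 2 * sum(f * math.log(f / F)
                   for obs_row, exp_row in zip(obs, exp)
                   for f, F in zip(obs_row, exp_row))


# Conditional equiprobability given Verdict: each column total is
# split evenly between the two Fault rows.
col_totals = [sum(col) for col in zip(*observed)]
expected = [[total / 2 for total in col_totals] for _ in observed]

g2 = likelihood_ratio_chi2(observed, expected)
df = 4 - 2  # four cells minus two column-total restrictions

# For df = 2 only, P(chi-square > x) = exp(-x / 2),
# so the .05 critical value is -2 * ln(.05).
critical = -2 * math.log(0.05)

print("expected:", expected)
print("chi-square:", round(g2, 4), "df:", df, "critical:", round(critical, 2))
print("reject H0:", g2 > critical)
```

The expected frequencies reproduce the parenthesized values in Table 17.2, and the statistic falls well above the critical value, which is the basis for rejecting the model.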
A second conditional equiprobability model could be created by assuming that cell
frequencies are affected only by differences in levels of Fault. In this case, probabilities are
equal within each Fault condition but different between them. The expected frequencies in
this case are given in Table 17.3.
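The same calculation can be sketched for this second model. The expected frequencies below are derived from the verbal description (each Fault row total split evenly across the two Verdict columns), not copied from Table 17.3, and the variable names are illustrative assumptions.

```python
import math

# Observed frequencies: rows are Fault (Low, High),
# columns are Verdict (Guilty, Not Guilty).
observed = [[153, 24],
            [105, 76]]

# Conditional equiprobability given Fault: each row total is
# split evenly across the cells in that row.
expected = [[sum(row) / len(row)] * len(row) for row in observed]

# Likelihood ratio chi-square: 2 * sum(f * ln(f / F)).
g2 = 2 * sum(f * math.log(f / F)
             for obs_row, exp_row in zip(observed, expected)
             for f, F in zip(obs_row, exp_row))

df = 4 - 2  # four cells minus two row-total restrictions

print("expected:", expected)
print("chi-square:", round(g2, 4), "df:", df)
```

Under this assumption the Low row is split as 88.5 and 88.5 and the High row as 90.5 and 90.5, and the resulting statistic is far larger than the 5.99 critical value.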
This model has 4 - 2 = 2 degrees of freedom for the same reason that the model in
Table 17.2 did, and again the significant $\chi^2$ shows that this model is an inadequate fit to
the data. Thus, we have so far concluded that the data cannot be explained by assuming
that observations fall in the four cells at random. Nor can they be explained by positing
differences due simply to an unequal distribution across either Verdict or Fault. More
would appear to be happening in the data. The next step would be to propose a model
involving both Verdict and Fault operating independently of one another. This is the
standard null model routinely tested by a chi-square test on a contingency table. It is so
standard that we often lose sight of the fact that it is the model we usually test (and hope
to reject).
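As a hedged illustration of that next step, the sketch below computes the expected frequencies under independence in the usual way for a contingency table (row total times column total, divided by N) and applies the same likelihood ratio formula. The variable names are assumptions, and the 1 degree of freedom used here is the standard (r - 1)(c - 1) for a 2 x 2 table rather than a figure taken from this passage.

```python
import math

# Observed frequencies: rows are Fault (Low, High),
# columns are Verdict (Guilty, Not Guilty).
observed = [[153, 24],
            [105, 76]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Independence model: F_ij = (row total * column total) / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Likelihood ratio chi-square: 2 * sum(f * ln(f / F)).
g2 = 2 * sum(f * math.log(f / F)
             for obs_row, exp_row in zip(observed, expected)
             for f, F in zip(obs_row, exp_row))

# Standard degrees of freedom for an r x c table: (r - 1)(c - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print("expected:", [[round(F, 2) for F in row] for row in expected])
print("chi-square:", round(g2, 4), "df:", df)
```

Unlike the two conditional equiprobability models, this model reproduces both sets of marginal totals, so any remaining lack of fit reflects an association between Fault and Verdict.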
(Critical value for the tests above: $\chi^2_{.05}(2) = 5.99$.)