Differences Between Log-Linear Models and the Analysis of Variance
Although I have frequently compared the analysis of variance and log-linear models and
pointed to the many real similarities between the two techniques, this comparison may at
times lead to confusion. The purpose behind the models is not quite the same in the two
situations. The analysis of variance models cell means, whereas log-linear analysis models
cell frequencies.
To take a simple example, assume that we have an experiment looking at the effects of
Previous artistic experience and Gender (two independent variables) on the quality of a
written Composition (the dependent variable). First suppose that Composition is measured
on a continuous scale, that Artistic experience and Gender are dichotomies, and that we
have 20 male and 40 female subjects. Note those cell sizes; they are important! Further
assume that Gender has absolutely nothing to do with Composition. Then in an analysis of
variance framework with Gender ($\beta_j$) included, our model would be
$$X_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + e_{ijk}$$
Here we would expect the main effect of Gender to be 0.00 because we have assumed
the condition that Gender does not influence Composition. On the other hand, if in fact
differences did exist between the quality of Composition for males and females, a significant
main effect would appear. The presence or absence of an effect due to Gender relates to
whether or not male and female subjects differ on the scores on Composition.
Now assume the same experiment, again with 20 males and 40 females, but this time
record Composition scores as high, medium, and low, and include Composition as a categorical
variable in a log-linear model fit to these data. This time even if
there are no differences in Composition between males and females, we will still need to include
Gender in our model, and its effect will be significant. The reason is quite simple. With
our log-linear model we are not trying to model mean Composition; we are trying to model
cell frequencies. We are not trying to ask whether males have better composition scores than
females. We are trying to explain why there are more scores in some cells than in others.
Those cells dealing with female subjects will have relatively larger frequencies than those
cells with male subjects (all other things equal) because there are more female subjects. Similarly,
if we had equal numbers of male and female subjects, even with huge differences in
quality of Composition between the two sexes, the effect of Gender would be 0.00.
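The contrast can be checked numerically. The following is a minimal sketch, not a full log-linear analysis: the cell counts are hypothetical (only the 20/40 split comes from the example above), the function names are invented for illustration, and the likelihood-ratio statistic $G^2 = 2\sum O \ln(O/E)$ is computed directly from closed-form expected frequencies rather than by a model-fitting routine.

```python
import math

def g2(observed, expected):
    """Likelihood-ratio chi-square, G^2 = 2 * sum(O * ln(O / E)).
    Cells with an observed count of 0 contribute nothing."""
    return 2.0 * sum(
        o * math.log(o / e)
        for row_o, row_e in zip(observed, expected)
        for o, e in zip(row_o, row_e)
        if o > 0
    )

def fit_composition_only(table):
    """Expected frequencies for a model with Composition but no Gender term:
    each Composition total is split evenly across the Gender rows."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[c / len(table) for c in col_totals] for _ in table]

def fit_gender_plus_composition(table):
    """Expected frequencies for the model with both main effects
    but no Gender x Composition interaction (independence)."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return [[r * c / n for c in col_totals] for r in row_totals]

def gender_and_interaction_g2(table):
    """Return (G^2 for the Gender main effect, G^2 for the interaction)."""
    g2_without_gender = g2(table, fit_composition_only(table))
    g2_main_effects = g2(table, fit_gender_plus_composition(table))
    # Gender main effect: improvement in fit from adding Gender (1 df here).
    # Interaction: what the main-effects model still fails to explain (2 df here).
    return g2_without_gender - g2_main_effects, g2_main_effects

# Rows are Gender (male, female); columns are Composition (high, medium, low).
# Hypothetical counts: 20 males, 40 females, identical Composition distributions.
unequal_n_no_difference = [[5, 10, 5], [10, 20, 10]]
# Hypothetical counts: 30 males, 30 females, very different distributions.
equal_n_large_difference = [[2, 8, 20], [20, 8, 2]]

for label, table in [("unequal n, no difference", unequal_n_no_difference),
                     ("equal n, large difference", equal_n_large_difference)]:
    gender, interaction = gender_and_interaction_g2(table)
    print(f"{label}: Gender G2 = {gender:.2f}, Gender x Composition G2 = {interaction:.2f}")
```

With these counts the Gender term in the first table exceeds the .05 critical value of 3.84 on 1 df even though males and females have identical Composition distributions, and it is zero in the second table even though the distributions differ sharply. The difference between the sexes shows up only in the Gender × Composition term.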
I point this out, and will come back to it again, because it is too easy and seductive to
see Gender playing the same role in the two kinds of experiments. In fact, in asymmetric
log-linear models the main effects associated with our independent variables (and their
interactions with each other) are often of no interest whatsoever. They may merely reflect our
sampling plan. They need to be included to model the data properly, but they do not have a
substantive role. In such models it is the interaction of these variables with the dependent
variable that is of interest (and that parallels main effects in the analysis of variance). If we
assume that there are gender differences in composition, then the main effect of Gender in our
analysis of variance becomes a Gender × Composition interaction in our log-linear model.
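In symbols, the correspondence can be sketched as follows. The $\lambda$ notation and the restriction of the log-linear model to Gender ($G$) and Composition ($C$) alone are assumptions made here for illustration; $F_{jk}$ stands for the expected frequency in the cell for Gender $j$ and Composition category $k$.

```latex
% Analysis of variance: Gender differences appear in the main effect \beta_j
X_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + e_{ijk}

% Log-linear model: the same differences appear in the interaction \lambda^{GC}_{jk}
\ln F_{jk} = \lambda + \lambda^{G}_{j} + \lambda^{C}_{k} + \lambda^{GC}_{jk}
```

Here $\lambda^{G}_{j}$ reflects only how many males and females were sampled, whereas a nonzero $\lambda^{GC}_{jk}$ says that the distribution of Composition differs by Gender.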
17.4 Odds and Odds Ratios
Before moving to complex designs, there are two other basic concepts that are more easily
explained with simple tables than with higher-order tables. These concepts were discussed
in Chapters 6 and 15, but deserve review.