Statistical Methods for Psychology

Assumptions


One of the pleasant things about log-linear models is the relative absence of assumptions.
Like the more traditional chi-square test, log-linear analysis does not make assumptions
about population distributions, although it does assume, as does the Pearson chi-square,
that observations are independent. You may apply log-linear analysis in a wide variety of
circumstances, including even the analysis of badly distributed (ill-behaved) continuous
variables that have been classified into discrete categories.
The major problem with log-linear analysis is the same problem that we encountered
with traditional chi-square: The expected frequencies have to be sufficiently large to allow
the assumption that frequencies in each cell would be normally distributed over repeated
sampling. In the case of chi-square, we set the rule that all (or at least most) of the expected
frequencies should be at least 5. We also saw that serious departures from this rule were
probably acceptable, as long as all expected frequencies exceeded 1 and 80% were greater
than 5. However, in such cases we would have unacceptably low power. We have a similar
situation with log-linear analysis. Once again we require that all cells have expected
frequencies greater than 1 and that no more than 20% of the cells have expected
frequencies less than 5. The biggest problem comes with what are called sparse matrices, which
are contingency tables with a large number of empty cells. In these cases you may wish to
combine categories on the basis of some theoretical rationale, increase sample sizes,
collapse across variables, or do whatever you can to increase the expected frequencies.
Regardless of the effects such small cells have on the level of Type I errors, you are virtu-
ally certain to have very low levels of power.
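The cell-size rule of thumb described above is easy to check in practice. The sketch below is an illustration only (the functions and the table are not from the text): it computes expected frequencies under independence for a two-way table and then applies the rule that all expected frequencies exceed 1 and no more than 20% fall below 5.

```python
import numpy as np

def expected_frequencies(observed):
    """Expected cell frequencies under independence for a two-way table:
    (row total * column total) / grand total."""
    observed = np.asarray(observed, dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    return row_totals * col_totals / observed.sum()

def check_cell_sizes(expected):
    """Rule of thumb: all expected frequencies > 1, and no more than
    20% of cells with expected frequencies < 5."""
    expected = np.asarray(expected, dtype=float)
    all_above_one = bool((expected > 1).all())
    frac_below_five = float((expected < 5).mean())
    return all_above_one and frac_below_five <= 0.20

# A hypothetical 2 x 3 table (made-up counts for illustration)
obs = [[10, 20, 30],
       [20, 30, 40]]
print(check_cell_sizes(expected_frequencies(obs)))  # True: all cells large enough
```

Passing this check says nothing about power, of course; as the text notes, even tables that satisfy the rule can leave you with very low power when many cells are small.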

Hierarchical and Nonhierarchical Models


Most, but not all, analyses of log-linear models involve what are called hierarchical mod-
els. You can think of a hierarchical model as one for which the presence of an interaction
term requires the presence of all lower-order interactions and main effects involving the
components of that higher-order interaction. For example, suppose that we had four
variables, A, B, C, and D. If you include in the model the three-way interaction ACD, a
hierarchical model would also have to include A, C, D, AC, AD, and CD, because each of
these terms is a subset of ACD. Similarly, if your model included ABC and ABD, the
model would actually include A, B, C, D, AB, AC, BC, AD, and BD. It need not include
CD, ACD, BCD, or ABCD, because those are not components of either of the three-way
interactions.
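The expansion just described is purely mechanical: every term implies all of its lower-order subsets. A minimal sketch (the function name is my own, not standard software) that reproduces the ACD and ABC/ABD examples:

```python
from itertools import combinations

def hierarchical_closure(terms):
    """Expand a set of highest-order terms into the full hierarchical
    model: each term brings in every subset of its factors."""
    model = set()
    for term in terms:
        factors = tuple(term)  # e.g. "ACD" -> ('A', 'C', 'D')
        for k in range(1, len(factors) + 1):
            for subset in combinations(factors, k):
                model.add("".join(subset))
    # sort by order of interaction, then alphabetically
    return sorted(model, key=lambda t: (len(t), t))

print(hierarchical_closure(["ACD"]))
# -> ['A', 'C', 'D', 'AC', 'AD', 'CD', 'ACD']
print(hierarchical_closure(["ABC", "ABD"]))
# -> ['A', 'B', 'C', 'D', 'AB', 'AC', 'AD', 'BC', 'BD', 'ABC', 'ABD']
```

Note that the second call does not include CD, ACD, BCD, or ABCD, exactly as the text states, because none of those is a component of either three-way interaction.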
Hierarchical models are in many ways parallel to models used in the analysis of vari-
ance. If you turn to any of the models in Chapters 13, 14, and 16 you will note that they are
all hierarchical—for a three-way analysis of variance all main effects and two-way interac-
tions are included, along with the three-way interaction. Just as in the analysis of variance,
the presence of a term in log-linear models does not necessarily mean that it will make a
significant contribution. (If we design a study having exactly as many males as females,
the contribution of Gender to a log-linear model will be precisely 0. We still usually in-
clude it in the model because of its influence on other expected frequencies.) SPSS
HILOGLINEAR and SYSTAT TABLES handle only hierarchical models. On the other
hand, SPSS GENLOG, SPSS LOGLINEAR, SAS PROC CATMOD, and SYSTAT LOGIT
are capable of analyzing nonhierarchical models. We will deal only with hierarchical models
in this chapter. Schafer (1997, p. 293) states that “A model that includes AB but omits
A allows A to be related to B, but requires the average log-probability across levels of B to
be the same within every level of A. Under ordinary circumstances one would not expect
this to happen except by chance.”

644 Chapter 17 Log-Linear Analysis