Statistical Methods for Psychology

In the case of the analysis of variance, we first posited this model to underlie the obtained
data and then used the model and its associated error term to develop tests of the
components of that model. When the data analysis was complete, we let the model stand
but made statements of the form, “There is a significant effect due to variable A and the
A × B interaction, but there is no significant difference due to variable B.”
In Chapter 15, Section 15.10 on stepwise multiple regression, we reversed the process.
We used the data themselves to create a model rather than using an a priori model as in the
analysis of variance. Using the backward solution, which is most relevant here, we continued
to remove variables from our model so long as their removal did not produce a significant
decrement in R² (or until we met some similar criterion). When we were done we were
left with a model all of whose components contributed significantly to the prediction of Y.
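The backward solution just described can be sketched programmatically. The following Python sketch is illustrative only; the function names, the use of ordinary least squares, and the fixed critical F value of 4.0 are assumptions of mine, not taken from the text:

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an ordinary least-squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

def backward_eliminate(X, y, f_crit=4.0):
    """Drop predictors one at a time while the decrement in R^2 is
    nonsignificant (F below an illustrative critical value)."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        r2_full = r_squared(X[:, keep], y)
        n, p = len(y), len(keep)
        # F for removing each predictor; find the least harmful removal
        worst, worst_f = None, None
        for j in keep:
            r2_red = r_squared(X[:, [k for k in keep if k != j]], y)
            f = (r2_full - r2_red) / ((1 - r2_full) / (n - p - 1))
            if worst_f is None or f < worst_f:
                worst, worst_f = j, f
        if worst_f < f_crit:      # decrement nonsignificant: remove it
            keep.remove(worst)
        else:                     # every remaining term contributes
            break
    return keep

# Demo on synthetic data: y depends on columns 0 and 1 but not on column 2
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=100)
kept = backward_eliminate(X, y)   # the noise column is typically dropped
```

Stepwise procedures in practice compute a proper F (or p-value) criterion at each step rather than using a fixed cutoff; the sketch only mirrors the logic of removing terms until every remaining one contributes significantly.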
In the case of log-linear models, we generally fall somewhere between these two
approaches. We use a model-building approach, as in the regression situation, but the re-
sultant model may, as in the analysis of variance, contain nonsignificant terms.
Consider Pugh’s data and a variety of different models that might be posited to account
for those data. I don’t remotely believe that the first few models are likely to be true, but
they are possible models, and they are models that you must understand. Moreover, they
are models that might be included in a complete analysis, if only to serve as a basis for
comparison of alternative models.

Equiprobability Model


At the simplest level, we might hypothesize that respondents distribute themselves among
the four cells at random. In other words, p(Low, Guilty) = p(High, Guilty) = p(Low, Not
Guilty) = p(High, Not Guilty) = .25. This model basically says that nothing interesting is
going on in this study and one-quarter of the subjects (.25 × 358 = 89.5) would be expected
to fall in each cell.
Using the likelihood ratio to test this model, we have

Observed:  153    24    105   76
Expected:   89.5  89.5   89.5  89.5

χ² = 2 Σ f_ij ln(f_ij / F_ij)
   = 2 [153 ln(153/89.5) + 24 ln(24/89.5) + 105 ln(105/89.5) + 76 ln(76/89.5)]
   = 109.5889

This can be evaluated as a χ² on 4 − 1 = 3 df (we lose one degree of freedom due to
the restriction that the cell totals must sum to N), and from Appendix χ² we find that
χ²_.05(3) = 7.82. Clearly, we can reject H₀ and conclude that this model does not fit the data.
In other words, the individual cell frequencies cannot be fit by a model in which all cells
are considered equally probable. Notice that rejection of H₀ is equivalent to rejection of the
underlying model. This is an important point and comes up whenever we are trying to decide
on a suitable model.
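The likelihood-ratio statistic for the equiprobability model is easy to verify numerically. A short Python check, using the cell frequencies given in the text (the variable names are mine):

```python
import math

# Pugh's cell frequencies: Low/Guilty, High/Guilty, Low/Not Guilty, High/Not Guilty
observed = [153, 24, 105, 76]
n = sum(observed)                    # 358
expected = [n / len(observed)] * 4   # equiprobability model: 89.5 per cell

# Likelihood-ratio chi-square: 2 * sum of f_ij * ln(f_ij / F_ij)
chi2 = 2 * sum(f * math.log(f / e) for f, e in zip(observed, expected))
df = len(observed) - 1               # cell totals must sum to N

print(round(chi2, 2), df)            # 109.59 3
```

Against the critical value χ²_.05(3) = 7.82 the statistic is overwhelming, matching the text's rejection of the equiprobability model.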


Conditional Equiprobability Model


Our first model really had no variable contributing to the observed frequency (not differ-
ences due to Fault, not differences due to Verdict, and not differences due to the interaction of
those variables). A second model, however, might hold that the individual cell frequencies

Section 17.1 Two-Way Contingency Tables 633