life expectancy is attributable to variability in smoking behavior. In other words, we want a measure that represents

    \frac{SS_{\hat{Y}}}{SS_Y}

As we have seen, that measure is r^2. In other words,

    r^2 = \frac{SS_{\hat{Y}}}{SS_Y} = \frac{SS_Y - SS_{\text{residual}}}{SS_Y}

This interpretation of r^2 is extremely useful. If, for example, the correlation between amount smoked and life expectancy were an unrealistically high .80, we could say that .80^2 = 64% of the variability in life expectancy is directly predictable from the variability in smoking behavior. (Obviously, this is an outrageous exaggeration of the real world.) If the correlation were a more likely r = .10, we would say that .10^2 = 1% of the variability in life expectancy is related to smoking behavior, whereas the other 99% is related to other factors.
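
As a quick numerical illustration of this idea, the short Python sketch below uses hypothetical data standing in for smoking and life expectancy (the variable names and values are made up, and NumPy is assumed to be available); it fits a least-squares line and shows that SS_Yhat/SS_Y matches r^2.

    import numpy as np

    # Hypothetical data: x stands in for amount smoked, y for life expectancy.
    rng = np.random.default_rng(1)
    n = 1_000
    x = rng.normal(20, 8, n)                      # "packs per week" (invented)
    y = 82 - 0.4 * x + rng.normal(0, 6, n)        # a linear relation plus noise

    r = np.corrcoef(x, y)[0, 1]                   # Pearson correlation

    b1, b0 = np.polyfit(x, y, 1)                  # least-squares slope and intercept
    y_hat = b0 + b1 * x                           # predicted values

    ss_y = np.sum((y - y.mean()) ** 2)            # total variability in Y
    ss_y_hat = np.sum((y_hat - y.mean()) ** 2)    # variability of the predictions

    print(f"r^2            = {r ** 2:.3f}")
    print(f"SS_Yhat / SS_Y = {ss_y_hat / ss_y:.3f}")   # same proportion of variability
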
Phrases such as “accounted for by,” “attributable to,” “predictable from,” and “associated
with” are not to be interpreted as statements of cause and effect. Thus, you could say, “I can
predict 10% of the variability of the weather by paying attention to twinges in the ankle that I
broke last year—when it aches we are likely to have rain, and when it feels fine the weather
is likely to be clear.” This does not imply that sore ankles cause rain, or even that rain itself
causes sore ankles. For example, it might be that your ankle hurts when it rains because low
barometric pressure, which is often associated with rain, somehow affects ankles.
From this discussion it should be apparent that r^2 is easier to interpret as a measure of
correlation than is r, since it represents the degree to which the variability in one measure is
attributable to variability in the other measure. I recommend that you always square correla-
tion coefficients to get some idea of whether you are talking about anything important. In our
symptoms-and-stress example, r^2 = .529^2 = .280. Thus, about one-quarter of the variabil-
ity in symptoms can be predicted from variability in stress. That strikes me as an impressive
level of prediction, given all the other factors that influence psychological symptoms.
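
Following that advice is a one-line computation; the values below are just examples (the .529 is the symptoms-and-stress correlation mentioned above, the others are arbitrary).

    # Square correlations before deciding how much they matter (example values only).
    for r in (0.80, 0.529, 0.10):
        print(f"r = {r:.3f}  ->  r^2 = {r ** 2:.3f}  ({r ** 2:.1%} of the variability)")
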
There is not universal agreement that r^2 is our best measure of the contribution of one variable to the prediction of another, although that is certainly the most popular measure. Judd and McClelland (1989) strongly endorse r^2 because, when we index error in terms of the sum of squared errors, it is the proportional reduction in error (PRE). In other words, when we do not use X to predict Y, our error is SS_Y. When we use X as the predictor, the error is SS_residual. Since

    r^2 = \frac{SS_Y - SS_{\text{residual}}}{SS_Y}

the value of r^2 (equivalently, 1 - SS_residual/SS_Y) can be seen to be the percentage by which error is reduced when X is used as the predictor.^11
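
To make the PRE idea concrete, the sketch below (simulated data; the variable names are mine) measures error first as squared deviations from the mean of Y, then as squared deviations from the regression line, and shows that the proportional reduction equals r^2.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 1_000
    x = rng.normal(0, 1, n)
    y = 10 + 0.5 * x + rng.normal(0, 1.5, n)

    b1, b0 = np.polyfit(x, y, 1)                       # least-squares fit of Y on X

    error_without_x = np.sum((y - y.mean()) ** 2)      # SS_Y: everyone predicted at the mean
    error_with_x = np.sum((y - (b0 + b1 * x)) ** 2)    # SS_residual: predicted from the line

    pre = (error_without_x - error_with_x) / error_without_x
    r = np.corrcoef(x, y)[0, 1]

    print(f"PRE = {pre:.3f}")                          # proportional reduction in error
    print(f"r^2 = {r ** 2:.3f}")                       # the same quantity
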
Others, however, have suggested the proportional improvement in prediction (PIP) as a better measure:

    PIP = 1 - \sqrt{1 - r^2}

For large sample sizes this statistic is the reduction in the size of the standard error of estimate (see Table 9.4). Similarly, as we shall see shortly, it is a measure of the reduction in the width of the confidence interval on our prediction.
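
Both claims can be checked numerically. The sketch below (again simulated data, with a deliberately large N) computes PIP from r and compares it with the proportional reduction in the standard error of estimate; the two agree closely when N is large.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 5_000                                          # large N so the approximation is close
    x = rng.normal(0, 1, n)
    y = 4 + 0.8 * x + rng.normal(0, 2, n)

    r = np.corrcoef(x, y)[0, 1]
    pip = 1 - np.sqrt(1 - r ** 2)                      # PIP = 1 - sqrt(1 - r^2)

    b1, b0 = np.polyfit(x, y, 1)
    ss_residual = np.sum((y - (b0 + b1 * x)) ** 2)

    s_y = np.sqrt(np.sum((y - y.mean()) ** 2) / (n - 1))   # ordinary standard deviation of Y
    s_y_x = np.sqrt(ss_residual / (n - 2))                 # standard error of estimate

    print(f"PIP             = {pip:.3f}")
    print(f"1 - s_Y.X / s_Y = {1 - s_y_x / s_y:.3f}")      # nearly identical for large N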

(^11) It is interesting to note that r^2_adj (defined on p. 252) is nearly equivalent to the ratio of the variance terms corresponding to the sums of squares in the equation. (Well, it is interesting to some people.)