Another measure of importance, which has much to recommend it, is the squared semipartial correlation between predictor $i$ and the criterion (with all other predictors partialled out), that is, $r^2_{0(i.123\ldots p)}$. Darlington (1968) refers to this measure as the "usefulness" of a predictor. As we have already seen, this squared semipartial correlation represents the decrement in $R^2$ that would result from the elimination of the $i$th predictor from the model (or the increment that would result from its addition). When the main goal is prediction rather than explanation, this is probably the best measure of "importance." Fortunately, it is easy to obtain from most computer printouts, because

$$r^2_{0(i.123\ldots p)} = \frac{F_i\,(1 - R^2_{0.123\ldots p})}{N - p - 1}$$

where $F_i$ is the $F$ test on the individual $b_i$ (or $\beta_i$) coefficients. (If your program uses $t$ tests on the coefficients, $F = t^2$.) Because all terms but $F_i$ are constant for $i = 1 \ldots p$, the $F_i$ values order the variables in the same way as do the squared semipartials, and thus can be used to rank order the variables in terms of their usefulness.
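As a minimal sketch of this computation in Python: the $F_i$ values, $R^2$, $N$, and $p$ below are made-up illustrative numbers, not data from the text; only the formula itself comes from the passage above.

```python
# Sketch: recovering squared semipartials ("usefulness") from the
# F tests on individual coefficients, via
#   r^2_0(i.123...p) = F_i * (1 - R^2) / (N - p - 1)
# All numeric values here are hypothetical, for illustration only.

F = [12.4, 3.1, 0.8, 7.6, 2.2]   # hypothetical F_i for each predictor
R2 = 0.55                        # hypothetical squared multiple correlation
N, p = 100, 5                    # sample size and number of predictors

usefulness = [Fi * (1 - R2) / (N - p - 1) for Fi in F]
for i, u in enumerate(usefulness, start=1):
    print(f"predictor {i}: squared semipartial = {u:.4f}")

# Because (1 - R^2)/(N - p - 1) is the same for every predictor,
# ranking by F_i gives exactly the same order as ranking by usefulness.
order = sorted(range(len(F)), key=lambda i: F[i], reverse=True)
print("rank order (most to least useful):", [i + 1 for i in order])
```

Note how the code makes the point in the text explicit: the constant multiplier cancels out of any comparison, so the $F_i$ alone suffice for rank ordering.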
Darlington (1990) has made a strong case for not squaring the semipartial correlation
when speaking about the importance of variables. His case is an interesting one. However,
whether or not the correlations are squared will not affect the ordering of variables. (If you
wish to argue persuasively about the absolute importance of a variable, you should read
Darlington’s argument.)
One common, but unacceptable, method of ordering the importance of variables is to
rank them by the order of their inclusion in a stepwise regression solution. The problem
with this approach is that it ignores the interrelationships among the variables. Thus, the
first variable to be entered is entered solely on the strength of its correlation with the criterion. The second variable entered is chosen on the basis of its correlation with the criterion after partialling the first variable but ignoring all others. The third is chosen on the
basis of how it correlates with the criterion after partialling the first two variables, and so
on. In other words, each variable is chosen on a different basis, and it makes little sense
to rank them according to order of entry. To take a simple example, assume that variables
1, 2, and 3 correlate .79, .78, and .32 with the criterion. Assume further that variables
1 and 2 are correlated .95, whereas 1 and 3 are correlated .20. They will then enter the
equation in the order 1, 3, and 2, with the last entry being nonsignificant. But in what
sense do we mean to say that variable 3 ranks above variable 2 in importance? I would
hate to defend such a statement to a reviewer—in fact, I would be hard pressed even to
say what I meant by importance in this situation. A similar point has been made well by
Huberty (1989). For an excellent discussion of measures of importance, see Harris
(1985, 79ff).
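The arithmetic behind this example can be verified with a short Python sketch. It computes the step-2 semipartial correlations from the correlations given above (the criterion is labeled 0); because the text does not give the correlation between variables 2 and 3, only the first two entry steps can be reproduced.

```python
# Why stepwise entry order can mislead, using the correlations from
# the text: r01 = .79, r02 = .78, r03 = .32, r12 = .95, r13 = .20.
# (r23 is not given, so only the first two entry steps are shown.)
from math import sqrt

r01, r02, r03 = 0.79, 0.78, 0.32   # predictor-criterion correlations
r12, r13 = 0.95, 0.20              # predictor intercorrelations

# Step 1: variable 1 enters on the strength of r01 alone.
# Step 2: each remaining variable is judged by its semipartial
# correlation with the criterion after partialling variable 1:
#   r_0(j.1) = (r0j - r01 * r1j) / sqrt(1 - r1j^2)
sp2 = (r02 - r01 * r12) / sqrt(1 - r12 ** 2)
sp3 = (r03 - r01 * r13) / sqrt(1 - r13 ** 2)
print(f"semipartial for variable 2 after variable 1: {sp2:.3f}")  # ~0.094
print(f"semipartial for variable 3 after variable 1: {sp3:.3f}")  # ~0.165
```

Variable 3 enters before variable 2 despite correlating far less with the criterion, because entry order reflects redundancy with variable 1 rather than any defensible notion of importance.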

15.13 Using Approximate Regression Coefficients


I have pointed out that regression coefficients frequently show substantial fluctuations from
sample to sample without producing drastic changes in R. This might lead someone to suggest that we might use rather crude approximations of these coefficients as a substitute for
the more precise estimates obtained from the data. For example, suppose that a five-predictor
problem produced the following regression equation:

$$\hat{Y} = 9.2 + 0.85X_1 + 2.1X_2 - 0.74X_3 + 3.6X_4 - 2.4X_5$$

We might ask how much loss we would suffer if we rounded these values to

$$\hat{Y} = 10 + 1X_1 + 2X_2 - 1X_3 + 4X_4 - 2X_5$$
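Because the text supplies only the two equations, any numerical comparison requires data. The Python sketch below generates synthetic data consistent with the exact equation (sample size, noise level, and predictor distributions are all assumptions for illustration) and compares the multiple correlations obtained with the exact versus the rounded coefficients.

```python
# Sketch: how much does R suffer if we round the coefficients?
# The data are synthetic, generated so that the exact equation from
# the text is (approximately) the least-squares fit.
import numpy as np

rng = np.random.default_rng(0)
N = 200
X = rng.normal(size=(N, 5))

b_exact = np.array([0.85, 2.1, -0.74, 3.6, -2.4])
b_round = np.array([1.0, 2.0, -1.0, 4.0, -2.0])

# Build Y from the exact equation plus noise (noise scale is assumed).
Y = 9.2 + X @ b_exact + rng.normal(scale=3.0, size=N)

pred_exact = 9.2 + X @ b_exact
pred_round = 10.0 + X @ b_round

R_exact = np.corrcoef(Y, pred_exact)[0, 1]
R_round = np.corrcoef(Y, pred_round)[0, 1]
print(f"R with exact coefficients:   {R_exact:.3f}")
print(f"R with rounded coefficients: {R_round:.3f}")
```

With well-behaved data of this kind the two correlations typically come out very close, which is the question this section poses: rounding the coefficients often costs surprisingly little in terms of R.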



