The major disadvantage of all subsets regression, aside from the enormous amount of
computer time it can involve, is the fact that it has a substantial potential for capitalizing on
chance. By fitting all possible models to the data, or at least the best of all possible models,
you run the serious risk of selecting those models that best fit the peculiar data points that
are unique to your data set. The final R^2 cannot reasonably be thought of as an unbiased
estimate of the corresponding population parameter.
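A small simulation can make the idea of capitalizing on chance concrete. The sketch below is a deliberately simplified illustration, not all subsets regression itself: it considers only one-predictor models, and the sample size, number of predictors, and seed are arbitrary assumptions. Every predictor is pure noise, so the population R^2 is zero, yet picking the best-fitting predictor yields a sample R^2 above the average of the candidates.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, i.e. R^2 for a one-predictor regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def best_of_noise(n_cases=30, n_predictors=20, seed=1):
    """Return (best R^2, average R^2) across pure-noise predictors."""
    rng = random.Random(seed)
    # The criterion is unrelated to every predictor by construction.
    y = [rng.gauss(0, 1) for _ in range(n_cases)]
    r2_values = []
    for _ in range(n_predictors):
        x = [rng.gauss(0, 1) for _ in range(n_cases)]
        r2_values.append(r_squared(x, y))
    return max(r2_values), sum(r2_values) / len(r2_values)


best, average = best_of_noise()
print(f"best R^2 = {best:.3f}, average R^2 = {average:.3f}")
```

Increasing `n_predictors` gives the selection step more chances to find a predictor that happens to fit the noise, which is the sense in which the selected R^2 is biased upward.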
Backward Elimination
The backward elimination procedure and the stepwise regression procedure to
follow are generally lumped under the term stepwise procedures because they go about
their task in a logical stepwise fashion. They both have the advantage of being easy to carry
out interactively using standard regression procedures, although programs to carry them
out automatically are readily available.
In the backward elimination procedure, we begin with a model that includes all of the
predictors. Having computed that model, we examine the tests on the individual regression
coefficients, or look at the partial or semipartial correlations and remove the variable that
contributes the least to the model (assuming that its contribution is statistically
nonsignificant). We then rerun the regression without that predictor, again looking for the variable
with the smallest contribution, remove that, and continue. Normally we continue until we
come to a model in which all of the remaining predictors are statistically significant,
although alternative stopping points are possible. For example, we could plot R^2 or
MS_residual against the number of predictors in the model and stop when that curve shows a
break in direction.
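The procedure just described can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the algorithm of any particular statistical package: the function names are hypothetical, the contribution of each predictor is measured by its partial F, the default threshold of 4.0 is an arbitrary placeholder, and the demonstration data are simulated so that only the first predictor is truly related to the criterion.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ss_residual(X, y, cols):
    """Residual sum of squares for y regressed on an intercept plus columns cols."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    b = solve(xtx, xty)
    return sum((yi - sum(bj * rj for bj, rj in zip(b, r))) ** 2
               for r, yi in zip(rows, y))


def backward_elimination(X, y, f_to_remove=4.0):
    """Drop the weakest predictor until every remaining partial F >= f_to_remove."""
    n = len(y)
    kept = list(range(len(X[0])))  # start with all predictors in the model
    while kept:
        ss_full = ss_residual(X, y, kept)
        ms_residual = ss_full / (n - len(kept) - 1)
        # Partial F: increase in SS_residual when one predictor is dropped,
        # divided by MS_residual of the current model.
        partial_f = {
            c: (ss_residual(X, y, [k for k in kept if k != c]) - ss_full) / ms_residual
            for c in kept
        }
        weakest = min(partial_f, key=partial_f.get)
        if partial_f[weakest] >= f_to_remove:
            break  # every remaining predictor meets the criterion
        kept.remove(weakest)
    return kept


# Simulated data (an assumption for illustration): y depends on predictor 0 only.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(40)]
y = [3.0 * row[0] + rng.gauss(0, 1) for row in X]
print(backward_elimination(X, y))
```

The function returns the column indices of the predictors that survive. In practice the threshold would be taken from an F table or a statistics library for the chosen significance level and degrees of freedom rather than fixed at a single value.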
Most computer programs that run backward elimination or stepwise regression use
some combination of terms called “F to enter,” “F to remove,” “p to enter,” and “p to
remove.” To take just one of these, consider “p to remove.” If we plan to remove predictors
from the model if they fail to reach significance at α = .05, then we set “p to remove” at
.05. The “F to remove” would simply be the critical value of F corresponding to that level
of p.^10 (Those programs that calculate t statistics instead of F would simply make the
appropriate change.) The situation is actually more complicated than I have made it seem
(see Draper & Smith, 1981, p. 311), but for practical purposes it is as I have described.
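The correspondence among “p to remove,” “F to remove,” and a criterion based on t can be written out as follows. This is a sketch in assumed notation (N cases, p predictors in the current model, b_j and s_{b_j} for a regression coefficient and its standard error), none of which is fixed by the discussion above.

```latex
% Partial F for predictor X_j in a model with p predictors, on 1 and N - p - 1 df
F_j = \frac{SS_{\text{residual}}(\text{without } X_j) - SS_{\text{residual}}(\text{full})}
           {MS_{\text{residual}}(\text{full})}
    = t_j^2, \qquad t_j = \frac{b_j}{s_{b_j}}

% "F to remove" is the critical value of F at the chosen "p to remove" (here \alpha)
F_{\text{to remove}} = F_{\alpha}(1,\; N - p - 1)

% Equivalent criterion for programs that report t: remove X_j when |t_j| falls below
t_{\alpha/2}(N - p - 1) = \sqrt{F_{\alpha}(1,\; N - p - 1)}
```

Because the test on a single coefficient has 1 df in the numerator, squaring the two-tailed critical value of t gives the critical value of F, so the two criteria remove the same predictors.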
An important disadvantage of backward elimination is that it too capitalizes on chance.
Because it begins with many predictors, it has the opportunity to identify and account for
any suppressor relations among variables that can be found in the data. For example, if
variables 7 and 8 have some sort of suppressor relationship between them, this method has
a good chance of finding it and making those variables a part of the model. If that is a true
relationship, then backward elimination has done what we want it to. On the other hand, if
the relationship is spurious, we have just wasted extra variables explaining something that
does not deserve explanation. Darlington (1990, p. 166) made this point about both
backward elimination and all subsets regression. True suppressor relationships are fairly rare,
but apparent ones are fairly common. Therefore, methods that systematically look for
them, especially without accompanying hypothesis tests, may be misleading more often
than simpler methods that ignore them.
(^10) As Draper and Smith (1981) point out, when we are testing optimal models the F statistics are not normal Fs
and their probability values should not be interpreted as if they were. Thus, although both F and p form the basis
of a legitimate ordering of potential variables, do not put too much faith in the actual probabilities. McIntyre,
Montgomery, Srinwason, and Weitz (1983) address this problem directly and illustrate the liberal nature of the
test. They also provide guidelines on more appropriate tests on stepwise correlation coefficients, should you wish
to follow this route.