We know that if we ran the simple regression predicting SAT from LogPctSAT alone,
the resulting set of predicted scores would represent that part of SAT that is predictable
from LogPctSAT. If we subtract the predicted scores from the actual scores, the resulting
residuals, call them ResidSAT, will be that part of SAT that is not predictable from (is
independent of) LogPctSAT. We can now do the same thing predicting Expend from
LogPctSAT. We will get the predicted scores, subtract them from the obtained scores, and
have a new set of residuals, call them ResidExpend, that is also independent of LogPctSAT.
So we now have two sets of residual scores, ResidSAT and ResidExpend, that are both
independent of LogPctSAT. So LogPctSAT can play no role in their relationship.^2
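The residualizing step just described can be sketched in a few lines of numpy. The data below are made up, and the variable names simply echo SAT, Expend, and LogPctSAT; the point is only that each residual set is uncorrelated with the variable it was adjusted for:

```python
import numpy as np

# Made-up stand-ins for the SAT, Expend, and LogPctSAT variables.
rng = np.random.default_rng(1)
n = 50
log_pct_sat = rng.normal(size=n)
expend = 0.5 * log_pct_sat + rng.normal(size=n)
sat = 11.0 * expend - 25.0 * log_pct_sat + rng.normal(size=n)

def residuals(y, x):
    """That part of y not predictable from x: y minus the simple-regression fit."""
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ b

resid_sat = residuals(sat, log_pct_sat)        # "ResidSAT"
resid_expend = residuals(expend, log_pct_sat)  # "ResidExpend"

# Both residual sets are uncorrelated with log_pct_sat (correlations ~ 0).
print(np.corrcoef(resid_sat, log_pct_sat)[0, 1])
print(np.corrcoef(resid_expend, log_pct_sat)[0, 1])
```

Because a least-squares fit makes its residuals orthogonal to the predictor, both printed correlations are zero up to rounding error.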
If I now run the regression to predict the adjusted SAT score from the adjusted Expend
score (i.e., ResidSAT with ResidExpend) I will have
Coefficients^a

                               Unstandardized    Standardized
                               Coefficients      Coefficients
Model                          B         Std. Error   Beta     t       Sig.
1   (Constant)              3.1E-015      3.608                .000    1.000
    Unstandardized Residual   11.130      3.230        .445   3.446     .001

a. Dependent Variable: Unstandardized Residual

Notice that the regression coefficient predicting the adjusted SAT score from the adjusted
Expend score is 11.130, which is exactly what we had for Expend doing things the normal
way. Notice also that the following table shows us that the correlation between these two
corrected variables is .445, which is the correlation between Expend and SAT after we
have removed any effects attributable to LogPctSAT. (Also notice that it is now positive.)

Model Summary^b

Model     R       R Square    Adjusted R Square    Std. Error of the Estimate
1       .445^a      .198            .182                 25.51077434

a. Predictors: (Constant), Unstandardized Residual
b. Dependent Variable: Unstandardized Residual
I hope that no one thinks that they should actually run their regression this way. The
reason I went through the exercise was to make the point that when we have multiple
predictor variables, we are adjusting each predictor for all of the other predictors in the
equation. The phrases “adjusted for,” “controlling for,” and “holding constant” are all
ways of saying the same thing.
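The equivalence demonstrated above (the residual-on-residual slope matching the coefficient from the full multiple regression) can be checked directly. This sketch uses invented data, with names echoing the example's variables; the helper regresses one variable on another and returns the residuals:

```python
import numpy as np

# Made-up data standing in for SAT, Expend, and LogPctSAT.
rng = np.random.default_rng(42)
n = 50
log_pct_sat = rng.normal(size=n)
expend = 0.5 * log_pct_sat + rng.normal(size=n)
sat = 11.0 * expend - 25.0 * log_pct_sat + rng.normal(size=n)

def residuals(y, x):
    """Residuals of a simple regression (with intercept) of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ b

resid_sat = residuals(sat, log_pct_sat)        # "ResidSAT"
resid_expend = residuals(expend, log_pct_sat)  # "ResidExpend"

# Slope from regressing ResidSAT on ResidExpend ...
Xr = np.column_stack([np.ones(n), resid_expend])
b_resid = np.linalg.lstsq(Xr, resid_sat, rcond=None)[0][1]

# ... equals the Expend coefficient from the full multiple regression.
Xfull = np.column_stack([np.ones(n), expend, log_pct_sat])
b_full = np.linalg.lstsq(Xfull, sat, rcond=None)[0][1]

print(b_resid, b_full)
```

The two printed slopes agree to machine precision; this is the same equivalence the SPSS exercise above illustrates with the real data.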
A Final Way to Think of Multiple Regression
There is a third way to think of multiple regression, and in some ways I find it the most
useful. We know that in multiple regression we solve for an equation of the form
Ŷ = b₁X₁ + b₂X₂ + b₀
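In numerical form, solving for the b's and forming Ŷ might look like the sketch below; the data are hypothetical, and the names simply mirror the symbols in the equation:

```python
import numpy as np

# Hypothetical predictors and criterion, mirroring Y-hat = b1*X1 + b2*X2 + b0.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
Y  = np.array([3.1, 3.9, 8.2, 8.8, 12.0])

# Least-squares solution for b0, b1, b2.
design = np.column_stack([np.ones_like(X1), X1, X2])
b0, b1, b2 = np.linalg.lstsq(design, Y, rcond=None)[0]

Y_hat = b1 * X1 + b2 * X2 + b0   # predicted scores from the fitted equation
```

A least-squares fit leaves the residuals Y − Ŷ orthogonal to the intercept column and to both predictors, which is what makes each coefficient an "adjusted" one.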
(^2) In SPSS it is very easy to obtain these residuals. From the main regression window just click on the “Save”
button and select “Unstandardized” residuals. They will be added to your data file when you run the regression.