which they will never be. We need the intercept because it forces the average of our
predictions to equal the average of the obtained values, but we rarely pay any real attention to it.
We can use this regression equation in exactly the same way we used the simple regression
equation in Chapter 9. Simply substitute the values of Expend and LogPctSAT for a given
state and you can predict that state’s mean SAT score. To take the state of Colorado as an
example, with Expend = 5.443 and 29 percent of students taking the SAT (LogPctSAT = 3.367),
our predicted mean SAT score would be 944.352.
Because the actual mean for Colorado was 980, we have somewhat underestimated the mean,
and our residual error is 980 − 944.352 = 35.648. That is a small residual given the relative
magnitude of SAT scores. On the other hand, the residual for West Virginia is −61.510.
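The Colorado calculation can be checked with a short sketch. The coefficients and the Colorado values are copied from the text; the function name `predict_sat` is only illustrative, and the code assumes that LogPctSAT is the natural logarithm of the percentage, which is what the value 3.367 for 29 percent implies. Because the published coefficients are rounded, the result may differ from the text in the last decimal place.

```python
import math

# Coefficients copied from the regression equation in the text.
INTERCEPT = 1147.113
B_EXPEND = 11.130
B_LOGPCTSAT = -78.205


def predict_sat(expend, pct_sat):
    """Predicted mean SAT score; assumes LogPctSAT = natural log of pct_sat."""
    return INTERCEPT + B_EXPEND * expend + B_LOGPCTSAT * math.log(pct_sat)


# Colorado, as given in the text: Expend = 5.443, PctSAT = 29, observed mean = 980.
predicted = predict_sat(5.443, 29)
residual = 980 - predicted  # observed minus predicted

print(round(predicted, 3), round(residual, 3))
```

A positive residual, as here, means the equation underestimated the state's mean; a negative one, as for West Virginia, means it overestimated it.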
The positive coefficient for Expend tells us that, now that we have controlled for LogPctSAT,
the relationship between expenditures and performance is positive: the more a state
spends, the higher its (adjusted) SAT score. That should make us feel much better. We can
also see that when we control for Expend, the relationship between LogPctSAT and SAT is
negative, which makes sense. I explained earlier why increasing the percentage of a state’s
students taking the SAT would be expected to lower the overall mean for that state.
But you may have noticed that LogPctSAT itself had a correlation of −.93 with SAT,
and perhaps Expend wasn’t adding anything important to the relationship. After all, the
correlation only increased to .941. If you look at the table of coefficients, you will see two
columns on the right labeled t and sig. These relate to significance tests on the regression
coefficients. You saw similar t tests in Chapter 9. From the “sig.” column we can tell that
all three coefficients are significant at p < .05. The intercept has no meaning because it
would refer to a case in which a state spent absolutely nothing on education and had 0
percent of its students taking the SAT. The coefficient for Expend is meaningful because it
shows that increased spending does correlate with higher scores after we control for the
percentage of students taking the exam. Similarly, after we control for expenditures, SAT
scores are higher for those states that have few (presumably their best) students taking the
test. So although adding Expend to LogPctSAT as predictors didn’t raise the correlation
very much, it was a statistically significant contributor.
I discussed above one of the ways of interpreting what a multiple regression means—
for any predictor variable the slope is the relationship between that variable and the
criterion variable if we could hold all other variables constant. And by “hold constant” we mean
having a collection of participants who had all the same scores on each of the other
variables. But there are two other ways of thinking about regression that are useful.
Another Interpretation of Multiple Regression
When we just correlate Expend with SAT and completely ignore LogPctSAT, there is a
certain amount of variability in the SAT scores that is directly related to variability in
LogPctSAT, and that was what was giving us that peculiar negative result. What we would really
like to do is to examine the correlation between Expend and the SAT score when both are
adjusted to be free from the influences of LogPctSAT. To put it another way, some of the
differences in SAT are due to differences in Expend and some are due to differences in
LogPctSAT. We want to eliminate those differences in both variables that can be attributed
to LogPctSAT and then correlate the adjusted variables. That is actually a lot simpler than
it sounds. I can’t imagine anyone intentionally running a multiple regression the way that I
am about to, but it does illustrate what is going on.
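The adjustment just described can be sketched in a few lines. This is a minimal illustration on invented toy numbers, not the state data analyzed in the text, and the helper names (`simple_fit`, `residuals`, `correlation`) are my own. Each variable is regressed on LogPctSAT, the residuals are kept as the "adjusted" scores, and the adjusted scores are then correlated. With two predictors, the slope relating the adjusted scores is the same as the multiple regression coefficient for Expend.

```python
def mean(xs):
    return sum(xs) / len(xs)


def simple_fit(x, y):
    """Least-squares (intercept, slope) for predicting y from a single x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """The part of y that cannot be predicted from x."""
    intercept, slope = simple_fit(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def correlation(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Invented toy values for eight hypothetical states (NOT the data in the text).
expend = [4.0, 5.0, 6.0, 7.0, 5.0, 6.0, 7.0, 8.0]
log_pct = [1.5, 1.5, 2.5, 2.5, 3.0, 3.0, 4.0, 4.0]
sat = [1080.0, 1085.0, 1020.0, 1035.0, 980.0, 980.0, 920.0, 930.0]

# Ignoring LogPctSAT: the raw correlation between Expend and SAT.
raw_r = correlation(expend, sat)

# Remove from both variables whatever can be attributed to LogPctSAT.
sat_adj = residuals(log_pct, sat)
expend_adj = residuals(log_pct, expend)

# Correlate (and regress) the adjusted variables.
adjusted_r = correlation(expend_adj, sat_adj)
_, adjusted_slope = simple_fit(expend_adj, sat_adj)

print(round(raw_r, 3), round(adjusted_r, 3), round(adjusted_slope, 3))
```

In these toy numbers the raw correlation is negative while the correlation between the adjusted variables is positive, which is the same reversal of sign that the text describes for the SAT data.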
\[
\begin{aligned}
\hat{Y} &= 1147.113 + 11.130(\text{Expend}) - 78.205(\text{LogPctSAT}) \\
        &= 1147.113 + 11.130(5.443) - 78.205\,\log(29) \\
        &= 1147.113 + 11.130(5.443) - 78.205(3.367) = 944.352
\end{aligned}
\]