coefficient (slope) that we obtain is the same coefficient we find in the multiple regression solution.
- We can think of the multiple correlation as the simple Pearson correlation between the criterion (call it Y) and another variable (call it Ŷ) that is the best linear combination of the predictor variables.
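This equivalence is easy to verify numerically. The sketch below uses made-up data (not the SAT example): it fits an ordinary least-squares model and shows that the correlation between Y and the fitted values Ŷ matches the multiple R obtained from R².

```python
import numpy as np

# Illustrative sketch with synthetic data: the multiple correlation R
# equals the simple Pearson correlation between Y and the fitted
# values Y-hat (the best linear combination of the predictors).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))                    # two predictors
y = 3.0 + 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=n)

# Least-squares fit with an intercept column
A = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ b

# Multiple R two ways: corr(y, y_hat) and sqrt(1 - SSres/SStot)
r_simple = np.corrcoef(y, y_hat)[0, 1]
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
R = np.sqrt(1 - ss_res / ss_tot)
print(round(r_simple, 6), round(R, 6))          # the two agree
```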
The Educational Testing Service, which produces the SAT, tries to have everyone put a disclaimer on results broken down by states that says that the SAT is not a fair way to compare the performance of different states. Having gone through this example you can see
that one reason that they say this is that different states have different cohorts of students
taking the exam, and this makes the test inappropriate as a way of judging a state’s
performance, even if it is a good way of judging the performance of individuals. We could
create a new variable that is the SAT score adjusted for LogPctSAT, but I would be very
wary of using that measure to compare states. It is possible that it would be fair, but it is
also possible that there are a number of other variables that I have not taken into account.
15.2 Using Additional Predictors
Before we look at other characteristics of multiple regression we should ask what would
happen if we used additional variables to predict SAT. We have two potential variables in
our data that we have not used—the pupil/teacher ratio and teacher’s salaries. We could add
both of them to what we already have, but I am only going to add PTratio. Folklore would
have it that a lower ratio would be associated with better performance. At the same time,
lower pupil/teacher ratios cost money, so PTratio should overlap with Expend and might
not contribute significant new information.
Table 15.3 shows the results of using Expend, LogPctSAT, and PTratio to predict SAT.
There are several things to say about this table.
The regression equation that results from this analysis is now

Ŷ = 1132.033 + 11.665 Expend − 78.393 LogPctSAT + 0.742 PTratio

Notice that Expend and LogPctSAT are still significant (t = 3.302 and −17.293, respectively), but PTratio is far from significant (t = 0.418). This shows us that adding PTratio to our model did not improve our ability to predict. (Even the simple correlation between PTratio and SAT was not significant (r = .081).) You will see two new columns in Table 15.3, labeled Tolerance and VIF (Variance Inflation Factor). When predictor variables are correlated among themselves we have what is called collinearity or multicollinearity. Collinearity has the effect of increasing the standard error of a regression coefficient, which increases the
Table 15.3  Adding PTratio to the prediction equation

Coefficients^a

                 Unstandardized      Standardized
                  Coefficients       Coefficients                     Collinearity Statistics
Model               B     Std. Error     Beta         t       Sig.     Tolerance      VIF
1  (Constant)   1132.033    39.787                 28.452     .000
   Expend         11.665     3.533       .212       3.302     .002       .596        1.679
   LogPctSAT     –78.393     4.533     –1.042     –17.293     .000       .679        1.473
   PTratio          .742     1.774       .022        .418     .678       .854        1.171

a. Dependent Variable: SAT
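The B, Std. Error, and t columns of a table like Table 15.3 come directly from the least-squares algebra. The following sketch uses synthetic data, with three made-up predictors standing in for Expend, LogPctSAT, and PTratio, to show the computation.

```python
import numpy as np

# Hedged sketch (synthetic data, not the SAT example): computing the
# coefficient, standard-error, and t columns of an OLS summary table.
rng = np.random.default_rng(1)
n = 50
X = rng.normal(size=(n, 3))
y = 1000 + 12 * X[:, 0] - 80 * X[:, 1] + 0.5 * X[:, 2] \
    + rng.normal(scale=30, size=n)

A = np.column_stack([np.ones(n), X])        # intercept + 3 predictors
b, *_ = np.linalg.lstsq(A, y, rcond=None)   # "B" column

resid = y - A @ b
df = n - A.shape[1]                         # residual degrees of freedom
mse = resid @ resid / df                    # residual variance estimate
cov_b = mse * np.linalg.inv(A.T @ A)        # covariance of coefficients
se = np.sqrt(np.diag(cov_b))                # "Std. Error" column
t = b / se                                  # "t" column
print(np.round(b, 3))
print(np.round(t, 3))
```

A coefficient whose |t| is small relative to its standard error, like PTratio in Table 15.3, adds essentially nothing to the prediction.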
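The Tolerance and VIF columns can also be computed by hand: regress each predictor on the remaining predictors, take Tolerance as 1 − R² from that regression, and VIF as 1/Tolerance. The sketch below uses synthetic data in which the first two predictors are deliberately correlated.

```python
import numpy as np

# Sketch of Tolerance and VIF (synthetic data, not the SAT example):
# Tolerance_j = 1 - R^2 from regressing predictor j on the others;
# VIF_j = 1 / Tolerance_j.
rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)   # correlated with x1
x3 = rng.normal(size=n)                          # independent
X = np.column_stack([x1, x2, x3])

def tolerance_and_vif(X, j):
    """Tolerance and VIF for predictor column j of X."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    b, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    fitted = A @ b
    ss_res = np.sum((X[:, j] - fitted) ** 2)
    ss_tot = np.sum((X[:, j] - X[:, j].mean()) ** 2)
    tol = ss_res / ss_tot                        # = 1 - R^2
    return tol, 1.0 / tol

for j in range(3):
    tol, vif = tolerance_and_vif(X, j)
    print(j, round(tol, 3), round(vif, 3))
```

The correlated predictors show a lower tolerance (and thus a higher VIF) than the independent one, which is exactly the pattern that inflates the standard errors of their coefficients.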