Statistical Methods for Psychology

coefficient (slope) that we obtain is the same coefficient we find in the multiple regression solution.


  • We can think of the multiple correlation as the simple Pearson correlation between the
    criterion (call it Y) and another variable (call it Ŷ) that is the best linear combination of
    the predictor variables.
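The point above can be sketched numerically. The following is a minimal illustration with made-up data (not the SAT data from this chapter): fit a least-squares model, form the fitted values Ŷ, and confirm that the simple Pearson correlation between Y and Ŷ is the multiple correlation R.

```python
import numpy as np

# Sketch (synthetic data): the multiple correlation R is the simple
# Pearson correlation between Y and the best linear combination of the
# predictors, i.e., the fitted values Y-hat.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                     # two predictors
y = 3 + 2 * X[:, 0] - X[:, 1] + rng.normal(size=50)

# Least-squares fit; Y-hat is the optimal linear combination of the predictors
A = np.column_stack([np.ones(50), X])            # add intercept column
b, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ b

R = np.corrcoef(y, y_hat)[0, 1]                  # multiple correlation
print(round(R, 4))
```

Squaring R here reproduces the usual R² = 1 − SS(residual)/SS(total), which is the identity the bullet point is trading on.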
    The Educational Testing Service, which produces the SAT, tries to have everyone put a
disclaimer on results broken down by states that says that the SAT is not a fair way to compare the performance of different states. Having gone through this example you can see
    that one reason that they say this is that different states have different cohorts of students
    taking the exam, and this makes the test inappropriate as a way of judging a state’s
    performance, even if it is a good way of judging the performance of individuals. We could
    create a new variable that is the SAT score adjusted for LogPctSAT, but I would be very
    wary of using that measure to compare states. It is possible that it would be fair, but it is
    also possible that there are a number of other variables that I have not taken into account.


15.2 Using Additional Predictors


Before we look at other characteristics of multiple regression we should ask what would
happen if we used additional variables to predict SAT. We have two potential variables in
our data that we have not used—the pupil/teacher ratio and teachers’ salaries. We could add
both of them to what we already have, but I am only going to add PTratio. Folklore would
have it that a lower ratio would be associated with better performance. At the same time,
lower pupil/teacher ratios cost money, so PTratio should overlap with Expend and might
not contribute significant new information.
Table 15.3 shows the results of using Expend, LogPctSAT, and PTratio to predict SAT.
There are several things to say about this table.
The regression equation that results from this analysis is now

Ŷ = 1132.033 + 11.665 Expend − 78.393 LogPctSAT + 0.742 PTratio

Notice that Expend and LogPctSAT are still significant (t = 3.302 and −17.293, respectively), but PTratio is far from significant (t = 0.418). This shows us that adding PTratio to our model did not improve our ability to predict. (Even the simple correlation between PTratio and SAT was not significant (r = .081).) You will see two new columns in Table 15.3, labeled Tolerance and VIF (Variance Inflation Factor). When predictor variables are correlated among themselves we have what is called collinearity or multicollinearity. Collinearity has the effect of increasing the standard error of a regression coefficient, which increases the
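The check performed above can be sketched with plain numpy. This uses synthetic data, not the state SAT data, and stand-in variable names (expend, log_pct, pt_ratio): fit the three-predictor model by least squares, compute the standard error of each coefficient, and look at the t value for the added predictor to see whether it contributes anything beyond the predictors already in the model.

```python
import numpy as np

# Sketch (synthetic data): does an added predictor earn its keep?
rng = np.random.default_rng(1)
n = 50
expend = rng.normal(6, 1, n)                        # stand-in predictors
log_pct = rng.normal(1.5, 0.4, n)
pt_ratio = 20 - 2 * expend + rng.normal(0, 1, n)    # overlaps with expend
sat = 1100 + 12 * expend - 80 * log_pct + rng.normal(0, 15, n)  # pt_ratio adds nothing

X = np.column_stack([np.ones(n), expend, log_pct, pt_ratio])
b, *_ = np.linalg.lstsq(X, sat, rcond=None)

resid = sat - X @ b
df = n - X.shape[1]                                 # residual degrees of freedom
mse = resid @ resid / df
se = np.sqrt(mse * np.diag(np.linalg.inv(X.T @ X))) # standard errors of b
t = b / se
print(np.round(t, 2))   # last entry (pt_ratio) is typically near zero here
```

Because pt_ratio was built to carry no information about the criterion beyond what expend already carries, its t value will generally be small, mirroring the PTratio result in Table 15.3.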

Table 15.3  Adding PTratio to the prediction equation

Coefficients(a)

                       Unstandardized          Standardized
                       Coefficients            Coefficients                       Collinearity Statistics
Model                  B          Std. Error   Beta           t         Sig.      Tolerance     VIF
1   (Constant)      1132.033      39.787                      28.452    .000
    Expend            11.665       3.533        .212           3.302    .002       .596         1.679
    LogPctSAT        –78.393       4.533      –1.042         –17.293    .000       .679         1.473
    PTratio             .742       1.774        .022            .418    .678       .854         1.171

a. Dependent Variable: SAT
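The Tolerance and VIF columns of Table 15.3 come from regressing each predictor on the remaining predictors: Tolerance = 1 − R², and VIF = 1/Tolerance. A minimal sketch of that computation, on synthetic data rather than the state data:

```python
import numpy as np

def tolerance_and_vif(X):
    """X: (n, p) matrix of predictors (no intercept column).
    For each predictor, regress it on the others; Tolerance = 1 - R^2,
    VIF = 1 / Tolerance."""
    n, p = X.shape
    out = []
    for j in range(p):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        b, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ b
        ss_tot = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        r2 = 1 - resid @ resid / ss_tot
        tol = 1 - r2
        out.append((tol, 1 / tol))
    return out

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = 0.6 * x1 + rng.normal(size=100)   # correlated with x1 -> VIF above 1
x3 = rng.normal(size=100)              # roughly independent -> VIF near 1
for tol, vif in tolerance_and_vif(np.column_stack([x1, x2, x3])):
    print(f"Tolerance = {tol:.3f}, VIF = {vif:.3f}")
```

A VIF of 1 means a predictor is uncorrelated with the others; larger values quantify how much collinearity inflates the variance of that predictor’s coefficient, which is exactly why the standard errors grow as the text describes.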

