Statistical Methods for Psychology


Figure 15.1 Histograms, Q-Q plots, and scatter plots of the variables used in this example


15.1 Multiple Linear Regression

[Figure 15.1 panels: histograms (frequency) and normal Q-Q plots (sample quantiles against theoretical quantiles) of Expend, P/T ratio, PctSAT, and combined SAT; scatterplots of SAT against Expend, P/T ratio, PctSAT, and log(PctSAT).]

the ACT unless they are applying to prestigious schools on either coast, such as Harvard,
Princeton, Berkeley, or Stanford. This is certainly an overly sweeping generalization, but
it will become important shortly.
Before we consider the regression solution itself, we need to look at the distribution of
each variable. These are shown for several variables as histograms, Q-Q plots, and
scatterplots in Figure 15.1. It is clear from these plots that not all of our variables are
normally distributed. The criterion variable and three of the predictors are fairly well
distributed, but the distribution of the percentage of students taking the SAT is
definitely bimodal, reflecting the fact that each test is taken either by most students in a
state or by few. In addition, the relationship between PctSAT and SAT score is curvilinear,
in part reflecting that bimodality. The distribution becomes slightly better when we take a
log_e (natural log) transformation of PctSAT, and its relationship with SAT becomes more
linear. The scatterplot of SAT against the transformed variable is shown in the lower right
of the figure. We will make use of this transformed variable instead of PctSAT itself
because it makes an important point, though its distribution is still decidedly bimodal.
The combined SAT score shows a wide distribution.
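
To see why a log transformation can straighten a curvilinear relationship, consider the minimal Python sketch below. The numbers are invented for illustration and are not the state data discussed in the text; the hypothetical scores are constructed to fall in proportion to the natural log of the percentage tested, so the correlation with the transformed predictor is, by design, stronger than the correlation with the raw one. The helper name pearson_r and both lists are assumptions introduced only for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up, roughly bimodal percentages (NOT the data from the text).
pct_sat = [4, 5, 6, 8, 9, 12, 30, 48, 60, 65, 70, 80]

# Hypothetical scores built to decline linearly in log(percentage).
sat = [1100 - 60 * math.log(p) for p in pct_sat]

# Natural-log (log_e) transformation of the predictor.
log_pct_sat = [math.log(p) for p in pct_sat]

r_raw = pearson_r(pct_sat, sat)
r_log = pearson_r(log_pct_sat, sat)

print(f"r(PctSAT, SAT)     = {r_raw:.3f}")
print(f"r(log PctSAT, SAT) = {r_log:.3f}")
```

Because the invented scores are an exact linear function of the logged predictor, the second correlation is perfect here; with real data the improvement would be more modest, and the transformed variable could remain bimodal, as the text notes.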