$$X = \begin{array}{c|ccccccc}
 & A_1 & B_1 & B_2 & B_3 & AB_{11} & AB_{12} & AB_{13} \\ \hline
a_1b_1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 \\
a_1b_2 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
a_1b_3 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
a_1b_4 & 1 & -1 & -1 & -1 & -1 & -1 & -1 \\
a_2b_1 & -1 & 1 & 0 & 0 & -1 & 0 & 0 \\
a_2b_2 & -1 & 0 & 1 & 0 & 0 & -1 & 0 \\
a_2b_3 & -1 & 0 & 0 & 1 & 0 & 0 & -1 \\
a_2b_4 & -1 & -1 & -1 & -1 & 1 & 1 & 1
\end{array}$$
The first step in a multiple-regression analysis is presented in Exhibit 16.2 using all
seven predictors ($A_1$ to $AB_{13}$). The results were obtained using SAS PROC CORR and
PROC REG, although every software package should give the same answers.
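The same analysis can be sketched in a few lines of Python (an assumption of convenience; the text's results come from SAS). The scores from Exhibit 16.2 are not reproduced here, so y below is only a random placeholder, but the design matrix is exactly the effect-coded X shown above (4 subjects per cell, N = 32, matching the 7 and 24 df reported later).

    import numpy as np

    a, b, n = 2, 4, 4                 # 2 x 4 factorial, 4 subjects per cell (N = 32)

    def effect_codes(levels):
        # Effect (deviation) coding: levels - 1 columns; the last level is
        # coded -1 on every column.
        return np.vstack([np.eye(levels - 1), -np.ones(levels - 1)])

    A = effect_codes(a)               # codes for a1, a2 on predictor A1
    B = effect_codes(b)               # codes for b1..b4 on predictors B1, B2, B3

    rows = []
    for i in range(a):
        for j in range(b):
            # Cell row: main-effect codes plus their products (the interaction codes).
            rows.append(np.concatenate([A[i], B[j], np.outer(A[i], B[j]).ravel()]))
    X = np.repeat(np.array(rows), n, axis=0)   # 32 x 7: A1, B1, B2, B3, AB11, AB12, AB13

    y = np.random.default_rng(0).normal(size=a * b * n)  # placeholder for the real data

    X1 = np.column_stack([np.ones(len(X)), X])           # prepend the intercept column
    bhat, *_ = np.linalg.lstsq(X1, y, rcond=None)
    print(bhat)   # b0 is the grand mean; b1..b7 estimate the treatment effects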
Exhibit 16.2 has several important features. First, consider the matrix of correlations
among variables, often symbolized as R. Suppose that we simplify this matrix by defining the
following sets of predictors:

$$A' = [A_1], \quad B' = [B_1, B_2, B_3], \quad \text{and} \quad AB' = [AB_{11}, AB_{12}, AB_{13}]$$

If we then rewrite the intercorrelation matrix, we have

$$\begin{array}{c|ccc}
 & A' & B' & AB' \\ \hline
A' & 1.00 & 0.00 & 0.00 \\
B' & 0.00 & 1.00 & 0.00 \\
AB' & 0.00 & 0.00 & 1.00
\end{array}$$
Notice that each of the effects is independent of the others. Such a pattern occurs only
if there are equal (or proportional) numbers of subjects in each cell; this pattern is also what
makes simplified formulae for the analysis of variance possible. The fact that this structure
disappears in the case of unequal $n$s is what makes our life more difficult when we have
missing subjects.
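As a quick check, continuing the Python sketch above, the correlations among the columns of X show exactly this structure with equal cell sizes: every correlation between predictors from different sets is 0, while predictors within a set (e.g., $B_1$ with $B_2$) remain correlated.

    R = np.corrcoef(X, rowvar=False)   # 7 x 7 correlation matrix of the predictors
    print(np.round(R, 2))              # between-set entries are all 0.00;
                                       # within-set entries such as B1-B2 are 0.50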
Next notice the vector b, labeled as the Parameter Estimate. The first entry ($b_0$) is
labeled Intercep and is the grand mean of all of the observations. The subsequent entries
($b_1 \ldots b_7$) are the estimates of the corresponding treatment effects. Thus $b_1 = \hat{\alpha}_1$, $b_2 = \hat{\beta}_1$,
$b_5 = \widehat{\alpha\beta}_{11}$, and so on. Tests on these regression coefficients represent tests on the
corresponding treatment effects. The fact that we have only the $(a-1)(b-1) = 3$ inter-
action effects presents no problem, due to the restrictions that these effects must sum to 0
across rows and down columns. Thus if $\widehat{\alpha\beta}_{12} = -1.34$, then $\widehat{\alpha\beta}_{22}$ must be $+1.34$. Similarly,
$\widehat{\alpha\beta}_{14} = 0 - \sum_{j=1}^{3}\widehat{\alpha\beta}_{1j} = -\sum_{j=1}^{3}\widehat{\alpha\beta}_{1j} = 1.03$.
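A small illustration of that bookkeeping, with hypothetical values for $\widehat{\alpha\beta}_{11}$ and $\widehat{\alpha\beta}_{13}$ chosen only to be consistent with the $-1.34$ and $1.03$ quoted above:

    import numpy as np

    ab_row1 = np.array([0.31, -1.34, 0.00])  # ab11 and ab13 are hypothetical;
                                             # ab12 = -1.34 is from the text
    ab_14 = -ab_row1.sum()                   # row 1 must sum to 0, giving ab14 = 1.03
    row1 = np.append(ab_row1, ab_14)
    row2 = -row1                             # columns must sum to 0 too, so row 2 = -row 1
    print(row1, row2)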
The value of $R^2 = .625$ represents the proportion of variation that can be accounted for
by all the variables simultaneously. With equal $n$s, and therefore independent effects, it is
equivalent to $\eta^2_A + \eta^2_B + \eta^2_{AB} = .014 + .537 + .074 = .625$. The test on $R^2$ produces an
$F$ of 5.711 on 7 and 24 $df$ which, since it is significant ($p = .0006$), shows that there is
a nonchance relationship between the treatment variables, considered together, and the
dependent variable ($Y$).
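That $F$ can be recovered directly from $R^2$ itself; with $p = 7$ predictors, the 7 and 24 $df$ imply $N = 32$ observations, so

$$F = \frac{R^2/p}{(1 - R^2)/(N - p - 1)} = \frac{.625/7}{.375/24} = 5.71,$$

where the small discrepancy from 5.711 reflects rounding of $R^2$.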
Two more parallels can be drawn between Table 16.2, the analysis of variance, and
Exhibit 16.2, the regression analysis. First, notice that $SS_{\text{regression}} = SS_{\text{Model}} = SS_Y(R^2) =
231.969$. This is the variation that can be predicted by a linear combination of the predic-
tors. This value is equal to $SS_A + SS_B + SS_{AB}$, although from Exhibit 16.2 we cannot yet
partition the variation among the separate sources. Finally, notice that $SS_{\text{residual}} = SS_{\text{error}} =
SS_Y(1 - R^2) = 139.250$, which is the error sum of squares in the analysis of variance. This
makes sense when you recall that error is the variation that cannot be attributed to the sepa-
rate or joint effects of the treatment variables.
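Putting the two pieces together recovers both the total variation and $R^2$:

$$SS_Y = SS_{\text{regression}} + SS_{\text{residual}} = 231.969 + 139.250 = 371.219, \qquad R^2 = \frac{231.969}{371.219} = .625$$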