Key Terms
If we have available measures on the covariate and are free to assign subjects to treatment
groups, then we can form subsets of subjects who are homogeneous with respect to
the covariate, and then assign the members of each subset to different treatment groups,
one member per group. In the analysis of variance, we can then pull out an effect due to
blocks (subsets) from the error term.
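The matching procedure just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `matched_assignment`, the covariate values, and the treatment labels are all hypothetical, and the sketch assumes the number of subjects is an exact multiple of the number of treatments.

```python
import random


def matched_assignment(covariates, treatments, seed=0):
    """Form blocks of adjacent covariate scores, then randomize within each block.

    Returns (blocks, assignment), where blocks is a list of lists of subject
    indices and assignment maps each subject index to a treatment label.
    """
    k = len(treatments)
    if len(covariates) % k != 0:
        raise ValueError("number of subjects must be a multiple of the number of treatments")
    rng = random.Random(seed)
    # Sorting on the covariate makes neighbouring subjects as similar as possible.
    order = sorted(range(len(covariates)), key=lambda i: covariates[i])
    blocks, assignment = [], {}
    for start in range(0, len(order), k):
        block = order[start:start + k]
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # random assignment within the block
        for subject, treatment in zip(block, shuffled):
            assignment[subject] = treatment
        blocks.append(block)
    return blocks, assignment


# Hypothetical covariate scores for six subjects and two treatment groups.
scores = [98, 120, 101, 133, 87, 115]
blocks, assignment = matched_assignment(scores, ["control", "drug"], seed=1)
for block in blocks:
    print([(subject, scores[subject], assignment[subject]) for subject in block])
```

Each block then enters the analysis of variance as one level of a blocking factor, which is what allows the block effect to be removed from the error term.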
The use of matched samples and the analysis of covariance are almost equally effective
when the regression of Y on C is linear. If $\rho$ equals the correlation in the population
between Y and C, and $\sigma_e^2$ represents the error variance in a straight analysis of variance on
Y, then the use of matched samples reduces the error variance to

$$\sigma_e^2(1 - \rho^2)$$

The reduction due to the analysis of covariance in this situation is given by

$$\sigma_e^2(1 - \rho^2)\,\frac{f_e}{f_e - 1}$$

where $f_e$ is the degrees of freedom for the error variance. Obviously, for any reasonable
value of $f_e$, the two procedures are almost equally effective, assuming linearity of regression.
If the relationship between Y and C is not linear, however, matching will be more
effective than covariance analysis.
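A quick numerical check makes the "almost equally effective" claim concrete. The sketch below takes the two expressions as $\sigma_e^2(1 - \rho^2)$ for matching and $\sigma_e^2(1 - \rho^2)\,f_e/(f_e - 1)$ for covariance analysis; the function names and the input values are hypothetical and chosen only for illustration.

```python
def matched_error_variance(sigma2_e, rho):
    """Error variance after matching: sigma_e^2 * (1 - rho^2)."""
    return sigma2_e * (1.0 - rho ** 2)


def ancova_error_variance(sigma2_e, rho, f_e):
    """Error variance after covariance adjustment, with the f_e / (f_e - 1) factor."""
    if f_e <= 1:
        raise ValueError("f_e must exceed 1")
    return matched_error_variance(sigma2_e, rho) * f_e / (f_e - 1)


# Hypothetical inputs: unadjusted error variance of 100 and rho = .60.
sigma2_e, rho = 100.0, 0.60
for f_e in (5, 20, 42, 100):
    matched = matched_error_variance(sigma2_e, rho)
    ancova = ancova_error_variance(sigma2_e, rho, f_e)
    print(f"f_e={f_e:>3}  matched={matched:.2f}  ancova={ancova:.2f}  "
          f"ratio={ancova / matched:.3f}")
```

The ratio of the two error variances is simply $f_e/(f_e - 1)$, so it approaches 1 as the error degrees of freedom grow.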
A second alternative to the analysis of covariance concerns the use of difference
scores. If the covariate (C) represents a test score before the treatment is administered and
Y a score on the same test after the administration of the treatment, the variable $C - Y$ is
sometimes used as the dependent variable in an analysis of variance to control for initial
differences on C. Obviously, this approach will work only if C and Y are comparable measures.
We could hardly justify subtracting a driving test score (Y) from an IQ score (C). If
the relationship between C and Y is linear and if $b_{CY} = 1.00$, which is rarely true, the analysis
of difference scores and the analysis of covariance will give the same estimates of the
treatment effects. When $b_{CY}$ is not equal to 1, the two methods will produce different results,
and in this case it is difficult to justify the use of difference scores. In fact, for the
Conti and Musty (1984) data on THC, if we took the difference between the Pre and Post
scores as our dependent variable, the results would be decidedly altered ($F_{4,42} = 0.197$).
In this case, the analysis of covariance was clearly a more powerful procedure. Exercise
16.24 at the end of the chapter illustrates this view of the analysis of covariance. For a more
complete treatment of this entire problem, see Harris (1963) and Huitema (1980, 2005).
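One way to see why the slope matters is to note that a difference-score analysis amounts to a covariance adjustment with the slope fixed at 1. The sketch below compares the two treatment-effect estimates for two groups, using the pooled within-group slope for the covariance adjustment. The data are made up and noise-free, the function names are hypothetical, and the difference is taken as Y minus C (the reverse of $C - Y$ changes only the sign).

```python
from statistics import mean


def pooled_within_slope(groups):
    """Pooled within-group slope of post (Y) on pre (C); groups is a list of (pre, post)."""
    sxy = sxx = 0.0
    for pre, post in groups:
        c_bar, y_bar = mean(pre), mean(post)
        sxy += sum((c - c_bar) * (y - y_bar) for c, y in zip(pre, post))
        sxx += sum((c - c_bar) ** 2 for c in pre)
    return sxy / sxx


def treatment_effects(group_a, group_b):
    """Return (slope, covariance-adjusted effect, difference-score effect) for A minus B."""
    b_w = pooled_within_slope([group_a, group_b])
    d_post = mean(group_a[1]) - mean(group_b[1])
    d_pre = mean(group_a[0]) - mean(group_b[0])
    adjusted = d_post - b_w * d_pre   # covariance adjustment with the estimated slope
    gain = d_post - 1.0 * d_pre       # difference scores: slope forced to 1
    return b_w, adjusted, gain


# Made-up data with slope 1: both approaches give the same estimate.
slope_one = treatment_effects(([1, 2, 3, 4, 5], [3, 4, 5, 6, 7]),
                              ([2, 3, 4, 5, 6], [3, 4, 5, 6, 7]))
# Made-up data with slope 0.5: the estimates diverge.
slope_half = treatment_effects(([1, 2, 3, 4, 5], [3.0, 3.5, 4.0, 4.5, 5.0]),
                               ([2, 3, 4, 5, 6], [2.5, 3.0, 3.5, 4.0, 4.5]))
print(slope_one)
print(slope_half)
```

In the second data set the groups differ on the pretest, so forcing the slope to 1 over-corrects for that initial difference and the two estimates no longer agree.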
The thing to keep in mind here is that a slope of one on the relationship between pre-
and post-test scores implies that the intervention led to a similar increase in scores,
regardless of where people started. But it might be that the change is proportional to where
people started out. Someone who is very poor in math may have much more to gain by an
intervention program than someone who was doing well, and thus the gain score will be
directly (and negatively) related to the pretest score. In the example from Conti and Musty
(1984), more active animals were likely to change more than less active animals, which
may be why they took as their dependent variable the posttest score as a percentage of the
pretest score, rather than just the difference between their two scores.
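The contrast between the two dependent variables can be shown with a small arithmetic sketch. The numbers below are invented for illustration and are not the Conti and Musty (1984) data; the function names are likewise hypothetical.

```python
def difference_scores(pre, post):
    """Raw change for each subject: posttest minus pretest."""
    return [y - c for c, y in zip(pre, post)]


def percent_of_pretest(pre, post):
    """Posttest expressed as a percentage of the pretest score."""
    return [100.0 * y / c for c, y in zip(pre, post)]


# Invented activity scores: every animal drops to half of its own baseline.
pre = [10, 20, 40]
post = [5, 10, 20]
diffs = difference_scores(pre, post)
percents = percent_of_pretest(pre, post)
print(diffs)
print(percents)
```

When change is proportional to the starting level, the raw differences grow with the pretest score while the percentage measure is the same for every subject, which is the situation the percentage score is meant to handle.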
622 Chapter 16 Analyses of Variance and Covariance as General Linear Models
General linear model (16.1)
Design matrix (16.1)
Method III (16.4)
Method II (16.4)
Method I (16.4)
Hierarchical sums of squares (16.4)
Sequential sums of squares (16.4)
Analysis of covariance (16.5)
Covariate (16.5)
difference scores