Encyclopedia of Sociology

ANALYSIS OF VARIANCE AND COVARIANCE

adequately randomized across groups. As a result,
group differences in outcome scores may be found
and erroneously attributed to the effect of the
experimental stimulus or group condition, when
in fact the differences between groups existed
prior to, or independent of, the presence of the
stimulus or group condition.


One example of this situation is provided by
Roberta Simmons and Dale Blyth (1987). In a
study of the effects of different school systems on
the changing self-esteem of boys and girls as they
make the transition from sixth to seventh grade,
these researchers had to account for the fact that
boys and girls in these different school systems had
different levels of self-esteem in sixth grade. Since
those who score high on a measure at one point in
time (T1) will have a statistical tendency to score
lower at a later time (T2) and vice versa (a negative
relationship between initial score and subsequent
change, known as regression toward the mean), these
initial differences could lead
to erroneous conclusions. In their study, if boys
had higher self-esteem than girls in sixth grade, the
statistical tendency would be for boys to experi-
ence negative change in self-esteem and girls to
experience positive change even though seventh-
grade girls in certain school systems experience
more negative influences on their self-esteem.
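
The statistical tendency described above can be illustrated with a small simulation. This is only a sketch under an assumed model (each score is a stable level plus independent measurement noise); the variable names and the simulated numbers are hypothetical and are not data from Simmons and Blyth.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Assumed model: each observed score = stable true level + independent noise.
# No real change occurs between T1 and T2.
true_level = rng.normal(0.0, 1.0, n)
t1 = true_level + rng.normal(0.0, 1.0, n)
t2 = true_level + rng.normal(0.0, 1.0, n)
change = t2 - t1

# Split respondents by whether they scored above the median at T1.
high = t1 > np.median(t1)
mean_change_high = change[high].mean()   # average change among high T1 scorers
mean_change_low = change[~high].mean()   # average change among low T1 scorers

# Correlation between the initial score and the change score.
corr_t1_change = np.corrcoef(t1, change)[0, 1]

print(mean_change_high, mean_change_low, corr_t1_change)
```

In this setup the high scorers at T1 show negative average change and the low scorers show positive average change, even though nothing acted on either group. This is why initial group differences have to be taken into account before change scores are compared.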


The procedure used in adjusting for covariates
involves a combination of analysis of variance and
linear regression techniques. Prior to comparing
group means or sources of variation, the outcome
scores are adjusted based upon the effect of the
covariate(s). This is done by computing predicted
outcome scores based on the equation:


$$\hat{Y} = a + b_1 X_1 \qquad (8)$$

where $\hat{Y}$ is the predicted outcome score, $a$ is a
constant, and $b_1$ is the linear effect of the covariate
($X_1$) on the outcome score ($Y$). The difference
between the actual score and the predicted score
($Y_{ij} - \hat{Y}_{ij}$) is the residual. These residuals represent
that part of the individuals’ scores that is not
explained by the covariate. It is these residuals that
are then analyzed using the analysis of variance
techniques described above. If the effect of the
covariate is negative, then those who scored high
on the covariate will have their scores adjusted
upward, and those who scored low on the covariate
will have their scores adjusted downward. This
counteracts the negative effect that the covariate
has had. Group differences can then be assessed
after the scores have been corrected.
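
The adjustment procedure described above can be sketched in a few lines of NumPy. The data are simulated and the names (`group`, `x`, `y`, `one_way_f`) are hypothetical. The sketch follows the residual-based description in the text; full analysis of covariance routines typically also give up one degree of freedom per covariate, so the F value here is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_per_group = 200

# Hypothetical data: two groups that differ on the covariate x,
# with no true group effect on the outcome y.
group = np.repeat([0, 1], n_per_group)
x = rng.normal(loc=np.where(group == 0, 0.0, 1.0), scale=1.0)
y = 2.0 + 1.5 * x + rng.normal(0.0, 1.0, group.size)

# Step 1: regress the outcome on the covariate (Y-hat = a + b1 * X1).
b1, a = np.polyfit(x, y, deg=1)
y_hat = a + b1 * x

# Step 2: residuals are the part of each score not explained by the covariate.
residuals = y - y_hat


def one_way_f(values, labels):
    """One-way ANOVA F ratio: between-group over within-group mean square."""
    groups = [values[labels == g] for g in np.unique(labels)]
    grand_mean = values.mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Step 3: compare groups on the raw scores and on the residuals.
f_raw = one_way_f(y, group)
f_adjusted = one_way_f(residuals, group)
print(f_raw, f_adjusted)
```

Because the simulated groups differ only through the covariate, the F ratio on the raw scores is large while the F ratio on the residuals is much smaller, which is the kind of spurious group difference the adjustment is meant to remove.
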
This model can be expanded to include any
number of covariates and is particularly useful
when analyzing the effects of a discrete indepen-
dent variable (e.g., gender, race, etc.) on a continu-
ous outcome variable using survey data, where
other factors cannot be randomly assigned and
where conditions cannot be standardized. In such
situations, other preexisting differences between
groups (often, variables measured on continuous
scales) need to be statistically controlled for. In
these cases, researchers often perform analysis of
variance and analysis of covariance within the
context of what has been termed the “general
linear model.”
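
As a sketch of how the prediction equation given earlier might be written when several covariates are included (the index k and the additional symbols b_2 through b_k and X_2 through X_k are notation assumed here, extending the article's own b_1 and X_1):

```latex
\hat{Y} = a + b_1 X_1 + b_2 X_2 + \dots + b_k X_k
```

The residuals would then be computed exactly as before, as the difference between each actual score and its predicted score.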

GENERAL LINEAR MODEL

The general linear model refers to the application of
the linear regression equation to solve analysis
problems that initially do not meet the assump-
tions of linear regression analysis. Specifically,
there are three situations where the assumptions
of linear regression are violated but regression
techniques can still be used: (1) the use of nominal
level measures (e.g., race, religion, marital status)
as independent variables—a violation of the as-
sumption that all variables be measured at the
interval or ratio level; (2) the existence of interac-
tion effects between independent variables—a vio-
lation of the assumption of additivity of effects;
and (3) the existence of a curvilinear effect of the
independent variable on the dependent variable—
a violation of the assumption of linearity. The
linear regression equation can be applied in all of
these situations provided that certain procedures
and operations on the variables are carried out.
The use of the general linear model for perform-
ing analysis of variance and analysis of covariance
is described in greater detail below.
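
The three operations listed above can be sketched as columns of a single regression design matrix. This is a minimal illustration with simulated data; the variable names and the coefficient values are hypothetical, and ordinary least squares via `numpy.linalg.lstsq` is used here only as one convenient way to fit the equation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300

# Hypothetical data: a three-category nominal variable and a continuous variable.
category = rng.integers(0, 3, n)   # values 0, 1, 2
x = rng.normal(0.0, 1.0, n)

# (1) Nominal measure: two dummy variables, with category 0 as the reference.
d1 = (category == 1).astype(float)
d2 = (category == 2).astype(float)

# (2) Interaction effects: products of each dummy with the continuous variable.
# (3) Curvilinear effect: a squared term for the continuous variable.
design = np.column_stack([np.ones(n), d1, d2, x, d1 * x, d2 * x, x ** 2])

# Simulated outcome built from known (made-up) coefficients plus noise.
true_coefs = np.array([1.0, 0.5, -0.5, 2.0, 1.0, 0.0, 0.3])
y = design @ true_coefs + rng.normal(0.0, 0.5, n)

# Fit the linear regression equation to the transformed variables.
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
print(np.round(coefs, 2))
```

The equation remains linear in its coefficients even though it now contains categorical, interaction, and squared terms, which is what allows regression techniques to be applied in all three situations.
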
Regression with dummy variables. In situa-
tions where the dependent variable is measured at
the interval level of measurement (ordered values
at fixed intervals) but one or more independent
variables are measured at the nominal level (no
order implied between values), analysis of vari-
ance and covariance procedures are usually more
appropriate than linear regression. Linear regres-
sion analysis can be used in these circumstances,
however, as long as the nominal level variables are