Encyclopedia of Sociology

ANALYSIS OF VARIANCE AND COVARIANCE

or group of scores by calculating an average or
mean. The mean score can be thought of as the
score for a ‘‘typical’’ person in the study and can be
used as a reference point for calculating the amount
of differences in scores across all individuals. The
difference between each score and the mean is
analogous to the difference of each score from
every other score. The variation of scores is calculated,
then, by subtracting the mean from each score,
squaring the resulting deviation, and summing the squared
deviations (squaring the deviations before adding
them is necessary because the sum of nonsquared
deviations from the mean will always be 0). A large
sum of squares indicates that the total amount of
deviation of scores from a central point in the
distribution of scores is large. In other words,
there is a great deal of variation in the scores either
because of a few scores that are very different from
the rest or because of many scores that are slightly
different from each other.
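
The calculation just described can be sketched in a few lines of Python. This is only an illustration: the function name `sum_of_squares` and the five scores are hypothetical and do not come from any study discussed here.

```python
def sum_of_squares(scores):
    """Return the sum of squared deviations of the scores from their mean."""
    mean = sum(scores) / len(scores)
    return sum((y - mean) ** 2 for y in scores)


# Hypothetical scores, chosen only for illustration
scores = [2, 4, 4, 6, 9]
mean = sum(scores) / len(scores)              # 5.0

# Nonsquared deviations cancel out, which is why they must be squared
raw_deviations = [y - mean for y in scores]   # [-3.0, -1.0, -1.0, 1.0, 4.0]
print(sum(raw_deviations))                    # 0.0
print(sum_of_squares(scores))                 # 28.0
```

Note that the single score of 9 contributes 16 of the 28 units, which shows how a few very different scores can dominate the total.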


Decomposing the sums of squares. The total
amount of variation in a sample on some outcome
measure is referred to as the total sums of squares
(SSTOTAL). This is a measure of how much the
subjects’ scores on the outcome variable differ from
one another and it represents the phenomenon
that the researcher is trying to explain (e.g., Why
do some seventh graders have high self-esteem,
while others have low or moderate self-esteem?).
The procedure for calculating the total sums of
squares is represented by the following equation:


$$SS_{TOTAL} = \sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2 \qquad (1)$$

where $\sum_i \sum_j$ indicates summing across all individuals
($i$) in all groups ($j$), and $(Y_{ij} - \bar{Y}_{..})^2$ is the squared
difference of the score of each individual ($Y_{ij}$) from
the grand mean of all scores ($\bar{Y}_{..} = \sum_i \sum_j Y_{ij} / N$). In
terms of explaining variance, this is what the
researcher is trying to account for or explain.
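
As a rough illustration of equation 1, the sketch below pools the scores from all groups, computes the grand mean, and sums the squared deviations. The function name `ss_total`, the dictionary layout, the group labels, and the scores are all assumptions made for this example.

```python
def ss_total(groups):
    """Total sums of squares: squared deviations of every score from the grand mean."""
    # Pool the scores of all individuals, regardless of group membership
    all_scores = [y for scores in groups.values() for y in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    return sum((y - grand_mean) ** 2 for y in all_scores)


# Hypothetical self-esteem scores by school type (illustrative only)
groups = {
    "small_school": [7, 8, 9],
    "large_school": [3, 4, 5],
}

print(ss_total(groups))   # 28.0 (the grand mean is 6.0)
```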


The total sums of squares can then be ‘‘decomposed’’
or mathematically divided into two components:
the between-groups sums of squares (SSBETWEEN)
and the within-groups sums of squares (SSWITHIN).
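
The decomposition can be checked numerically. The sketch below assumes the standard definition of the within-groups sums of squares (the squared deviation of each score from its own group mean), which this passage has not yet spelled out. The data and variable names are hypothetical.

```python
# Hypothetical self-esteem scores by school type (illustrative only)
groups = {
    "small_school": [7, 8, 9],
    "large_school": [3, 4, 5],
}

all_scores = [y for scores in groups.values() for y in scores]
grand_mean = sum(all_scores) / len(all_scores)

# Total variation: every score compared with the grand mean
ss_total = sum((y - grand_mean) ** 2 for y in all_scores)

ss_between = 0.0
ss_within = 0.0
for scores in groups.values():
    group_mean = sum(scores) / len(scores)
    # Between: group mean vs. grand mean, weighted by group size
    ss_between += len(scores) * (group_mean - grand_mean) ** 2
    # Within: each score vs. its own group mean (assumed standard definition)
    ss_within += sum((y - group_mean) ** 2 for y in scores)

print(ss_total, ss_between, ss_within)   # 28.0 24.0 4.0
```

With these made-up scores the two components add up to the total (24 + 4 = 28), as the decomposition requires.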


The between-groups sums of squares is a measure
of how much variation in outcome scores
exists between groups. It uses the group mean as
the best single representation of how each individual
in the group scored on the outcome measure.
It essentially assigns the group mean score to
every subject in the group and then calculates how
much total variation there would be from the
grand mean (the average of all scores regardless of
group membership) if there were no variation within
the groups and the only variation came from
cross-group comparisons. For example, in trying
to explain variation in adolescent self-esteem, a
researcher might argue that junior high schools
place children at risk because of schools’ size and
impersonal nature. If the school environment is
the single most powerful factor in shaping adolescent
self-esteem, then when adolescents are compared
to each other in terms of their self-esteem,
the only comparisons that will create differences
will be those occurring between students in different
school types, and all comparisons involving
children in the same school type will yield no
difference. The procedure for calculating the
between-groups sums of squares is represented by
the following equation:

$$SS_{BETWEEN} = \sum_j N_j (\bar{Y}_{.j} - \bar{Y}_{..})^2 \qquad (2)$$

where $\sum_j$ indicates summing across all groups ($j$), and
$N_j (\bar{Y}_{.j} - \bar{Y}_{..})^2$ is the number of subjects in each
group ($N_j$) times the squared difference between the mean
of each group ($\bar{Y}_{.j}$) and the grand mean ($\bar{Y}_{..}$).
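
Equation 2 can be sketched as follows, along with two limiting cases: groups with identical means, and groups with no internal variation. The function name `ss_between`, the group labels, and all scores are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def ss_between(groups):
    """Between-groups sums of squares: N_j times the squared gap between
    each group mean and the grand mean, summed over groups."""
    n_total = sum(len(scores) for scores in groups.values())
    grand_mean = sum(sum(scores) for scores in groups.values()) / n_total
    return sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in groups.values()
    )


# Case 1 (hypothetical data): both group means equal the grand mean of 6
same_means = {"a": [4, 6, 8], "b": [5, 6, 7]}
print(ss_between(same_means))   # 0.0

# Case 2 (hypothetical data): no variation inside either group
no_within = {"a": [8, 8, 8], "b": [4, 4, 4]}
pooled = [y for scores in no_within.values() for y in scores]
grand_mean = sum(pooled) / len(pooled)
total = sum((y - grand_mean) ** 2 for y in pooled)
print(ss_between(no_within), total)   # 24.0 24.0
```

In the second case every score equals its group mean, so the between-groups sums of squares accounts for all of the variation in the scores.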

In terms of the comparison of means, the
between-groups sums of squares directly reflects
the difference between the group means. If there
is no difference between the group means, then
the group means will be equal to the grand mean
and the between-groups sums of squares will be 0.
If the group means are different from one another,
then they will also differ from the grand mean
and the magnitude of this difference will be
reflected in the between-groups sums of squares. In
terms of explaining variance, the between-groups
sums of squares represents only those differences
in scores that come about because the individuals
in one group are compared to individuals in a
different group (e.g., What if all students in a given
type of school had the same level of self-esteem,
but students in a different school type had different
levels?). By multiplying the squared difference
between each group mean and the grand mean by
the number of subjects in the group,
this component of the total variance assumes that
there is no other source of influence on the scores
(i.e., that the variance within groups is 0). If this
assumption is true, then the SSBETWEEN will be
equal to the SSTOTAL and the group effect could be