The output here looks like what we computed. You would get the same general results
if you had selected Analyze/General Linear Model/Univariate from the menus, although
the summary table would contain additional lines of information that I won’t discuss until
the end of this chapter.
11.7 Unequal Sample Sizes
Most experiments are originally designed with the idea of collecting the same number of
observations in each treatment. (Such designs are generally known as balanced designs.)
Frequently, however, things do not work out that way. Subjects fail to arrive for testing, or
are eliminated because they fail to follow instructions. Animals occasionally become ill
during an experiment from causes that have nothing to do with the treatment. I still recall
an example first seen in graduate school in which an animal was eliminated from the study
for repeatedly biting the experimenter (Sgro & Weinstock, 1963). Moreover, studies
conducted on intact groups, such as school classes, have to contend with the fact that such
groups nearly always vary in size.
If the sample sizes are not equal, the analysis discussed earlier needs to be modified. For
the case of one independent variable, however, this modification is relatively minor.
(A much more complete discussion of the treatment of missing data for a variety of analysis
of variance and regression designs can be found in Howell (2008), or, in slightly simpler
form, at http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Missing_Data/Missing.html.)
Earlier we defined
$$SS_{treat} = n\sum(\bar{X}_j - \bar{X}_{..})^2$$
We were able to multiply the deviations by $n$, because $n$ was common to all treatments. If
the sample sizes differ, however, and we define $n_j$ as the number of subjects in the $j$th treatment
($\sum n_j = N$), we can rewrite the expression as
$$SS_{treat} = \sum\left[n_j(\bar{X}_j - \bar{X}_{..})^2\right]$$
which, when all $n_j$ are equal, reduces to the original equation. This expression shows us
that with unequal $n$s, the deviation of each treatment mean from the grand mean is
weighted by the sample size. Thus, the larger the size of one sample relative to the others,
the more it will contribute to $SS_{treat}$, all other things being equal.
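To make the weighting concrete, here is a minimal Python sketch of the two versions of the treatment sum of squares. The function names and the scores are hypothetical, made up purely for illustration; they are not data from this chapter. The sketch assumes that the grand mean is the mean of all observations, which with unequal sample sizes is not the same as the simple average of the group means.

```python
from statistics import mean

def ss_treat(groups):
    """Weighted form: sum over treatments of n_j * (group mean - grand mean)^2."""
    # Grand mean of ALL observations (not the unweighted mean of group means).
    grand_mean = mean([x for g in groups for x in g])
    return sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

def ss_treat_equal_n(groups):
    """Equal-n form: n * sum of (group mean - grand mean)^2.
    Only meaningful when every group has the same number of scores."""
    n = len(groups[0])
    grand_mean = mean([x for g in groups for x in g])
    return n * sum((mean(g) - grand_mean) ** 2 for g in groups)

# Hypothetical scores, invented for illustration only.
balanced = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
unequal = [[1, 2, 3], [4, 5, 6, 7], [8, 9]]

print(ss_treat(balanced), ss_treat_equal_n(balanced))  # the two forms agree
print(ss_treat(unequal))  # each squared deviation is weighted by its own n_j
```

With the balanced scores the two functions return the same value, which is the sense in which the weighted expression reduces to the original equation when all sample sizes are equal.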
Effective Therapies for Anorexia
The following example is taken from a study by Everitt that compared the effects of two
therapy conditions and a control condition on weight gain in anorexic girls. The data are
reported in Hand et al. (1994). Everitt used a control condition that received no intervention,
a cognitive-behavioral treatment condition, and a family therapy condition. The dependent
variable analyzed here was the gain in weight over a fixed period of time. The data are
given in Table 11.5 and plotted in Figure 11.3. Although there is some tendency for the
Cognitive-behavior therapy group to be bimodal, that tendency is probably not sufficient to
distort our results. (A nonparametric test [see Chapter 18] that is not influenced by that
bimodality produces similar results.)
The computation of the analysis of variance follows, and you can see that the change
required by the presence of unequal sample sizes is minor. I should hasten to point out that
unequal sample sizes will not be so easily dismissed when we come to more complex designs,
but there is no particular difficulty with the one-way design.
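As a rough sketch of how little changes with unequal sample sizes, the following Python function carries out a one-way analysis of variance for groups of any size. It assumes the usual one-way layout: the error sum of squares is the sum of squared deviations of scores about their own group mean, with $k - 1$ and $N - k$ degrees of freedom for treatments and error. The function name and the three short lists of gain scores are hypothetical stand-ins; they are not the values in Table 11.5.

```python
from statistics import mean

def one_way_anova(groups):
    """One-way ANOVA summary for groups of possibly unequal size."""
    k = len(groups)                   # number of treatments
    N = sum(len(g) for g in groups)   # total number of observations
    grand_mean = mean([x for g in groups for x in g])

    # Treatment SS: each squared deviation is weighted by its own n_j.
    ss_treat = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error SS: deviations of each score about its own group mean.
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_treat, df_error = k - 1, N - k
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "SS_treat": ss_treat, "SS_error": ss_error,
        "df_treat": df_treat, "df_error": df_error,
        "MS_treat": ms_treat, "MS_error": ms_error,
        "F": ms_treat / ms_error,
    }

# Hypothetical scores for three conditions with 3, 4, and 2 subjects.
hypothetical_gains = [[1, 2, 3], [4, 5, 6, 7], [8, 9]]
for name, value in one_way_anova(hypothetical_gains).items():
    print(f"{name}: {value}")
```

The only place the unequal sample sizes enter is the `len(g)` weight in the treatment sum of squares and the total $N$ used for the error degrees of freedom; everything else is computed exactly as in the equal-$n$ case.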