many tests increases the probability that one or more of the comparisons will
result in a type I error (i.e., a significant test result when the null hypothesis is
true). This statement should make sense intuitively. For example, suppose that
the null hypothesis is true and we perform 100 tests, each with a 0.05
probability of resulting in a type I error; then, on average, 5 of these 100
tests would be statistically significant as the result of type I errors. Of
course, we usually do not need to do that many tests; however, every time we do
more than one, the probability that at least one will result in a type I error
exceeds 0.05, indicating a falsely significant difference! What is needed is a
different way to summarize the differences between several means and a method
of simultaneously comparing these means in one step. This method is called
ANOVA or one-way ANOVA, an abbreviation of analysis of variance.
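
The growth of the error probability with the number of tests can be illustrated with a short Python sketch. It assumes the tests are independent, which is an assumption made here for illustration only (pairwise comparisons among several means share data and are not independent); the function names are hypothetical.

```python
# Chance of at least one type I error among m tests, each run at level alpha.
# Assumption (not from the text): the m tests are independent of one another.

def familywise_error_rate(m: int, alpha: float = 0.05) -> float:
    """Return P(at least one type I error) for m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def expected_false_positives(m: int, alpha: float = 0.05) -> float:
    """Return the expected number of type I errors among m tests."""
    return m * alpha


# Tabulate both quantities for a few choices of m.
for m in (1, 2, 3, 10, 100):
    print(m, round(familywise_error_rate(m), 4), expected_false_positives(m))
```

Under this independence assumption the probability already exceeds 0.05 with two tests, and the expected count for 100 tests matches the figure of 5 quoted above.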
We have continuous measurements $X$'s from $k$ independent samples; the
sample sizes may or may not be equal. We assume that these are samples from
$k$ normal distributions with a common variance $\sigma^2$, but the means,
$\mu_i$'s, may or may not be the same. The case where we apply the two-sample
$t$ test is a special case of this one-way ANOVA model with $k = 2$. Data from
the $i$th sample can be summarized into sample size $n_i$, sample mean
$\bar{x}_i$, and sample variance $s_i^2$. If we pool data together, the (grand)
mean of this combined sample can be calculated from
$$\bar{x} = \frac{\sum n_i \bar{x}_i}{\sum n_i}$$
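
As a minimal sketch of this formula, the Python snippet below computes the grand mean from the per-sample summaries and compares it with the mean of the pooled observations. The data values and the helper name `grand_mean_from_summaries` are invented for illustration and do not come from the text.

```python
from statistics import mean

# Hypothetical data for k = 3 samples of unequal size (illustrative values only).
samples = [
    [4.0, 6.0, 8.0],
    [5.0, 7.0],
    [9.0, 10.0, 11.0, 14.0],
]


def grand_mean_from_summaries(sizes, means):
    """Weighted mean of the sample means, using the sample sizes as weights."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


sizes = [len(s) for s in samples]    # n_i
means = [mean(s) for s in samples]   # sample means

grand_from_summaries = grand_mean_from_summaries(sizes, means)
grand_from_pooled = mean([x for s in samples for x in s])

print(grand_from_summaries, grand_from_pooled)
```

The two results agree up to floating-point rounding, which is why the summaries alone are enough to recover the grand mean.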
In that combined sample of size $n = \sum n_i$, the variation in $X$ is measured
conventionally in terms of the deviations $(x_{ij} - \bar{x})$ (where $x_{ij}$
is the $j$th measurement from the $i$th sample); the total variation, denoted by
SST, is the sum of squared deviations:
$$\mathrm{SST} = \sum_{i,j} (x_{ij} - \bar{x})^2$$
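
The total variation can be computed directly from this definition. The following Python sketch uses invented data and a hypothetical helper, `total_sum_of_squares`; as a sanity check it compares the result with $(n - 1)$ times the sample variance of the pooled data, which has the same numerator.

```python
from statistics import mean, variance

# Hypothetical data for k = 3 samples (illustrative values only).
samples = [
    [4.0, 6.0, 8.0],
    [5.0, 7.0],
    [9.0, 10.0, 11.0, 14.0],
]


def total_sum_of_squares(samples):
    """Sum of squared deviations of every observation from the grand mean."""
    pooled = [x for s in samples for x in s]
    grand = mean(pooled)
    return sum((x - grand) ** 2 for x in pooled)


sst = total_sum_of_squares(samples)

# Cross-check against the pooled sample variance (denominator n - 1).
pooled = [x for s in samples for x in s]
sst_from_variance = (len(pooled) - 1) * variance(pooled)

print(sst, sst_from_variance)
```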
For example, SST $= 0$ when all observed values $x_{ij}$ are the same; SST is the
numerator of the sample variance of the combined sample: the higher the SST
value, the greater the variation among all $X$ values. The total variation in the
combined sample can be decomposed into two components:
$$x_{ij} - \bar{x} = (x_{ij} - \bar{x}_i) + (\bar{x}_i - \bar{x})$$
- The first term reflects the variation within the $i$th sample; the sum
$$\mathrm{SSW} = \sum_{i,j} (x_{ij} - \bar{x}_i)^2 = \sum_i (n_i - 1) s_i^2$$
is called the within sum of squares.
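
The equality of the two expressions for SSW can be checked numerically. This is a minimal Python sketch with invented data; the helper names are hypothetical, and each sample is assumed to contain at least two observations so that its sample variance is defined.

```python
from statistics import mean, variance

# Hypothetical data for k = 3 samples (illustrative values only).
samples = [
    [4.0, 6.0, 8.0],
    [5.0, 7.0],
    [9.0, 10.0, 11.0, 14.0],
]


def within_ss_from_data(samples):
    """Sum of squared deviations of each observation from its own sample mean."""
    total = 0.0
    for s in samples:
        m = mean(s)
        total += sum((x - m) ** 2 for x in s)
    return total


def within_ss_from_summaries(sizes, variances):
    """Sum over samples of (n_i - 1) times the sample variance."""
    return sum((n - 1) * v for n, v in zip(sizes, variances))


ssw_data = within_ss_from_data(samples)
ssw_summaries = within_ss_from_summaries(
    [len(s) for s in samples],
    [variance(s) for s in samples],  # sample variances, denominator n_i - 1
)

print(ssw_data, ssw_summaries)
```

Both routes give the same value, so SSW can be obtained from the sample sizes and sample variances without access to the raw observations.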