From Appendix q we find that with 35 df for $MS_{error}$ and r set at 5, the critical
value of q equals 4.07. If we use r = 5 for all comparisons, we can calculate the
minimal difference we will need between means for the difference to be declared significant:

$$\bar{X}_i - \bar{X}_j = q_{.05}(r, df)\sqrt{\frac{MS_{error}}{n}} = 4.07\sqrt{\frac{32}{8}} = 8.14$$
Thus, we declare all mean differences ($\bar{X}_i - \bar{X}_j$) to be significant if they exceed 8.14 and
to be not significant if they are less than 8.14. For our data, the difference between
$\bar{X}_{M-S}$ and $\bar{X}_{M-M}$ is 10 − 4 = 6, and the difference between $\bar{X}_{M-S}$ and $\bar{X}_{S-S}$ is 11 − 4 = 7. The
Tukey HSD test would declare them not significant because 6 and 7 are less than 8.14. The
difference between M-S and S-M is 20, and the difference between M-S and Mc-M is 25, both of
which exceed 8.14 and are therefore significant. Thus M-S is not significantly different from
M-M or from S-S, but it does differ from S-M and Mc-M. Nor, of course, is the difference
between M-M and S-S, which is only 1, significant. Therefore the
first three means (M-S, M-M, and S-S) form a homogeneous set, which is different from
S-M and Mc-M. Furthermore, S-M differs from Mc-M by 5 points, which again is not
significant, yielding another homogeneous set. We can write these as
(M-S = M-M = S-S) ≠ (S-M = Mc-M)
The equal signs indicate simply that we could not reject the null hypothesis of equality, not
that we have proven the means to be equal.
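The arithmetic of this worked example can be sketched in a few lines of Python. This is only an illustration, not part of the original analysis: the function names are hypothetical, the value of q is entered by hand from the table rather than computed, and the means for S-M (24) and Mc-M (29) are inferred from the stated differences of 20 and 25 from the M-S mean of 4.

```python
from itertools import combinations
from math import sqrt

# Values taken from the worked example in the text.
Q_CRIT = 4.07      # q_.05(r = 5, df = 35), read from the table of q
MS_ERROR = 32.0    # error mean square from the overall analysis
N_PER_GROUP = 8    # equal sample size in every group

# Group means. M-S, M-M and S-S follow from "10 - 4" and "11 - 4";
# S-M and Mc-M are inferred from the stated differences of 20 and 25.
means = {"M-S": 4, "M-M": 10, "S-S": 11, "S-M": 24, "Mc-M": 29}


def tukey_critical_difference(q, ms_error, n):
    """Smallest mean difference declared significant: q * sqrt(MS_error / n)."""
    return q * sqrt(ms_error / n)


def pairwise_decisions(group_means, critical):
    """Map each pair of groups to True if |difference| exceeds the critical value."""
    return {
        (a, b): abs(group_means[a] - group_means[b]) > critical
        for a, b in combinations(group_means, 2)
    }


critical = tukey_critical_difference(Q_CRIT, MS_ERROR, N_PER_GROUP)
decisions = pairwise_decisions(means, critical)

print(f"critical difference = {critical:.2f}")
for (a, b), significant in decisions.items():
    label = "significant" if significant else "not significant"
    print(f"{a} vs {b}: |diff| = {abs(means[a] - means[b])} -> {label}")
```

The four pairs flagged as not significant are exactly those within the two homogeneous sets described above.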
Unequal Sample Sizes and Heterogeneity of Variance
The Tukey procedure was designed primarily for the case of equal sample sizes
($n_1 = n_2 = \cdots = n_k = n$). Frequently, however, experiments do not work out as planned,
and we find ourselves with unequal numbers of observations and still want to carry out a
comparison of means. A good bit of work has been done on this problem with respect to
the Tukey HSD test (see particularly Games and Howell, 1976; Keselman and Rogan,
1977; Games, Keselman, and Rogan, 1981).
One solution, known as the Tukey–Kramer approach, is to replace $\sqrt{MS_{error}/n}$ with
$\sqrt{\dfrac{MS_{error}/n_i + MS_{error}/n_j}{2}}$
and otherwise conduct the test the same way you would if the sample sizes were equal.
This is the default solution with SPSS.
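As a rough sketch of the Tukey–Kramer substitution, the function below computes the critical difference for one pair of groups. The function name and the unequal sample sizes (8 and 4) are hypothetical, and q is held at 4.07 purely for illustration; in a real unequal-n design the error df, and hence q, would have to be looked up again.

```python
from math import sqrt


def tukey_kramer_critical_difference(q, ms_error, n_i, n_j):
    """Critical difference with sqrt(MS_error / n) replaced by
    sqrt((MS_error / n_i + MS_error / n_j) / 2)."""
    return q * sqrt((ms_error / n_i + ms_error / n_j) / 2)


# With equal sample sizes the result matches the ordinary Tukey value.
equal_n = tukey_kramer_critical_difference(4.07, 32.0, 8, 8)

# Hypothetical unequal sizes: the smaller group widens the critical difference.
unequal_n = tukey_kramer_critical_difference(4.07, 32.0, 8, 4)

print(f"equal n:   {equal_n:.2f}")
print(f"unequal n: {unequal_n:.2f}")
```

When the two sample sizes are equal, the expression under the radical reduces to $MS_{error}/n$, so the procedure agrees with the equal-n Tukey test.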
An alternative, and generally preferable, test was proposed by Games and Howell
(1976). The Games and Howell procedure uses what was referred to as the Behrens–Fisher
approach to t tests in Chapter 7. The authors suggest that a critical difference between
means (i.e., $W_r$) be calculated separately for every pair of means using
$$W_r = \bar{X}_i - \bar{X}_j = q_{.05}(r, df')\sqrt{\frac{\dfrac{s_i^2}{n_i} + \dfrac{s_j^2}{n_j}}{2}}$$
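The Games and Howell critical difference for a single pair of means can be sketched as follows. The function name and all numerical inputs are hypothetical, chosen only to make the arithmetic easy to follow. The value of q must be supplied by the reader for the appropriate adjusted df, which this sketch does not compute.

```python
from math import sqrt


def games_howell_critical_difference(q, var_i, n_i, var_j, n_j):
    """W_r = q * sqrt((s_i^2 / n_i + s_j^2 / n_j) / 2), using each group's
    own variance in place of the pooled MS_error."""
    return q * sqrt((var_i / n_i + var_j / n_j) / 2)


# Hypothetical pair of groups with unequal variances and sample sizes.
# q = 4.20 is a made-up critical value standing in for q_.05(r, df').
w_r = games_howell_critical_difference(q=4.20, var_i=18.0, n_i=6, var_j=50.0, n_j=10)
print(f"W_r = {w_r:.2f}")

# If both variances equal MS_error and the sample sizes are equal,
# the expression reduces to the ordinary Tukey critical difference.
pooled_case = games_howell_critical_difference(4.07, 32.0, 8, 32.0, 8)
print(f"pooled case = {pooled_case:.2f}")
```

Because the width is computed separately for every pair, groups with larger variances or fewer observations require a larger difference before it is declared significant.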
392 Chapter 12 Multiple Comparisons Among Treatment Means