To calculate the variance, we will use Winsorized samples, in which the lowest 8 scores are
replaced with the 9th lowest score and the highest 8 scores are replaced with the 9th highest
score. This leaves us with samples of $n_i = 40$ scores, but only $h_i = 24$ of those are independent observations from the $i$th sample. If we let $s^2_{W_i}$ represent the variance of the Winsorized sample of 40 scores, then the squared standard error of the mean for that sample would be

$$s^2_{\overline{X}_{W_i}} = \frac{(n_i - 1)\,s^2_{W_i}}{h_i(h_i - 1)}$$

and the robust pairwise $t$ test on the difference between two means can be written as

$$t_W = \frac{\overline{Y}_{t_i} - \overline{Y}_{t_j}}{\sqrt{s^2_{\overline{X}_{W_i}} + s^2_{\overline{X}_{W_j}}}}$$
Notice that we are not doing anything very surprising here. We are replacing means with
trimmed means and variances with variances that are based on Winsorized samples, but
with an adjustment to $n_i$ to account for the trimming. Other than that, we have a standard $t$ test, and it can be used as a replacement for the $t$ in any of the procedures we have discussed, or will discuss, in this chapter. There is one complication, however, and that concerns the estimated degrees of freedom. The degrees of freedom are estimated as

$$df_W = \frac{\left(s^2_{\overline{X}_{W_i}} + s^2_{\overline{X}_{W_j}}\right)^2}{\dfrac{\left(s^2_{\overline{X}_{W_i}}\right)^2}{h_i - 1} + \dfrac{\left(s^2_{\overline{X}_{W_j}}\right)^2}{h_j - 1}}$$
That is a messy formula, but not very difficult to work out. As Keselman et al. (2005)
noted, “When researchers feel they are dealing with nonnormal data, they can replace the
usual least squares estimators of central tendency and variability with robust estimators
and apply these estimators in any of the previously recommended” multiple comparison
procedures.
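As a concrete illustration, the calculations described above can be sketched in Python. This is a minimal sketch, not code from the text: the function names (`winsorize`, `trimmed_mean`, `yuen_components`, `yuen_t`) are mine, and it assumes we trim g = 8 scores from each tail of samples of 40, as in the example.

```python
import math

def winsorize(scores, g):
    """Winsorize: replace the g lowest scores with the (g+1)th lowest
    and the g highest scores with the (g+1)th highest."""
    s = sorted(scores)
    n = len(s)
    s[:g] = [s[g]] * g              # raise the bottom g scores
    s[n - g:] = [s[n - g - 1]] * g  # lower the top g scores
    return s

def trimmed_mean(scores, g):
    """Mean after dropping the g lowest and g highest scores."""
    s = sorted(scores)
    kept = s[g:len(s) - g]
    return sum(kept) / len(kept)

def yuen_components(scores, g):
    """Trimmed mean, squared standard error, and h for one sample."""
    n = len(scores)
    h = n - 2 * g                   # number of untrimmed observations
    w = winsorize(scores, g)
    mean_w = sum(w) / n
    s2_w = sum((x - mean_w) ** 2 for x in w) / (n - 1)  # Winsorized variance
    se2 = (n - 1) * s2_w / (h * (h - 1))                # squared standard error
    return trimmed_mean(scores, g), se2, h

def yuen_t(sample_i, sample_j, g):
    """Robust pairwise t on trimmed means, with Welch-style estimated df."""
    mt_i, se2_i, h_i = yuen_components(sample_i, g)
    mt_j, se2_j, h_j = yuen_components(sample_j, g)
    t_w = (mt_i - mt_j) / math.sqrt(se2_i + se2_j)
    df_w = (se2_i + se2_j) ** 2 / (
        se2_i ** 2 / (h_i - 1) + se2_j ** 2 / (h_j - 1)
    )
    return t_w, df_w
```

For two samples of 40 scores with g = 8, each trimmed mean is based on h = 24 observations; when the two Winsorized variances happen to be equal, the estimated df work out to $h_i + h_j - 2$.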
One More Comment
I want to emphasize one more time that the Bonferroni test and its variants are completely
general. They are not the property of the analysis of variance or of any other statistical pro-
cedure. If you have several tests that were carried out by any statistical procedure (and perhaps by different procedures), you can use the Bonferroni approach to control FW. For example,
I recently received an e-mail message in which someone asked how he might go about applying the Bonferroni correction to logistic regression. He would do it the same way he would do it for the analysis of variance: take the set of statistical tests that came from his logistic regression, divide $\alpha$ by the number of tests he ran, and declare a test to be significant only if its resulting probability was less than $\alpha'$. You don't even need to know anything about logistic
regression to do that.
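That logic can be sketched in a few lines of Python. The predictor names and p values below are invented for illustration; the only real ingredients are the familywise $\alpha$ and the count of tests.

```python
alpha = 0.05

# Hypothetical p values from the tests in a logistic regression (or any
# other procedure); the Bonferroni logic does not care where they came from.
p_values = {"predictor_A": 0.003, "predictor_B": 0.020, "predictor_C": 0.041}

# Divide alpha by the number of tests to get the per-test criterion.
alpha_per_test = alpha / len(p_values)

# Declare significant only the tests whose p value falls below it.
significant = {name: p < alpha_per_test for name, p in p_values.items()}
```

With three tests the per-test criterion is 0.05/3 ≈ .0167, so only the first of these hypothetical tests would be declared significant.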
12.4 Confidence Intervals and Effect Sizes for Contrasts
Having run a statistical significance test on the data from an experiment, and looked at in-
dividual comparisons, often called “individual contrasts,” we will generally want to look at
some measure of the amount of difference between group means. In Chapter 11 we saw
that when we have the omnibus $F$, which compares all means together, the most commonly used measure is a member of the $r$-family of measures, such as $\eta^2$ or $\omega^2$. However, when we
384 Chapter 12 Multiple Comparisons Among Treatment Means