Statistical Methods for Psychology

The Mean
Of the three principal measures of central tendency, the mean is by far the most common.
It would not be too much of an exaggeration to say that for many people statistics is nearly
synonymous with the study of the mean.
As we have already seen, certain disadvantages are associated with the mean: It is
influenced by extreme scores, its value may not actually exist in the data, and its
interpretation in terms of the underlying variable being measured requires at least some faith in the
interval properties of the data. You might be inclined to politely suggest that if the mean
has all the disadvantages I have just ascribed to it, then maybe it should be quietly
forgotten and allowed to slip into oblivion along with statistics like the “critical ratio,” a
statistical concept that hasn’t been heard of for years. The mean, however, is made of sterner stuff.
The mean has several important advantages that far outweigh its disadvantages.
Probably the most important of these from a historical point of view (though not necessarily from
your point of view) is that the mean can be manipulated algebraically. In other words, we
can use the mean in an equation and manipulate it through the normal rules of algebra,
specifically because we can write an equation that defines the mean. Because you cannot
write a standard equation for the mode or the median, you have no real way of
manipulating those statistics using standard algebra. Whatever the mean’s faults, this accounts in
large part for its widespread application. The second important advantage of the mean is
that it has several desirable properties with respect to its use as an estimate of the
population mean. In particular, if we drew many samples from some population, the sample
means that resulted would be more stable (less variable) estimates of the central tendency
of that population than would the sample medians or modes. The fact that the sample mean
is generally a better estimate of the population mean than is the mode or the median is a
major reason that it is so widely used.
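
The claim about stability can be checked with a small simulation. The sketch below is only an illustration under assumptions of my own choosing: the population is taken to be normal with a mean of 100 and a standard deviation of 15, each sample has 25 scores, and 2,000 samples are drawn. None of those numbers comes from the text. The mode is left out because it is poorly defined for continuous scores. For each sample we record the mean and the median, and then ask how much each statistic varies from sample to sample.

```python
import random
import statistics


def sampling_spread(n_samples=2000, sample_size=25, seed=1):
    """Return (SD of sample means, SD of sample medians) over repeated samples.

    The population (normal, mean 100, SD 15) is an assumption made
    purely for illustration.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(n_samples):
        sample = [rng.gauss(100, 15) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return statistics.stdev(means), statistics.stdev(medians)


sd_of_means, sd_of_medians = sampling_spread()
print(f"SD of sample means:   {sd_of_means:.2f}")
print(f"SD of sample medians: {sd_of_medians:.2f}")
```

With a roughly normal population like this one, the sample means should cluster more tightly around 100 than the sample medians do. That ordering depends on the shape of the population, so it should not be read as a guarantee for every distribution.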

Trimmed Means


Trimmed means are means calculated on data for which we have discarded a certain
percentage of the data at each end of the distribution. For example, if we have a set of 100
observations and want to calculate a 10% trimmed mean, we simply discard the highest 10
scores and the lowest 10 scores and take the mean of what remains. This is an old idea that
is coming back into fashion, and perhaps its strongest advocate is Rand Wilcox (Wilcox,
2003, 2005).
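
The calculation just described can be sketched in a few lines. The function name `trimmed_mean` and the two small data sets are hypothetical and serve only to illustrate the arithmetic. The sketch also assumes that when the number of scores to discard is not a whole number, we round down, which is one common convention but not the only one.

```python
import statistics


def trimmed_mean(scores, proportion):
    """Mean after discarding `proportion` of the scores from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be at least 0 and less than 0.5")
    ordered = sorted(scores)
    # Number of scores dropped from each tail (rounded down: an assumption).
    k = int(len(ordered) * proportion)
    kept = ordered[k:len(ordered) - k]
    return statistics.mean(kept)


# 100 hypothetical observations, the integers 1 through 100.
scores = list(range(1, 101))
print(trimmed_mean(scores, 0.10))   # mean of the 80 scores 11..90 -> 50.5

# Ten hypothetical reaction times with one very long value.
times = [40, 45, 50, 50, 55, 60, 60, 65, 70, 300]
print(statistics.mean(times))       # 79.5, pulled upward by the 300
print(trimmed_mean(times, 0.10))    # 56.875, after dropping 40 and 300
```

In the first example trimming changes nothing, because the scores are symmetric about 50.5. In the second, discarding one score from each end removes the single extreme value and moves the estimate back toward the bulk of the data.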
There are several reasons for trimming a sample. As I mentioned in Chapter 1, and will
come back to repeatedly throughout the book, a major goal of taking the mean of a sample
is to estimate the mean of the population from which that sample was taken. If you want a
good estimate, you want one that varies little from one sample to another. (To use a term
we will define in later chapters, we want an estimate with a small standard error.) If we
have a sample with a great deal of dispersion, meaning that it has a lot of high and low
scores, our sample mean will not be a very good estimator of the population mean. By
trimming extreme values from the sample, we obtain a more stable estimate of the
population mean.
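
To see what "more stable" means here, consider the following sketch. Everything specific in it is an assumption made for illustration: the population is a mixture in which 90% of scores come from a normal distribution with a standard deviation of 1 and 10% come from one with a standard deviation of 10 (both centered on 0), each sample has 20 scores, and we trim 20% from each end. We draw many samples and compare how much the ordinary mean and the trimmed mean vary across them.

```python
import random
import statistics


def trimmed_mean(scores, proportion):
    """Mean after discarding `proportion` of the scores from each end."""
    ordered = sorted(scores)
    k = int(len(ordered) * proportion)  # dropped from each tail, rounded down
    return statistics.mean(ordered[k:len(ordered) - k])


def draw_score(rng):
    """One score from a hypothetical population with occasional extreme values."""
    sd = 10 if rng.random() < 0.10 else 1
    return rng.gauss(0, sd)


def estimator_spread(n_samples=2000, sample_size=20, proportion=0.20, seed=2):
    """Return (SD of sample means, SD of trimmed means) over repeated samples."""
    rng = random.Random(seed)
    means, trimmed = [], []
    for _ in range(n_samples):
        sample = [draw_score(rng) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
        trimmed.append(trimmed_mean(sample, proportion))
    return statistics.stdev(means), statistics.stdev(trimmed)


sd_mean, sd_trimmed = estimator_spread()
print(f"SD of ordinary means: {sd_mean:.2f}")
print(f"SD of trimmed means:  {sd_trimmed:.2f}")
```

For a population like this one, with many extreme scores in both tails, the trimmed means should vary noticeably less from sample to sample than the ordinary means. With a well-behaved normal population the advantage largely disappears, so the result says something about this kind of population and not about trimming in general.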
Another reason for trimming a sample is to control problems in skewness. If you have
a very skewed distribution, those extreme values will pull the mean toward themselves and
lead to a poorer estimate of the population mean. One reason to trim is to eliminate the
influence of those extreme scores. But consider the data from Bradley (1963) on reaction
times, shown in Figure 2.11. I agree that the long reaction times are probably the result of
the respondent missing the key, and therefore do not relate to strict reaction time, and could
legitimately be removed, but do we really want to throw away the same number of
observations at the other end of the scale?

Section 2.7 Measures of Central Tendency 35

