that large deviations are disproportionately represented. You might keep this in mind the next time you use a measuring instrument that is “OK because it is unreliable only at the extremes.” It is just those extremes that may have the greatest effect on the interpretation of the data. This is one of the major reasons why we don’t particularly like to have skewed data.
The Coefficient of Variation
One of the most common things we do in statistics is to compare the means of two or more groups, or even two or more variables. Comparing the variability of those groups or variables, however, is also a legitimate and worthwhile activity. Suppose, for example, that we have two competing tests for assessing long-term memory.

One of the tests typically produces data with a mean of 15 and a standard deviation of 3.5. The second, quite different, test produces data with a mean of 75 and a standard deviation of 10.5. All other things being equal, which test is better for assessing long-term memory? We might be inclined to argue that the second test is better, in that we want a measure on which there is enough variability that we are able to study differences among people, and the second test has the larger standard deviation. However, keep in mind that the two tests also differ substantially in their means, and this difference must be considered.
If you think for a moment about the fact that the standard deviation is based on deviations from the mean, it seems logical that a value could more easily deviate substantially from a large mean than from a small one. For example, if you rate teaching effectiveness on a 7-point scale with a mean of 3, it would be impossible to have a deviation greater than 4. On the other hand, on a 70-point scale with a mean of 30, deviations of 10 or 20 would be common. Somehow we need to account for the greater opportunity for large deviations in the second case when we compare the variability of our two measures. In other words, when we look at the standard deviation, we must keep in mind the magnitude of the mean as well.
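The point can be made concrete with a short sketch in Python (the ratings below are hypothetical, chosen only for illustration). Multiplying every score on a 7-point scale by 10, to mimic a 70-point scale, multiplies both the mean and the standard deviation by 10, so a raw standard deviation by itself cannot tell us which measure is more variable in any meaningful sense:

import statistics

# Hypothetical ratings on a 7-point scale.
ratings_7pt = [2, 3, 3, 4, 2, 4, 3, 3]

# The same ratings re-expressed on a 70-point scale (each score times 10).
ratings_70pt = [x * 10 for x in ratings_7pt]

for label, data in [("7-point", ratings_7pt), ("70-point", ratings_70pt)]:
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    print(f"{label:9s} mean = {mean:5.1f}  sd = {sd:5.2f}  sd/mean = {sd / mean:.3f}")

# The standard deviation is 10 times larger on the 70-point scale,
# but the standard deviation relative to the mean is identical.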
The simplest way to compare standard deviations on measures that have quite different means is to scale the standard deviation by the magnitude of the mean. That is what we do with the coefficient of variation (CV).^12 We will define that coefficient as simply the standard deviation divided by the mean:

CV = (Standard deviation / Mean) × 100 = (s_X / X̄) × 100

(We multiply by 100 to express the result as a percentage.) To return to our memory-task example, for the first measure, CV = (3.5/15) × 100 = 23.3. Here the standard deviation is approximately 23% of the mean. For the second measure, CV = (10.5/75) × 100 = 14. In this case the coefficient of variation for the second measure is about half as large as for the first. If I could be convinced that the larger coefficient of variation in the first measure was not attributable simply to sloppy measurement, I would be inclined to choose the first measure over the second.
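In code the calculation is trivial. Here is a minimal sketch (the helper name is my own, not from the text) that reproduces the two coefficients of variation computed above:

def coefficient_of_variation(sd, mean):
    """Return the standard deviation as a percentage of the mean."""
    return (sd / mean) * 100

# The two hypothetical memory tests from the text.
print(coefficient_of_variation(3.5, 15))   # 23.33... (about 23.3)
print(coefficient_of_variation(10.5, 75))  # 14.0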
To take a second example, Katz, Lautenschlager, Blackburn, and Harris (1990) asked students to answer a set of multiple-choice questions from the Scholastic Aptitude Test^13 (SAT). One group read the relevant passage and answered the questions. Another group answered the questions without having read the passage on which they were based—sort of
^12 I want to thank Andrew Gilpin (personal communication, 1990) for reminding me of the usefulness of the coefficient of variation. It is a meaningful statistic that is often overlooked.

^13 The test is now known simply as the SAT, or, more recently, the SAT-I.