this formula gives only the position, and not the actual
value, of the median. For example, in a study describing
the number of hours that five (n = 5) sedentary subjects
spent watching television last week, the following data were
collected: {4, 6, 3, 38, 6} hours. To determine the median,
the researcher would perform a number of easy steps.
- Sort the data: 3 4 6 6 38
- Determine the median’s location. The median is at the
(5 + 1)/2, or third, position from either end.
- Count over three places to find that the median is 6.
In this example, 6 is also the mode because it is the most frequently occurring value.
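The steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original study: the helper name `median_position` is hypothetical, and the shortcut of indexing directly at the position assumes an odd number of values, as in this example.

```python
from collections import Counter


def median_position(n):
    """Return the 1-based position of the median among n sorted values."""
    return (n + 1) / 2


hours = [4, 6, 3, 38, 6]                 # television hours from the example
sorted_hours = sorted(hours)             # step 1: sort the data
position = median_position(len(hours))   # step 2: (5 + 1) / 2

# Step 3: n is odd, so the position is a whole number.
# Python lists are 0-based, so subtract 1 before indexing.
median = sorted_hours[int(position) - 1]

# The mode is the most frequently occurring value.
mode = Counter(hours).most_common(1)[0][0]

print(sorted_hours, position, median, mode)
```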
If the data set is changed to {3, 4, 6, 6}, then the position of the median
would be 2.5, or halfway between the second and third sorted data points. In
this example, the median is 5, and 6 is the mode. If the data set is changed again
to {3, 4, 6, 100}, the position of the median is still 2.5, or halfway
between the second and third sorted data points, and the median remains 5. This
example demonstrates that the outlier value of 100 did not affect the median
value. For this reason, the median is generally the preferred measure of central
tendency when there is an extreme value in the data.
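A short Python sketch can confirm that the outlier leaves the median unchanged. The function name `median_by_position` is a hypothetical helper that applies the (n + 1)/2 position rule and averages the two neighboring values when the position falls between them. It assumes a non-empty list of numbers.

```python
import math


def median_by_position(values):
    """Median via the (n + 1) / 2 position rule (assumes a non-empty list)."""
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2
    # For an odd n both indexes point at the same value;
    # for an even n they point at the two middle values.
    lower = ordered[math.floor(position) - 1]
    upper = ordered[math.ceil(position) - 1]
    return (lower + upper) / 2


print(median_by_position([3, 4, 6, 6]))     # position 2.5, between 4 and 6
print(median_by_position([3, 4, 6, 100]))   # outlier of 100, same position
```

Both calls average the second and third sorted values (4 and 6), so the size of the largest value never enters the calculation.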
When data are grouped, the median is determined using cumulative frequencies.
In Table 13-6, age data are reported for 20 subjects. The median is
located at (20 + 1)/2, the 10.5th position, and it is the average of the 10th and
11th data values. In the ungrouped data, the median is 21 years, while the
grouped median is 20–21 years.
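As a rough sketch of how cumulative frequencies locate the median class, consider the Python example below. The age groups and frequencies are hypothetical values chosen only for illustration; they are not the data in Table 13-6. The helper name `class_at_rank` is likewise an assumption.

```python
import math
from itertools import accumulate

# HYPOTHETICAL frequency table for 20 subjects (not the data in Table 13-6).
age_groups = ["18-19", "20-21", "22-23", "24-25"]
frequencies = [6, 7, 4, 3]


def class_at_rank(groups, freqs, rank):
    """Return the first group whose cumulative frequency reaches the rank."""
    for group, cumulative in zip(groups, accumulate(freqs)):
        if rank <= cumulative:
            return group
    raise ValueError("rank exceeds the total number of observations")


n = sum(frequencies)        # 20 subjects
position = (n + 1) / 2      # 10.5, so look at the 10th and 11th values

lower_class = class_at_rank(age_groups, frequencies, math.floor(position))
upper_class = class_at_rank(age_groups, frequencies, math.ceil(position))

print(position, lower_class, upper_class)
```

With these illustrative frequencies the cumulative counts are 6, 13, 17, and 20, so the 10th and 11th values both fall in the second group. If the two ranks landed in different groups, the grouped median would straddle the boundary between them.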
Mean
When people refer to an average, what they are really referring to is the mean.
The mean is calculated by adding all of the data values and then dividing by
the total number of values. The mean is the most commonly used measure of
central tendency. It is greatly affected by the existence of outliers because every
value in the data set is included in the calculation. The larger the sample size,
the less an outlier will affect the mean. In the earlier example about
television viewing, subjects reported watching 4, 6, 3, 38, and 6 hours of
television per week. The mean of these data is 11.4 hours (57 / 5 = 11.4).
However, 11.4 hours does
not present a clear picture of the amount of television watched by most subjects
because the one extreme value of 38 hours skews the data. Because the mean is
the measure of central tendency used in many tests of statistical significance,
it is imperative for researchers to evaluate data for outliers before performing
statistical analyses.
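The effect of the outlier on the mean can be checked with Python's standard `statistics` module. Dropping the 38-hour value here is only for comparison and is not a recommendation from the text about how outliers should be handled.

```python
from statistics import mean, median

hours = [4, 6, 3, 38, 6]   # television hours from the example

mean_all = mean(hours)     # 57 / 5
median_all = median(hours)

# For comparison only: the same data without the extreme value of 38.
hours_without_outlier = [h for h in hours if h != 38]
mean_without_outlier = mean(hours_without_outlier)   # 19 / 4

print(mean_all, median_all, mean_without_outlier)
```

The mean falls from 11.4 to 4.75 hours when the single extreme value is set aside, while the median of the full data set is 6 hours, which is closer to what most subjects reported.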
FYI
The three most commonly used measures
of central tendency are the mean, median,
and mode. The mean and median are used
to describe continuous-level data, while the
mode is used to describe both continuous-
and nominal-level data.
KEY TERM
mean: The mathematical average calculated by adding all values and then dividing
by the total number of values