Basic Statistics


mean weight is 6 oz, then the total weight is 18 oz, a simple direct calculation. Other
measures of location do not have this property.
When we change the scale that we use to report our measurements, the mean
changes in a predictable fashion. For instance, if we add a constant to each obser-
vation, the mean is increased by the same constant. If we multiply each observation
by a constant, as, for example, in converting from meters to centimeters, the mean in
centimeters is 100 times the mean in meters.
In addition, the mean is readily available from statistical software programs or
hand calculators. It is by far the most commonly used measure of location of the
center of a distribution.
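
As an illustrative check of the scale-change properties described above, the following sketch uses the nine observations quoted in Section 5.1.2. The added constant (10) and the multiplier (100) are arbitrary choices for demonstration and do not come from the text.

```python
from statistics import mean

# Observations quoted in the text for the median example.
data = [8, 1, 2, 9, 3, 2, 8, 1, 2]

base = mean(data)

# Adding a constant to every observation shifts the mean by that constant.
shifted = mean([x + 10 for x in data])

# Multiplying every observation by a constant (for example, meters to
# centimeters) multiplies the mean by the same constant.
scaled = mean([x * 100 for x in data])

print(base, shifted, scaled)
```

The same pattern applies to any constant, which is why a change of units never requires recomputing the mean from the raw data.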

5.1.2 The Median

Another measure of location that is often used is the median, the number that divides
the total number of ordered observations in half. If we first order or rank the data from
smallest to largest, then if the sample size is an odd number, the median is the middle
observation of the ordered data. If the sample size is an even number, the median
is the mean of the middle two numbers. To find the median of the same data set of
observations used in Section 5.1.1 (8, 1, 2, 9, 3, 2, 8, 1, 2), we order them to obtain
1, 1, 2, 2, 2, 3, 8, 8, 9. With nine observations, n is an odd number and the median
is the middle or fifth observation. It has a value of 2. If n is odd, the median is the
numerical value of the ((n + 1)/2)th ordered observation. If the first number of the
ordered data were not present, the sample size would be even and the ordered observations
would be 1, 2, 2, 2, 3, 8, 8, 9; the median would then be the mean of the fourth and fifth
observations, or (2 + 3)/2 = 2.5. In general, for n even, the median
is the mean of the (n/2)th and ((n/2) + 1)th ordered observations.
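
The odd and even rules above can be sketched in a few lines of code. The function name `sample_median` is a hypothetical helper introduced only for this illustration; the two data sets are the ones used in the text.

```python
def sample_median(values):
    """Median by the ordering rule described in the text."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1)/2)th ordered observation (0-based index mid).
        return ordered[mid]
    # n even: mean of the (n/2)th and ((n/2) + 1)th ordered observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = sample_median([8, 1, 2, 9, 3, 2, 8, 1, 2])   # nine observations
even_case = sample_median([1, 2, 2, 2, 3, 8, 8, 9])     # eight observations
print(odd_case, even_case)
```

The sketch sorts a copy of the data, so the order in which the observations were recorded does not matter.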
The same definitions hold for the median of a finite population, with n replaced
by N. When the size of the population is so large that it must be considered infinite,
another definition must be used for the median. For continuous distributions it can
be defined as the number below which 50% of the population lie. For example, if
we found the value of X that divided the area of a frequency distribution such that
one-half of the area was to the right and one-half was to the left, we would have the
median of an infinite population.
The median provides the numerical value of a variable for the middle or most
typical case. If we wish to know the typical systolic blood pressure for patients
entering a clinic, the proper statistic is the median.
The median (m) has the property that the sum of the absolute deviations (deviations
that are treated as positive no matter whether they are positive or negative) of each
observation in a sample or population from the median is no larger than the sum of the
absolute deviations from any other number. That is, $\sum_i |X_i - m|$ is a minimum. If an
investigator wants to find a value that minimizes the sum of the absolute deviations,
the median is the best statistic to choose (see Weisberg [1992]).
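
A small numerical check of this minimizing property, using the nine observations from the text, might look as follows. The helper `sum_abs_dev` and the grid of candidate values (0 to 10 in steps of 0.5) are assumptions made for illustration; a grid search only suggests the property and is not a proof.

```python
from statistics import mean, median

def sum_abs_dev(values, c):
    """Sum of absolute deviations of the observations from the value c."""
    return sum(abs(x - c) for x in values)

data = [8, 1, 2, 9, 3, 2, 8, 1, 2]

at_median = sum_abs_dev(data, median(data))
at_mean = sum_abs_dev(data, mean(data))

# Search a coarse grid of candidate centers for the smallest sum.
candidates = [k / 2 for k in range(0, 21)]
best = min(candidates, key=lambda c: sum_abs_dev(data, c))

print(at_median, at_mean, best)
```

For these data the grid minimum falls at the median, while the sum of absolute deviations about the mean is larger.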
The median can also be found in some instances where it is impossible to compute
a mean. For example, suppose it is known that one of the laboratory instruments
is inaccurate for very large values. As long as the investigator knows how many
