Basic Statistics


52 MEASURES OF LOCATION AND VARIABILITY


large values there are, the median can be obtained since it is dependent on only the
numerical value of the middle or middle two ordered observations.


5.1.3 Other Measures of Location


Other measures of the center of a distribution are sometimes used. One such measure
is the mode. The mode is the value of the variable that occurs most frequently. In
order to contrast the numerical values of the mean, mode, and median from a sample
that has a distribution that is skewed to the right (long right tail), let us look at a
sample of n = 11 observations. The numerical values of the observations are 1, 2, 2,
2, 2, 3, 3, 4, 5, 6, 7. The mode is 2, since that is the value that occurs most frequently.
The median is 3, since that is the value of the sixth, or (n + 1)/2th, ordered observation. The
mean is 3.36. Although small, this sample illustrates a pattern common in samples
that are skewed to the right. That is, the mean is greater than the median, which in
turn is greater than the mode.
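
The three measures for this sample can be checked with a few lines of Python. This is a minimal sketch using the standard library's statistics module; the variable names are chosen here for illustration and do not come from the text.

```python
from statistics import mean, median, mode

# The n = 11 observations from the example, already in ascending order.
sample = [1, 2, 2, 2, 2, 3, 3, 4, 5, 6, 7]

sample_mode = mode(sample)      # most frequent value
sample_median = median(sample)  # middle (sixth) ordered observation
sample_mean = mean(sample)      # sum of the observations divided by n

print(f"mode = {sample_mode}, median = {sample_median}, mean = {sample_mean:.2f}")

# The ordering described for a right-skewed sample: mean > median > mode.
print(sample_mean > sample_median > sample_mode)
```

The last line prints True for this sample. The ordering is a common pattern in right-skewed samples, not a guarantee for every data set.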


5.2 MEASURES OF VARIABILITY

After obtaining a measure of the center of a distribution such as a mean, we next
wonder about the variability of the distribution and look for a number that can be
used to measure how spread out the data are. Two distributions could have the same
mean and look quite different if one had all the values closely clustered about the mean
and the other distribution was widely spread out. In many instances, the dispersion
of the data is of as much interest as the mean. For example, when medication is
manufactured, the patient expects not only that the pills contain the stated amounts
of the ingredients on average, but also that each pill contains very close to the stated
amount. The patient does not want a lot of variation.

The concept of variation is harder to get used to than that of the center or
location of the distribution. In general, one wishes a measure of variation to be large
if many observations are far from the mean and to be small if they are close to the
mean.
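
To make the pill example concrete, the sketch below compares two batches that have the same mean but very different spread. The dose values (in milligrams) and the names batch_a and batch_b are hypothetical, invented here purely for illustration.

```python
from statistics import mean

# Hypothetical doses in mg for two batches of pills (illustrative values only).
batch_a = [99, 100, 100, 100, 101]   # tightly clustered about the mean
batch_b = [80, 90, 100, 110, 120]    # widely spread out

mean_a = mean(batch_a)
mean_b = mean(batch_b)

# Distance of each pill's dose from its batch mean.
dist_a = [abs(x - mean_a) for x in batch_a]
dist_b = [abs(x - mean_b) for x in batch_b]

print("means:", mean_a, mean_b)
print("distances from mean, batch A:", dist_a)
print("distances from mean, batch B:", dist_b)
```

Both batches average 100 mg, yet no pill in the first batch is more than 1 mg from the mean, while pills in the second are off by as much as 20 mg. A useful measure of variation should be small for the first batch and large for the second.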


5.2.1 The Variance and the Standard Deviation

Starting with the idea of examining the deviation from the mean, $X - \bar{X}$, for all values
of X, we might first think of simply summing them. That sum has been shown to be
zero. We could also try summing the absolute deviations from the mean, $\sum |X_i - \bar{X}|$.
That is, convert all negative values of $X - \bar{X}$ to positive numbers and sum all values.
This might seem like a promising measure of variability to use, but absolute values
are not as easy to work with as squared deviations $(X - \bar{X})^2$ and also lack several
desirable properties of the squared differences. (Note that the squared deviation is
also never negative.)
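
The three candidate sums can be compared numerically. The sketch below reuses the eleven observations 1, 2, 2, 2, 2, 3, 3, 4, 5, 6, 7 from Section 5.1.3 and uses exact fractions so that the zero sum is not blurred by rounding. The variable names are assumptions made for this illustration.

```python
from fractions import Fraction

sample = [1, 2, 2, 2, 2, 3, 3, 4, 5, 6, 7]
n = len(sample)

# Exact sample mean, 37/11, kept as a fraction to avoid rounding error.
x_bar = Fraction(sum(sample), n)

# Deviation of each observation from the sample mean.
deviations = [x - x_bar for x in sample]

sum_dev = sum(deviations)                      # plain sum of deviations
sum_abs_dev = sum(abs(d) for d in deviations)  # sum of absolute deviations
sum_sq_dev = sum(d ** 2 for d in deviations)   # sum of squared deviations

print("sum of deviations:         ", sum_dev)
print("sum of absolute deviations:", float(sum_abs_dev))
print("sum of squared deviations: ", float(sum_sq_dev))
```

The plain sum is exactly zero because the negative and positive deviations cancel, so it carries no information about spread. The absolute and squared sums are both positive here and grow as observations move away from the mean.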
The sample variance is defined as the sums of squares of the differences between
each observation in the sample and the sample mean divided by 1 less than the number
