Basic Statistics

MEASURES OF VARIABILITY

of observations. The reason for dividing by n - 1 instead of n is given in Section 5.3,
but for now it is sufficient to remark that n - 1 is part of the definition of the variance.
The variance of the sample is usually denoted by $s^2$ and the formula is written as

$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

The square root of the variance is also used. It is called the standard deviation. The
formula for the standard deviation is

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
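
As a minimal sketch of these definitions, the sample variance can be computed by summing the squared deviations from the mean and dividing by n - 1, and the standard deviation by taking the square root of the result. The data values and the function names `sample_variance` and `sample_sd` are hypothetical and chosen only for illustration; they are not the observations of Section 5.1.1.

```python
import math
import statistics


def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_sd(xs):
    """Standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


# Hypothetical observations, used only to exercise the functions.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

print("variance:", sample_variance(data))
print("standard deviation:", sample_sd(data))

# Python's standard library uses the same n - 1 divisor, so it can serve as a check.
print("library variance:", statistics.variance(data))
print("library standard deviation:", statistics.stdev(data))
```

For these made-up values the mean is 5 and the squared deviations sum to 24, so the variance is 24/5 = 4.8.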

For small samples when computing the sample variance with a hand calculator,
it is sometimes easier to use an alternative formula for the variance. The alternative
formula can be given as
$$s^2 = \frac{\sum X^2 - n(\bar{X})^2}{n - 1}$$

Here each X given in Section 5.1.1 is squared and then summed to obtain 232. Since
the mean is 4, the mean squared is 16 and 16 times the sample size of 9 is 144. Then
232 - 144 = 88 is the numerical value in the numerator and 9 - 1 = 8 is the value
in the denominator, so the variance is 88/8 = 11.
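
The arithmetic above can be checked with a short sketch. It uses only the summary values quoted in the text (sum of squares 232, mean 4, sample size 9); the function name `variance_shortcut` and the second, made-up data set are assumptions added for illustration.

```python
def variance_shortcut(sum_of_squares, mean, n):
    """Alternative formula: (sum of X^2 - n * mean^2) / (n - 1)."""
    return (sum_of_squares - n * mean ** 2) / (n - 1)


# Summary values quoted in the text: sum of X^2 = 232, mean = 4, n = 9.
numerator = 232 - 9 * 4 ** 2          # 232 - 144 = 88
variance = variance_shortcut(232, 4, 9)
print(numerator, variance)            # prints: 88 11.0

# Hypothetical data, to confirm that the shortcut agrees with the
# definitional formula based on squared deviations from the mean.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
n = len(data)
mean = sum(data) / n
by_definition = sum((x - mean) ** 2 for x in data) / (n - 1)
by_shortcut = variance_shortcut(sum(x ** 2 for x in data), mean, n)
print(by_definition, by_shortcut)
```

The two formulas are algebraically identical, so any difference between the last two printed values would be due only to floating-point rounding.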
The variance of a population is denoted by $\sigma^2$ (sigma squared) and the standard
deviation by $\sigma$ (sigma). If we have the entire population, we use the formula $\sigma^2 =
\sum (X - \mu)^2 / N$, where $\mu$ is the population mean and N is the size of the population. Note that N is used rather
than N - 1. In other words, we simply compute the average of the squared deviations
around the population mean. The mean is relatively easy to interpret since people
often think in terms of averages or means. But the size of the standard deviation is
initially more difficult to understand. We will return to this subject in Section 5.3.
For now, if two distributions have the same mean, the one that has the larger standard
deviation (and variance) is more spread out.
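
The difference between the two divisors, and the link between standard deviation and spread, can be illustrated with Python's standard `statistics` module, where `pvariance` divides by N and `variance` divides by n - 1. All data values below are hypothetical.

```python
import statistics

# Hypothetical values, treated first as an entire population, then as a sample.
values = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

pop_var = statistics.pvariance(values)   # divides by N
samp_var = statistics.variance(values)   # divides by n - 1
print("population variance:", pop_var)
print("sample variance:", samp_var)

# Two hypothetical distributions with the same mean but different spread.
narrow = [4.0, 5.0, 5.0, 5.0, 6.0]
wide = [1.0, 3.0, 5.0, 7.0, 9.0]

print("means:", statistics.mean(narrow), statistics.mean(wide))
print("standard deviations:", statistics.pstdev(narrow), statistics.pstdev(wide))
```

Both lists in the second part have mean 5, and the list whose values lie farther from 5 has the larger standard deviation.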
Variation is usually thought of as having two components. One is the natural
variation in whatever is being measured. For example, with systolic blood pressure,
we know that there is a wide variation in pressure from person to person, so we would
not expect the standard deviation to be very small. This component is sometimes
called the biological variation. The other component of the variation is measurement
error. If we measure something inaccurately or with limited precision, we may have
a large variation due simply to measurement methods. For a small standard deviation,
both of these contributors to variation must be small.
Changing the measurement scale of the sample data affects the standard deviation
and the variance, although not in the same way that it affects the mean. If we add
a constant to each observation, the mean increases by that constant, but the standard
deviation and the variance do not change, since all we have done is shift the entire
distribution by the same amount. If we multiply each observation by a constant, say
100, the standard deviation will also be multiplied by 100. The variance
will be multiplied by 100 squared since it is the square of the standard deviation.
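
These two rules can be checked numerically. The sketch below uses hypothetical data, an arbitrary shift of 10, and the scale factor of 100 mentioned in the text.

```python
import statistics

# Hypothetical observations.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

shifted = [x + 10 for x in data]    # add a constant to each observation
scaled = [x * 100 for x in data]    # multiply each observation by a constant

sd = statistics.stdev(data)
var = statistics.variance(data)

# Adding a constant shifts the mean but leaves the spread unchanged.
print(statistics.mean(data), statistics.mean(shifted))
print(sd, statistics.stdev(shifted))

# Multiplying by 100 multiplies the standard deviation by 100
# and the variance by 100 squared.
print(statistics.stdev(scaled) / sd)
print(statistics.variance(scaled) / var)
```

Up to floating-point rounding, the last two printed ratios should be 100 and 10000.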
