Basic Statistics


THE NORMAL DISTRIBUTION


of the normal distribution. The major reasons why the normal distribution is used
are given in Section 6.3. Three graphical methods for determining whether or not
data are normally distributed are presented in Section 6.4. Finally, in Section 6.5
techniques are given for finding suitable transformations to use when data are not
normally distributed.


6.1 PROPERTIES OF THE NORMAL DISTRIBUTION

The mathematical formula for the normal frequency function need not concern us. It
is sufficient to note some of the properties of the shape of this distribution and how
we can use the area under the distribution curve to assist us in analysis.
First, we note that the area between a normal frequency function and the horizontal
axis is equal to one square unit. The curve is symmetric about the point at X = μ and is
somewhat bell-shaped. The mean is at the center of the distribution, and the standard
deviation is the distance from the mean to either inflection point, where the curve
goes from curving downward to curving upward. The curve extends indefinitely far in both directions, approaching the
horizontal axis very closely as it goes farther away from the center point. It is thus
clear that men’s heights could not possibly be exactly normally distributed, even
if there were infinitely many heights, since heights below zero, for example, are
impossible, yet there is some area, however small, below zero under the normal
curve. The most that we can expect is that a very large set of heights might be well
approximated by a normal frequency function.
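
The properties just described can be checked numerically. The following Python sketch is illustrative only: the function names are our own, and the mean of 68 in. and standard deviation of 3 in. are assumed values for men's heights. It approximates the total area under the curve, checks the symmetry about the mean, examines the curvature on either side of the point one standard deviation above the mean, and computes the area that lies below zero.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal frequency function at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_left_of(x, mu, sigma):
    """Area under the normal curve to the left of x.

    erfc is used instead of erf so that very small tail areas
    do not round to exactly zero.
    """
    return 0.5 * math.erfc((mu - x) / (sigma * math.sqrt(2.0)))

mu, sigma = 68.0, 3.0  # assumed mean and standard deviation, in inches

# Total area, approximated by the trapezoid rule over mu +/- 10 sigma.
n = 20000
lo, hi = mu - 10 * sigma, mu + 10 * sigma
step = (hi - lo) / n
total_area = step * (
    sum(normal_pdf(lo + i * step, mu, sigma) for i in range(1, n))
    + 0.5 * (normal_pdf(lo, mu, sigma) + normal_pdf(hi, mu, sigma))
)

# Symmetry: equal heights at equal distances on either side of the mean.
symmetric = math.isclose(normal_pdf(mu - 4.0, mu, sigma),
                         normal_pdf(mu + 4.0, mu, sigma))

def curvature(x, h=1e-3):
    """Second difference of the curve at x: negative means curving downward."""
    return (normal_pdf(x - h, mu, sigma)
            - 2.0 * normal_pdf(x, mu, sigma)
            + normal_pdf(x + h, mu, sigma))

inside = curvature(mu + 0.9 * sigma)   # just inside the inflection point
outside = curvature(mu + 1.1 * sigma)  # just outside the inflection point

# Area corresponding to impossible negative heights.
area_below_zero = area_left_of(0.0, mu, sigma)

print("total area:", total_area)
print("symmetric about the mean:", symmetric)
print("curving downward inside, upward outside:", inside < 0 < outside)
print("area below zero:", area_below_zero)
```

Under these assumed values the area below zero is positive but extraordinarily small, which is why the normal curve can still serve as a good approximation for heights.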
There are many normal curves rather than just one normal curve. For every different
value of the mean and the standard deviation, there is a different normal curve. Men’s
heights, for example, might be approximately normally distributed with a mean height
of 68 in., whereas the heights of 10-year-old boys could be normally distributed with
a mean height of 60 in. If the dispersion of the two populations of heights is equal,
the two frequency functions will be identical in shape, with one merely moved 8
units to the right of the other (see Figure 6.1). It may very well be, however, that
the variation among men’s heights is somewhat larger than the variation among boys’
heights. Suppose that the standard deviation for men’s heights is 3 in., whereas the
standard deviation for boys’ heights is 2.5 in. Men’s heights on the average will then
be farther from 68 in. than boys’ heights are from 60 in., so that the frequency curve
for men will be flatter and more spread out than the other. Figure 6.2 shows two such
normal frequency curves.
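
The comparison of the two populations can also be sketched in Python. The means and standard deviations below are the ones used in the text; the function names and the choice of evaluation points are our own assumptions for illustration. The sketch checks that equal dispersion gives the same curve shifted by 8 units, that the larger standard deviation gives a lower peak, and that a smaller share of men's heights than of boys' heights falls within 3 in. of the respective mean.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal frequency function at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_within(d, sigma):
    """Area under a normal curve within d units of its mean."""
    return math.erf(d / (sigma * math.sqrt(2.0)))

men_mu, men_sigma = 68.0, 3.0    # men's heights, in inches
boys_mu, boys_sigma = 60.0, 2.5  # boys' heights, in inches

# Equal dispersion (both standard deviations 3 in.): the men's curve
# is the boys' curve moved 8 units to the right.
x = 70.0  # arbitrary evaluation point
shift_only = math.isclose(normal_pdf(x, men_mu, 3.0),
                          normal_pdf(x - 8.0, boys_mu, 3.0))

# Unequal dispersion: the curve with the larger standard deviation
# has a lower peak, that is, it is flatter and more spread out.
peak_men = normal_pdf(men_mu, men_mu, men_sigma)
peak_boys = normal_pdf(boys_mu, boys_mu, boys_sigma)

# Proportion of each population within 3 in. of its own mean.
within_men = area_within(3.0, men_sigma)
within_boys = area_within(3.0, boys_sigma)

print("same shape, shifted by 8:", shift_only)
print("peak heights (men, boys):", peak_men, peak_boys)
print("area within 3 in. of mean (men, boys):", within_men, within_boys)
```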
For every pair of numbers μ and σ, then, a normal frequency curve can be plotted.
Further, only one normal frequency curve can be plotted with these particular values
for μ and σ. Thus if it is known that heights are normally distributed with a mean
of 68 in. and a standard deviation of 3 in., the proper normal frequency distribution
can be plotted and the entire distribution pictured. It was pointed out earlier that two
sets of data might have the same means and the same standard deviations and still
have frequency distributions very different from each other; obviously by knowing
just the mean and standard deviation of a set of data, we could not draw a picture of
the frequency distribution. However, if in addition to knowing the mean and stan-
