Basic Statistics


THE NORMAL DISTRIBUTION


6.2.3 Interpreting Areas as Probabilities

In the foregoing example, areas under portions of the normal curve were interpreted
as percentages of men whose heights fall within certain intervals. These areas may
also be interpreted as probabilities. Where 84.13% of the area was below 70.3 in., the
statement was made that 84.13% of men’s heights are below 70.3 in. We may also
say that if a man’s height is picked at random from this population of men’s heights,
the probability is .8413 that his height will be < 70.3 in. Here the term probability is
not defined rigorously. The statement, “the probability is .8413 that a man’s height
is < 70.3 in.” means that if we keep picking a man’s height at random from the
population, time after time, the percentage of heights < 70.3 in. should come very
close to 84.13%.
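The long-run reading of probability described above can be illustrated with a small simulation. This is only a sketch: it assumes the population of heights is normal with mean 68.0 in. and standard deviation 2.3 in., values chosen to be consistent with the figures quoted in this section rather than taken from it, and the function name fraction_below is hypothetical.

```python
import random

# Assumed population parameters (not stated in this section):
# mean 68.0 in. and standard deviation 2.3 in.
MEAN_IN = 68.0
SD_IN = 2.3

def fraction_below(cutoff, draws, seed=0):
    """Pick `draws` heights at random and return the fraction below `cutoff`."""
    rng = random.Random(seed)
    hits = sum(rng.gauss(MEAN_IN, SD_IN) < cutoff for _ in range(draws))
    return hits / draws

# As the number of draws grows, the observed fraction should settle near .8413.
for draws in (100, 10_000, 200_000):
    print(draws, round(fraction_below(70.3, draws), 4))
```

With only 100 draws the observed fraction may be noticeably off; with many draws it should come very close to 84.13%, which is all the informal definition claims.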
Similarly, we may say that the probability that a man’s height is > 70.3 in. is
.1587; the probability that a man’s height is between 65.7 and 70.3 in. is .6826; the
probability that a man’s height is < 73.4 in. is .99; the probability that a man’s height
is > 73.4 in. is .01; and so on.
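These probabilities can be checked without the printed tables. The sketch below uses Python's standard-library NormalDist and assumes a mean of 68.0 in. and a standard deviation of 2.3 in., which are inferred from the interval 65.7 to 70.3 in. (one standard deviation either side of the mean) and are not given explicitly in this section.

```python
from statistics import NormalDist

# Assumed parameters, inferred from the 65.7 to 70.3 in. interval.
heights = NormalDist(mu=68.0, sigma=2.3)

p_below_70_3 = heights.cdf(70.3)                    # area below 70.3 in.
p_above_70_3 = 1 - p_below_70_3                     # area above 70.3 in.
p_between = heights.cdf(70.3) - heights.cdf(65.7)   # area from 65.7 to 70.3 in.
p_below_73_4 = heights.cdf(73.4)                    # area below 73.4 in.
p_above_73_4 = 1 - p_below_73_4                     # area above 73.4 in.

for label, p in [
    ("P(height < 70.3)", p_below_70_3),
    ("P(height > 70.3)", p_above_70_3),
    ("P(65.7 < height < 70.3)", p_between),
    ("P(height < 73.4)", p_below_73_4),
    ("P(height > 73.4)", p_above_73_4),
]:
    print(f"{label} = {p:.4f}")
```

The computed values should agree with the figures in the text up to the rounding used in the tables.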


6.3 IMPORTANCE OF THE NORMAL DISTRIBUTION


One reason that the normal distribution is important is that many large sets of data
are rather closely approximated by a normal curve. It is said then that the data
are “normally distributed.” We expect men’s heights, women’s weights, the blood
pressure of young adults, and cholesterol measurements to be approximately normally
distributed. When they are, the normal tables of areas are useful in studying them. It
should be realized, however, that many large sets of data are far from being normally
distributed. Age at death cannot be expected to be normally distributed, no matter
how large the set of data. Similarly, data on income cannot be expected to be normally
distributed.
Besides the fact that many sets of data are fairly well approximated by a normal
distribution, the normal distribution is important for another reason. For any
population, if we choose a large sample size, draw all possible samples of that
particular size from the population, and compute the mean for each sample, the
sample means themselves are approximately normally distributed. We use this a great
deal in Chapter 7; at that time it will be clearer just why this makes the normal
distribution so important. No matter how peculiar the distribution of the population,
in general (under restrictions so mild that they need not concern us here), the means
of large samples are approximately normally distributed.
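A simulation can make this claim concrete. The sketch below is an illustration under stated assumptions, not a proof: it uses an exponential population (strongly skewed, chosen here only as an example of a "peculiar" distribution), draws many random samples rather than all possible samples, and uses hypothetical helper names. It compares the skewness of the raw observations with the skewness of the sample means, and reports the fraction of sample means lying within one standard deviation of their own mean, which is about .68 for a normal distribution.

```python
import random
from statistics import fmean, pstdev

def sample_means(sample_size, n_samples, seed=0):
    """Means of `n_samples` random samples from an exponential population."""
    rng = random.Random(seed)
    return [
        fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

def skewness(values):
    """Average cubed standardized deviation; near 0 for a symmetric shape."""
    m, s = fmean(values), pstdev(values)
    return fmean(((v - m) / s) ** 3 for v in values)

def fraction_within_one_sd(values):
    """Fraction of values lying within one standard deviation of their mean."""
    m, s = fmean(values), pstdev(values)
    return sum(abs(v - m) < s for v in values) / len(values)

rng = random.Random(1)
raw = [rng.expovariate(1.0) for _ in range(20_000)]   # individual observations
means = sample_means(sample_size=50, n_samples=5_000)  # means of samples of 50

print("skewness of observations:", round(skewness(raw), 2))
print("skewness of sample means:", round(skewness(means), 2))
print("fraction of means within 1 SD:", round(fraction_within_one_sd(means), 3))
```

The observations themselves are far from symmetric, while the means of samples of size 50 are much closer to the symmetric, bell-shaped pattern; larger sample sizes bring them closer still.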
From Section 5.3 we know that the mean of a population of sample means is the
same as the mean of the original population, whereas its standard deviation is equal
to the standard deviation of the observations in the original population divided by
√n, where n is the sample size. For n reasonably large, we know that the means
are approximately normally distributed. Because a normal distribution is completely
specified by its mean and its standard deviation, for large samples everything is known
about the distribution of sample means provided that μ and σ are known for the
