Basic Statistics

IMPORTANCE OF THE NORMAL DISTRIBUTION

Figure 6.8 Distribution of means from samples of size 25.

population. We need to know nothing about the shape of the population distribution
of observations.
This remarkable fact will be useful in succeeding chapters. At present, we can use
it to answer the following question. If it is known that the mean cost of a medical
procedure is $5000 with a standard deviation of $1000 (note that both the mean and
standard deviation are considered population parameters), and we think of all possible
samples of size 25 and their sample means, what proportion of these sample means
will be between $4600 and $5400? Or, alternatively, what is the probability that a
sample mean lies between $4600 and $5400?
The population of \(\bar{x}\)'s from samples of size 25 is approximately normally
distributed. From Chapter 5 the mean \(\mu_{\bar{x}}\) is equal to $5000 (since \(\mu_{\bar{x}} = \mu\)), and \(\sigma_{\bar{x}}\),
the standard deviation of the distribution, equals $1000/\(\sqrt{25}\) = $1000/5 = $200
(since \(\sigma_{\bar{x}} = \sigma/\sqrt{n}\)). Figure 6.8 shows the distribution for the population of \(\bar{x}\), and
the area of the shaded portion is equal to the proportion of the means between $4600
and $5400.
As usual, we make the transformation to z in order to be able to apply the normal
tables. Now, however, we have \(z = (\bar{x} - \mu_{\bar{x}})/\sigma_{\bar{x}}\), as the mean and standard
deviation of the \(\bar{x}\) distribution must be used. To find z at \(\bar{x}\) = $5400, we have z =
($5400 - $5000)/$200 = 2. From Table A.2, the area to the left of z = 2 is .9772. The
area to the right of z = 2 must, by subtraction, equal .0228, and by symmetry, the
area below \(\bar{x}\) = $4600 is .0228. Subtracting the two areas from 1.0000, we obtain


1.0000 - 2(.0228) = 1.0000 - .0456 = .9544


Thus 95.44% of the sample means, for samples of size 25, lie between $4600 and
$5400, and the probability that a sample mean is between $4600 and $5400 is .9544.
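
The table lookup above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `normal_cdf` and `prob_sample_mean_between` are illustrative, and the standard normal area is computed from the error function in Python's standard library rather than read from Table A.2.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_sample_mean_between(lo, hi, mu, sigma, n):
    """Approximate P(lo < sample mean < hi) using the normal approximation."""
    se = sigma / math.sqrt(n)   # standard deviation of the sample mean
    z_lo = (lo - mu) / se       # z value at the lower limit
    z_hi = (hi - mu) / se       # z value at the upper limit
    return normal_cdf(z_hi) - normal_cdf(z_lo)

# Values from the medical procedure example: mu = $5000, sigma = $1000, n = 25.
p = prob_sample_mean_between(4600, 5400, mu=5000, sigma=1000, n=25)
print(round(p, 4))  # prints 0.9545
```

The computed area rounds to .9545 rather than .9544. The difference comes only from rounding the tabled tail area to .0228 before doubling it.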
How large should a sample be to be called reasonably large? Twenty-five or larger
is about as satisfactory a general answer as can be given. The answer to this question must depend,
however, on the answers to two other questions. First, how close to normality do
we insist that the distribution of sample means be? Second, what is the shape of the
distribution of the original population? If the original population is far from normal,
the sample size should be larger than if the population is close to normal.
However, in assuming that samples of size 25 or greater have means that are normally
distributed, we make such a small error that in most work it can be disregarded.
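
One way to get a feel for the size of this error is a small simulation. The sketch below is an illustration under stated assumptions, not a procedure from the text: the population is taken to be exponential with mean 1 and standard deviation 1 (chosen only because it is strongly skewed), and the function name `fraction_within`, the number of trials, and the seed are arbitrary. It counts how often the mean of a sample of size 25 falls within two standard deviations of the sample mean, for comparison with the normal-theory area of about .9545.

```python
import math
import random

def fraction_within(n, trials, k=2.0, seed=0):
    """Fraction of simulated sample means within k standard errors of the population mean.

    Assumed population: exponential with mean 1 and standard deviation 1
    (an illustrative choice of a skewed, clearly non-normal distribution).
    """
    rng = random.Random(seed)        # fixed seed so the run is repeatable
    mu, sigma = 1.0, 1.0
    se = sigma / math.sqrt(n)        # standard deviation of the sample mean
    hits = 0
    for _ in range(trials):
        mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
        if abs(mean - mu) < k * se:
            hits += 1
    return hits / trials

frac = fraction_within(n=25, trials=20000)
print(f"fraction within 2 standard errors: {frac:.4f}")
print("normal-theory value:               0.9545")
```

Under these assumptions the simulated fraction should land close to the normal-theory value even though the population itself is far from normal. A more extreme population or a stricter tolerance would call for a larger sample size.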
