Basic Statistics

130 CATEGORICAL DATA: PROPORTIONS

about 86.64% of the sample proportions lie between .08 and .32. Thus 86.64% of
sample proportions from samples of size 25 will be from .08 to .32 inclusive. The
exact answer using tables of the binomial distribution is .9258; in this borderline
situation with nπ = 5, the normal approximation is not very close.
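The comparison above can be checked numerically. The following is a minimal sketch, assuming π = .2 and n = 25 as in the example; the helper `normal_cdf` and the variable names are illustrative and not from the text, and the exact value is computed directly from the binomial formula rather than read from a table.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, pi = 25, 0.2                      # sample size and population proportion
sigma_p = sqrt(pi * (1 - pi) / n)    # standard deviation of p, here .08

# Normal approximation without any correction: z = +/- 1.5
z_hi = (0.32 - pi) / sigma_p
z_lo = (0.08 - pi) / sigma_p
approx = normal_cdf(z_hi) - normal_cdf(z_lo)

# Exact binomial: p from .08 to .32 inclusive means 2 to 8 successes out of 25
exact = sum(comb(n, k) * pi**k * (1 - pi)**(n - k) for k in range(2, 9))

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The two printed values should agree with the .8664 and .9258 quoted above, up to rounding.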


10.3.2 Continuity Correction

The normal approximation may be improved by using what is called a continuity
correction. In the transformation to z, for the upper tail, a continuity correction of
1/(2n) is added to p. Since n = 25, the amount of the correction factor is 1/50, or .02.
For the lower tail, the same continuity correction is subtracted from p. In the example
of the previous paragraphs,


$$z = \frac{(.32 + .02) - .2}{.08} = \frac{.14}{.08} = 1.75$$

From Table A.2, about .9599 of sample proportions lie < .34. In this example, this
is equivalent to saying that .9599 of the sample proportions are ≤ .32, inasmuch as
it is impossible with a sample of size 25 to obtain a p between .32 and .36. Then,
1 - .9599 = .0401, or 4.01% of the sample proportions are > .32 and by symmetry
4.01% of the sample proportions are < .08. So 91.98% of the sample proportions
lie from .08 to .32 inclusive. Note that using the continuity correction factor resulted
in an area closer to .9258, which was the exact answer from a table of the binomial
distribution.
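The corrected calculation can be reproduced in a few lines. This is a sketch only, again assuming π = .2 and n = 25 from the example; `normal_cdf` is an illustrative helper standing in for Table A.2.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, pi = 25, 0.2
sigma_p = sqrt(pi * (1 - pi) / n)    # .08
cc = 1 / (2 * n)                     # continuity correction, 1/50 = .02

# Upper tail: add the correction to p; lower tail: subtract it
z_upper = ((0.32 + cc) - pi) / sigma_p
z_lower = ((0.08 - cc) - pi) / sigma_p

corrected = normal_cdf(z_upper) - normal_cdf(z_lower)
print(f"z_upper = {z_upper:.2f}, z_lower = {z_lower:.2f}")
print(f"corrected area from .08 to .32: {corrected:.4f}")
```

The result should match the 91.98% found above, up to rounding in the table lookup.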
In general, the formula can be written as

$$z = \frac{(p - \pi) \pm 1/(2n)}{\sqrt{\pi(1 - \pi)/n}}$$

If (p - π) is positive, the plus sign is used. If (p - π) is negative, the minus sign is
used. As n increases the size of the continuity correction decreases and it is seldom
used for large n. For example, for n = 100, the correction factor is 1/200 or .005,
and even for n = 50, it is only .01.
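As an illustration, the sign rule and the shrinking correction can be written as a small function. This is a sketch, not a definitive implementation: the name `corrected_z` is hypothetical, and the handling of p exactly equal to π (no correction) is an assumption, since the text does not address that case.

```python
from math import sqrt

def corrected_z(p, pi, n):
    """z for a sample proportion p, using the 1/(2n) continuity correction."""
    diff = p - pi
    cc = 1.0 / (2 * n)
    if diff > 0:
        numerator = diff + cc        # upper tail: plus sign
    elif diff < 0:
        numerator = diff - cc        # lower tail: minus sign
    else:
        numerator = 0.0              # assumption: no correction when p == pi
    return numerator / sqrt(pi * (1 - pi) / n)

# The worked example: p = .32, pi = .2, n = 25
z_example = corrected_z(0.32, 0.2, 25)

# Size of the correction for several sample sizes
corrections = {n: 1.0 / (2 * n) for n in (25, 50, 100)}

print(f"z = {z_example:.2f}")
for n, cc in corrections.items():
    print(f"n = {n:3d}: correction = {cc:.3f}")
```

The loop shows the correction falling from .02 at n = 25 to .005 at n = 100, which is why it matters little for large samples.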

10.4 CONFIDENCE INTERVALS FOR A SINGLE POPULATION
PROPORTION

Suppose that we wish to estimate the proportion of patients given a certain treatment
who recover. If in a group of 50 patients assigned the treatment, 35 of the 50 recover,
the best estimate for the population proportion who recover is 35/50 = .70, the
sample proportion. We then compute a confidence interval in an attempt to show
where the population proportion may actually lie.
The confidence interval is constructed in much the same way as it was constructed
for μ, the population mean, in Section 7.1.2. The sample proportion p is approximately
normally distributed, with mean π and with standard deviation $\sigma_p = \sqrt{\pi(1 - \pi)/n}$,