Advanced High-School Mathematics

CHAPTER 6 Inferential Statistics


that a Gallup Poll survey of 10,000 voters led to the prediction that
51% of American voters preferred Kerry over Bush, with a sampling
error of ±3% and a confidence level of 95%. What this means, of
course, is the essence of confidence intervals for proportions.


The methods of this section are based on the assumption that large
enough samples are taken from an even larger binomial population, so
that the test statistic (the sample proportion) can be assumed to be
normally distributed. Thus, we are going to be sampling from a very
large binomial population, i.e., one whose members are of exactly two
types, A and B. If the population size is $N$, then the population
proportion $p$ is defined to be the ratio of the number of members of
type A to $N$. When sampling from this population, we need the
population size to be rather large compared with the sample size. In
practice, the sampling is typically done without replacement, which
strictly speaking would lead to a hypergeometric distribution. However,
if the population size is much larger than the sample size, then the
individual selections can be regarded as independent of each other,
whether or not the sampling is done with replacement. Once a sample of
size $n$ has been taken, the sample proportion $\hat{p}$ is the
statistic measuring the ratio of the number of type A members selected
to the sample size $n$.
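
The claim that sampling without replacement is nearly binomial when the population is much larger than the sample can be checked numerically. The following Python sketch compares the hypergeometric and binomial probabilities for a sample of size 20. The function names, the population sizes tried, and the assumption that exactly half the population is of type A are illustrative choices and do not come from the text.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """Probability of exactly k type A members in a sample of size n drawn
    WITHOUT replacement from a population of N containing K type A members."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """Probability of exactly k type A members in n independent selections,
    each of type A with probability p (sampling WITH replacement)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def max_pmf_gap(N, K, n):
    """Largest difference between the two probabilities over k = 0, ..., n."""
    p = K / N
    return max(abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, p))
               for k in range(n + 1))

n = 20  # sample size (illustrative)
for N in (50, 500, 5000, 50000):
    K = N // 2  # assumption: half the population is of type A, so p = 0.5
    print(N, max_pmf_gap(N, K, n))
```

The printed gap should shrink as the population size grows relative to the sample size, which is the sense in which the selections may be treated as independent.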


Assume, then, that we have a large population where $p$ is the
proportion of type A members. Each time we randomly select a member
of this population, we have sampled a Bernoulli random variable $B$
whose mean is $p$ and whose variance is $p(1-p)$. By the Central Limit
Theorem, when $n$ is large, the sum $B_1 + B_2 + \cdots + B_n$ of $n$
independent Bernoulli random variables, each having mean $p$ and
variance $p(1-p)$, has approximately a normal distribution with mean
$np$ and variance $np(1-p)$. The random variable


$$\hat{P} = \frac{B_1 + B_2 + \cdots + B_n}{n}$$

is therefore approximately normally distributed (when $n$ is large) and
has mean $p$ and variance $\frac{p(1-p)}{n}$, since dividing the sum by
$n$ divides its mean by $n$ and its variance by $n^2$.
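
As a numerical check on the mean and variance of the sample proportion, the following Python sketch computes them exactly from the binomial distribution of the sum of $n$ Bernoulli variables. It also computes the probability that the sample proportion falls within 1.96 standard deviations of $p$, which a normal distribution would put at about 95%. The function names and the values $n = 400$ and $p = 0.5$ are illustrative assumptions and are not taken from the text.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """Probability that the sum B_1 + ... + B_n equals k."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def sample_proportion_moments(n, p):
    """Exact mean and variance of the sample proportion (sum divided by n)."""
    mean = sum((k / n) * binom_pmf(k, n, p) for k in range(n + 1))
    var = sum((k / n - mean) ** 2 * binom_pmf(k, n, p) for k in range(n + 1))
    return mean, var

def coverage(n, p, z=1.96):
    """Exact probability that the sample proportion lies within z standard
    deviations of p, where the standard deviation is sqrt(p(1-p)/n)."""
    sigma = sqrt(p * (1 - p) / n)
    return sum(binom_pmf(k, n, p) for k in range(n + 1)
               if abs(k / n - p) <= z * sigma)

n, p = 400, 0.5  # illustrative values, not from the text
mean, var = sample_proportion_moments(n, p)
print(mean, var, p * (1 - p) / n)  # var should agree with p(1-p)/n
print(coverage(n, p))              # should be close to 0.95 for large n
```

Because the sample proportion takes only the values $0, 1/n, \ldots, 1$, the computed probability will not equal 0.95 exactly; it only approaches the normal value as $n$ grows.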
