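Let M be the desired maximum margin of error. Then,
$$z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le M.$$
Solving for n,
$$n \ge \left(\frac{z^*}{M}\right)^2 \hat{p}(1-\hat{p}).$$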
But we do not have a value of $\hat{p}$ until we collect data, so we need a way to estimate $\hat{p}$. Let P* =
estimated value of $\hat{p}$. Then
$$n \ge \left(\frac{z^*}{M}\right)^2 P^*(1-P^*).$$
There are two ways to choose a value of P *:
- Use a previously determined value of $\hat{p}$. That is, you may already have an idea, based on historical
data, about what the value should be close to.
- Use P* = 0.5. A result from calculus tells us that the expression $P^*(1-P^*)$
achieves its maximum value when P* = 0.5. Thus, n will be at its maximum if P* = 0.5. If P* =
0.5, the formula for n can more easily be expressed as
$$n \ge \left(\frac{z^*}{2M}\right)^2.$$
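It is in your interest to choose the smallest value of n that will match your goals. Because
P*(1 - P*) is symmetric about 0.5, any value of P* other than 0.5, whether smaller or larger,
gives a smaller n, so such a value would be preferable if you have some justification for it.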
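The effect of the choice of P* on n can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `required_n` is hypothetical, and the inputs (z* = 1.96 for 95% confidence, M = 0.03) are chosen only for illustration.

```python
import math

def required_n(margin, z_star, p_star=0.5):
    """Smallest integer n satisfying n >= (z*/M)^2 * P*(1 - P*)."""
    return math.ceil((z_star / margin) ** 2 * p_star * (1 - p_star))

# Illustrative inputs (assumed here): 95% confidence, margin of error 0.03.
for p_star in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p_star, required_n(0.03, 1.96, p_star))
```

The required n should peak at P* = 0.5 and fall off equally on either side, which is why 0.5 is the safe choice when nothing is known in advance.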
example: Historically, about 60% of a company’s products are purchased by people who have
purchased products from the company previously. The company is preparing to introduce a new
product and wants to generate a 95% confidence interval for the proportion of its current
customers who will purchase the new product. They want to be accurate within 3%. How many
customers do they need to sample?
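solution: Based on historical data, choose P* = 0.6. For 95% confidence, z* = 1.96, and M = 0.03. Then
$$n \ge \left(\frac{1.96}{0.03}\right)^2 (0.6)(0.4) \approx 1024.4.$$
The company needs to sample 1025 customers. Had it not had the historical data, it would have
had to use P* = 0.5, which gives $n \ge (1.96/0.06)^2 \approx 1067.1$, or 1068 customers.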
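The arithmetic in this example can be reproduced with a short script. This is a sketch under stated assumptions: the helper names `z_critical` and `sample_size` are hypothetical, and z* is computed from the standard normal distribution with Python's `statistics.NormalDist` rather than read from a table, so it is about 1.95996 instead of the rounded 1.96.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z* for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size(margin, confidence, p_star=0.5):
    """Smallest integer n with n >= (z*/M)^2 * P*(1 - P*)."""
    z = z_critical(confidence)
    return math.ceil((z / margin) ** 2 * p_star * (1 - p_star))

with_history = sample_size(0.03, 0.95, p_star=0.6)  # uses the historical 60%
conservative = sample_size(0.03, 0.95)              # falls back to P* = 0.5
print(with_history, conservative)
```

Rounding up with `math.ceil` matters here: rounding 1024.4 down to 1024 would leave the margin of error slightly above 3%.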