Basic Statistics


An approximate estimate of the needed sample size, n, in each group is computed
from the following formula:


$$
n = \frac{\left[\, z[1-\alpha/2]\sqrt{2\bar{\pi}(1-\bar{\pi})} + z[1-\beta]\sqrt{\pi_1(1-\pi_1) + \pi_2(1-\pi_2)} \,\right]^2}{(\pi_1 - \pi_2)^2}
$$

Substituting in the numbers for the example,

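$$
n = \frac{\left[\, 1.96\sqrt{2(.16)(.84)} + .842\sqrt{.12(.88) + .20(.80)} \,\right]^2}{(.12 - .20)^2}
$$

or

$$
n = \frac{\left[\, 1.96\sqrt{.2688} + .842\sqrt{.2656} \,\right]^2}{.0064} = \frac{[1.01618 + .433936]^2}{.0064} = 328.6
$$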

which is rounded up to n = 329 observations in each treatment group. For small n,
an approximate continuity correction factor is often added so that the corrected n’ is


$$
n' = n + \frac{2}{|\pi_1 - \pi_2|}
$$

where $|\pi_1 - \pi_2|$ denotes the positive difference between $\pi_1$ and $\pi_2$. In our example,


$$
n' = 329 + \frac{2}{.08} = 329 + 25 = 354
$$
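
The calculation above can be checked with a short script. What follows is a minimal Python sketch, not taken from the text: the function names are hypothetical, and $\bar{\pi}$ is assumed to be the simple average of $\pi_1$ and $\pi_2$ (an assumption that reproduces the 1.01618 term in the example). The sketch uses unrounded normal quantiles instead of 1.96 and .842, so the unrounded n differs slightly from 328.6 but still rounds up to 329.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(pi1, pi2, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided test of two proportions.

    Hypothetical helper: pi_bar is ASSUMED to be the average of pi1 and pi2.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # z[1 - alpha/2], about 1.96 for alpha = .05
    z_beta = z(power)            # z[1 - beta], about .842 for power = .80
    pi_bar = (pi1 + pi2) / 2     # assumption, see lead-in
    bracket = (z_alpha * sqrt(2 * pi_bar * (1 - pi_bar))
               + z_beta * sqrt(pi1 * (1 - pi1) + pi2 * (1 - pi2)))
    return bracket ** 2 / (pi1 - pi2) ** 2


def continuity_corrected(n, pi1, pi2):
    """Add the approximate continuity correction 2 / |pi1 - pi2| to n."""
    return n + 2 / abs(pi1 - pi2)


n_raw = sample_size_two_proportions(0.12, 0.20)
n = ceil(n_raw)                                           # round up to whole observations
n_corrected = ceil(continuity_corrected(n, 0.12, 0.20))   # corrected sample size
print(n, n_corrected)
```

For the example values ($\pi_1 = .12$, $\pi_2 = .20$) this gives 329 observations per group before the correction and 354 after it.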


One source of confusion in estimating sample size for differences in proportions
during the planning stage of the study is that there are several slightly different
formulas in use as well as different correction factors. In the main, they lead to
similar estimates for n, but there is not precise agreement between various tables
and statistical programs. For a one-sided test with $\alpha = .05$, we use
$z[1 - \alpha]$, or 1.645.
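
The two critical values can be looked up from the standard normal distribution. As a small illustration (the helper name `z_quantile` is hypothetical, not from the text), Python's standard library gives both values directly:

```python
from statistics import NormalDist


def z_quantile(p):
    """Return z[p], the standard normal value with area p to its left."""
    return NormalDist().inv_cdf(p)


alpha = 0.05
z_two_sided = z_quantile(1 - alpha / 2)  # used for a two-sided test
z_one_sided = z_quantile(1 - alpha)      # used for a one-sided test
print(round(z_two_sided, 3), round(z_one_sided, 3))  # 1.96 1.645
```

Because the one-sided value is smaller, a one-sided test needs a somewhat smaller n for the same power.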

10.8 DATA ENTRY AND ANALYSIS USING STATISTICAL
PROGRAMS

Data entry of categorical data is simplified if a consistent numbering system is used
for coding the data; for example, we could code success as 1 and failure as 0 for
each question. Missing values should be coded as given in the instructions for the
particular program used.
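
A consistent coding scheme is easy to enforce if the mapping is written down once and applied to every answer. The sketch below is only an illustration: the names `CODES`, `MISSING`, and `code_response` are hypothetical, and `None` is an assumed missing-value marker that should be replaced by whatever the chosen program expects.

```python
# Hypothetical coding scheme: success = 1, failure = 0.
CODES = {"success": 1, "failure": 0}

# Assumed missing-value marker; use the code required by the program at hand.
MISSING = None


def code_response(raw):
    """Map one raw answer to its numeric code, or MISSING if unrecognized."""
    if raw is None:
        return MISSING
    # Normalize case and surrounding spaces so "Success " and "success" agree.
    return CODES.get(raw.strip().lower(), MISSING)


answers = ["Success", "failure", "", None, "success "]
coded = [code_response(a) for a in answers]
print(coded)  # [1, 0, None, None, 1]
```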
In general, most computer programs do not calculate the confidence limits or tests
of hypothesis for a single proportion described in this chapter. They do, however,
provide for a count of the number of observations that occur in each category of
categorical variables, and then compute proportions or percents. They will also make
the graphical displays described at the beginning of this chapter.
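
The counting step that such programs perform can be sketched in a few lines. This is a minimal Python illustration, not the output of any particular package; `frequency_table` is a hypothetical helper, and it assumes missing values are marked with `None` and should be left out of the denominator.

```python
from collections import Counter


def frequency_table(values, missing=None):
    """Return {category: (count, proportion)} for the non-missing values."""
    observed = [v for v in values if v != missing]  # drop missing codes
    counts = Counter(observed)
    total = len(observed)
    return {category: (count, count / total)
            for category, count in sorted(counts.items())}


# Coded answers to one question: 1 = success, 0 = failure, None = missing.
table = frequency_table([1, 0, 1, 1, None])
for category, (count, proportion) in table.items():
    print(category, count, f"{100 * proportion:.1f}%")
```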
Minitab will compute the exact probability of k successes from the binomial distribution.
Stata will compute exact confidence limits. In SAS, the PROC FREQ and TABLES
