Basic Statistics

16 POPULATIONS AND SAMPLES

be either the same fraction from both strata (say 10%) or different fractions. If the
same proportion of students is taken from the male and female strata, the investigator
has at least made sure that some female students will be drawn. The investigator may
decide to sample a higher proportion of female students than male students in order
to increase the number of females available for analysis. The latter type of sample is
called a disproportionate stratified sample.
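The difference between the same and different sampling fractions can be illustrated with a short sketch. The code below is only an illustration under stated assumptions: the population of 3000 male and 1000 female students, the 10% and 30% fractions for the disproportionate case, and the function name stratified_sample are all hypothetical and do not come from the text.

```python
import random


def stratified_sample(strata, fractions, seed=None):
    """Draw a simple random sample (without replacement) from each stratum.

    strata    -- dict mapping a stratum name to a list of units
    fractions -- dict mapping a stratum name to its sampling fraction
    """
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n = round(len(units) * fractions[name])  # units to draw from this stratum
        sample[name] = rng.sample(units, n)
    return sample


# Hypothetical population: 3000 male and 1000 female students.
strata = {
    "male": [f"M{i}" for i in range(1, 3001)],
    "female": [f"F{i}" for i in range(1, 1001)],
}

# Same fraction (10%) from both strata.
proportionate = stratified_sample(strata, {"male": 0.10, "female": 0.10}, seed=1)

# Higher fraction of female students (30%, an assumed value) than male students.
disproportionate = stratified_sample(strata, {"male": 0.10, "female": 0.30}, seed=1)

for label, s in [("same fraction", proportionate), ("different fractions", disproportionate)]:
    print(label, {name: len(units) for name, units in s.items()})
```

With these assumed numbers, the same-fraction design yields 300 male and 100 female students, while the disproportionate design raises the number of female students to 300.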
Another method of sampling is called systematic sampling. To obtain a systematic
sample of size 400 from a population of 4000 undergraduates, one begins with a
list of the undergraduates numbered from 1 to 4000. Next, the sampling interval
k = 4000/400, or 10, is computed. Then, a random number between 1 and k (in this
case between 1 and 10) is chosen. Note that methods for choosing random numbers
are given in Section 2.3. Suppose that the random number turns out to be 4. Then
the first student chosen is number 4, and every kth (here, every 10th) student after that is chosen.
In this example, number 4 is the first student, 4 + 10 = 14 is the second student
chosen, 14 + 10 = 24 is the third student chosen, and so on.
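The procedure just described can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the function name systematic_sample and its parameters are hypothetical, and the sketch assumes that the population size is an exact multiple of the sample size, as in the example of 4000 and 400.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the list positions (1-based) selected by systematic sampling.

    Assumes population_size is an exact multiple of sample_size.
    If start is None, a random start between 1 and k is drawn.
    """
    k = population_size // sample_size  # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start in 1..k inclusive
    return [start + i * k for i in range(sample_size)]


# Reproduce the example in the text: k = 4000/400 = 10 and a random start of 4.
chosen = systematic_sample(4000, 400, start=4)
print(chosen[:3])   # first three positions chosen
print(len(chosen))  # sample size
```

Passing start=4 fixes the random start so that the output matches the worked example (4, 14, 24, and so on); omitting it lets the function draw the start at random.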
There are several advantages to using systematic samples. They are easy to carry
out and readily acceptable to staff and investigators. They work well when there is
a time ordering for entry of the observational units. For example, sampling every
kth patient entering a clinic (after a random start) is a straightforward process. This
method of sampling is also often used when sampling from production lines. Note
that in these last two examples we did not have a list of all the observational units in
the population. Systematic samples have the advantage of spreading the sample out
evenly over the population and sometimes over time.
Systematic sampling also has disadvantages. One of the major disadvantages is a
theoretical one. In Section 4.2 we define a measure of variation called the variance.
Theoretically, there is a problem in estimating the variance from systematic samples,
but in practice, most investigators ignore this problem and compute the variance
as if a simple random sample had been taken. When a periodic trend exists in the
population, it is possible to obtain a poor sample if your sampling interval corresponds
to the periodic interval. For example, if you were sampling daily sales of prescription
drugs at a drugstore and sales are higher on weekends than on weekdays, then if
your sampling interval were 7 days, you would either miss weekends or get only
weekends. If it can be assumed that the population is randomly ordered, this problem
does not exist. Another disadvantage of systematic samples in medical studies is that
sometimes the medical personnel who enter the patient into the study can learn what
treatment the next patient who enters the study will be assigned, and if they think one
treatment is better than another, they may try to affect the selection process.
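The periodic-trend problem in the drugstore example can be checked with a small sketch. Everything specific here is an assumption made for illustration: a 364-day period (52 full weeks), day 1 falling on a Monday, and the helper names is_weekend and weekend_share.

```python
def is_weekend(day):
    """True if the given day number falls on a weekend.

    Assumes day 1 is a Monday, so days 6 and 7 of each week are the weekend.
    """
    return (day - 1) % 7 >= 5


def weekend_share(start, interval, n_days=364):
    """Fraction of weekend days in a systematic sample of days start, start + interval, ..."""
    days = range(start, n_days + 1, interval)
    picks = [is_weekend(d) for d in days]
    return sum(picks) / len(picks)


# An interval of 7 days matches the weekly cycle: the sample contains
# either no weekend days or only weekend days, depending on the start.
print(weekend_share(start=3, interval=7))  # start on a Wednesday
print(weekend_share(start=6, interval=7))  # start on a Saturday

# An interval of 10 days does not line up with the weekly cycle, so the
# sample mixes weekdays and weekends.
print(weekend_share(start=3, interval=10))
```

In the population, 2 of every 7 days are weekend days. The 7-day interval gives a share of either 0 or 1, while the 10-day interval gives a share much closer to 2/7.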
For a more complete discussion of these and other types of samples, see Kalton
[1983], Barnett [1994], Levy and Lemeshow [1999], Scheaffer et al. [2006], or for
very detailed discussions, Kish [1965]. Many types of samples are in common use; in
this book it is always assumed that we are studying a simple random sample. That is,
the formulas will only be given assuming that a simple random sample has been taken.
These formulas are the ones that are in common use and are used by investigators for
a variety of types of samples.
