The Essentials of Biostatistics for Physicians, Nurses, and Clinicians

28 CHAPTER 2 Sampling from Populations

households in a particular city, the city could be divided up into blocks.
A subset of the city blocks is selected at random, and each household
on a selected block is included in the sample. Cluster sampling is a
convenient and economical way for organizations such as the U.S. Census
Bureau to conduct surveys. For example, we may take the residents
of Manhattan, New York, as the population. Every city block in
Manhattan is eligible for selection, a random sample of city blocks
is taken, and every household on the chosen blocks is included.
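The block-then-household procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name cluster_sample, the toy city of 20 blocks with 5 households each, and the fixed seed are assumptions made for the example, not part of the text.

```python
import random


def cluster_sample(households_by_block, n_blocks, seed=None):
    """Pick n_blocks blocks at random and keep every household on them."""
    rng = random.Random(seed)
    # Sample blocks (the clusters), not individual households.
    chosen_blocks = rng.sample(sorted(households_by_block), n_blocks)
    # Every household on a chosen block enters the sample.
    sample = [h for block in chosen_blocks for h in households_by_block[block]]
    return chosen_blocks, sample


# Hypothetical city: 20 blocks, each with 5 households.
city = {
    f"block_{b}": [f"block_{b}_house_{h}" for h in range(5)]
    for b in range(20)
}

blocks, households = cluster_sample(city, n_blocks=4, seed=0)
print(blocks)
print(len(households))
```

Note that the randomness enters only at the block level; once a block is chosen, no further selection takes place within it.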
Convenience sampling and systematic sampling are both nonrandom
methods and are not recommended in general. In special cases,
these methods may work, but often they don't. Systematic sampling
can be used (though it is not necessarily recommended) when an ordered
list of the population members is available. Samples are chosen by a
systematic algorithm. For example, if the population size N = 500, and we
want a sample of size n = 100, we can choose every fifth case on the list,
such as those with indices 1, 6, 11, 16, 21, 26, 31, ..., 491, and 496.
This is not the only way: if we skip 1, we can obtain the sample by
choosing 2, 7, 12, 17, ..., 492, and 497. We could also start with the
third, fourth, or fifth index in the sequence. If we start with 5, the
sequence is 5, 10, 15, 20, 25, ..., 495, and 500.
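The every-fifth-case selection above can be reproduced with a short Python sketch. The function name systematic_sample and its parameters are hypothetical, and the sketch assumes the population size is a multiple of the sample size, as in the example of 500 and 100.

```python
def systematic_sample(population_size, sample_size, start):
    """Return 1-based indices: every k-th case beginning at `start`."""
    step = population_size // sample_size  # k = 5 for 500 and 100
    if not 1 <= start <= step:
        raise ValueError("start must lie between 1 and the step size")
    indices = list(range(start, population_size + 1, step))
    return indices[:sample_size]


first = systematic_sample(500, 100, start=1)
second = systematic_sample(500, 100, start=2)
fifth = systematic_sample(500, 100, start=5)

print(first[:4], first[-2:])    # begins 1, 6, 11, 16 and ends 491, 496
print(second[:4], second[-2:])  # begins 2, 7, 12, 17 and ends 492, 497
print(fifth[:4], fifth[-2:])    # begins 5, 10, 15, 20 and ends 495, 500
```

Each of the five possible starting points yields a different sample of 100 cases, and together the five samples cover the whole list exactly once.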
Systematic sampling can work if the ordering has no relationship
to the value of the outcome variable. A case where systematic sampling
can fail is when the outcomes are cyclical in time. For instance, if the
pattern is sinusoidal and the period is 5 units, then we could be
sampling at the peaks of the cycle when we pick every fifth case in sequence
and the first case is a peak. This would lead to a positive bias in the
estimate of the outcome variable's mean. On the other hand, starting
at a trough would create a negative bias in the estimate of the outcome
variable's mean.
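The bias from a cyclical ordering can be illustrated numerically. The sketch below assumes a noise-free outcome of the form cos(2*pi*(t - 1)/5) for cases t = 1, ..., 500, chosen so that case 1 sits exactly on a peak; the variable and function names are hypothetical. With a period of 5 and integer indices, no case falls exactly on a trough, so the third case (one of the two lowest points in each cycle) stands in for it.

```python
import math

N = 500      # population size
PERIOD = 5   # cycle length, equal to the sampling step

# Assumed outcome: a pure cosine whose peak falls on cases 1, 6, 11, ...
outcomes = [math.cos(2 * math.pi * (t - 1) / PERIOD) for t in range(1, N + 1)]


def mean(values):
    return sum(values) / len(values)


def systematic_mean(start, step=PERIOD):
    """Mean of every step-th outcome, beginning at 1-based index `start`."""
    return mean(outcomes[start - 1::step])


population_mean = mean(outcomes)        # essentially 0 over 100 full cycles
peak_mean = systematic_mean(start=1)    # essentially 1: every pick is a peak
low_mean = systematic_mean(start=3)     # about -0.81: every pick is near a trough

print(population_mean, peak_mean, low_mean)
```

Because the sampling step equals the period, every selected case sits at the same phase of the cycle, so the sample mean reflects that phase rather than the population mean.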
Convenience sampling simply means that you find a sample of size
n out of the population of size N in a simple and convenient way. There
is no way to draw inference from such a sample. Convenience sampling
should never be recommended.


2.5 GENERATING BOOTSTRAP SAMPLES


Bootstrap sampling is simple random sampling from the observed data
(also called the empirical distribution). It amounts to sampling with
