14 POPULATIONS AND SAMPLES
the group of 100 people is called a sample; the group from which the sample is taken
is called the population (another word for the same concept is universe).
As an example from public health, consider an epidemiologist who wishes to study
whether nonsmoking students at a certain university exercise more than students who smoke.
Here the population is all undergraduate students at the university; suppose there are 4000 of them.
Rather than interviewing all 4000 students, the epidemiologist can take a sample of 400
students from student records and interview them about their smoking status and the
amount of exercise they get. The 400 students are a sample from a
population of 4000 students.
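The relationship between this population and its sample can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how the 400 students are chosen, so the sketch assumes a simple random sample drawn without replacement, and the ID numbers 1 through 4000 are hypothetical stand-ins for the student records.

```python
import random

# Hypothetical sampling frame: ID numbers standing in for the 4000 student records.
population = list(range(1, 4001))

# Assumption: a simple random sample without replacement (the text does not
# specify the selection method). The seed is fixed only for reproducibility.
rng = random.Random(2024)
sample = rng.sample(population, k=400)

# Fraction of the population that ends up in the sample.
sampling_fraction = len(sample) / len(population)

print(len(population), len(sample))  # 4000 400
print(sampling_fraction)             # 0.1
```

Every ID in the sample also appears in the population, and no ID appears twice, which matches the idea of the sample as a subset of the population.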
Each student in the sample is sometimes called an observational unit. More
commonly, if they are people, they are referred to as individuals, cases, patients, or
subjects. Populations are usually described in terms of their observational units, extent
(coverage), and time. For example, for the population of students, the observational
units are undergraduates, the extent is, say, Midwest University, and the time is the fall
term of a particular year. Careful definition of the population is essential for selection of the sample.
The items of information measured or obtained from each undergraduate in the
sample are referred to as variables. In this example, smoking status and amount of
exercise are the two variables being measured.
In the two examples just given, the population from which the samples are taken
is the population that the investigator wishes to learn about. But in some cases, the
investigator expects that the results will apply to a larger population, often called
a target population. Consider a physician who wishes to evaluate a new treatment for
a certain illness. The new treatment has been given by the physician to 100 patients
who are a sample from the patients seen during the current year. The physician is
not primarily interested in the effects of the treatment on these 100 patients or on the
patients treated during the current year, but rather in how good the treatment might
be for any patient with the same medical condition. The population of interest, then,
consists of all patients who might have this particular illness and be given the new
treatment. Here, the target population is a figment of the imagination; it does not exist,
and indeed the sample of 100 patients who have been given this treatment may be the
only ones who will ever receive it. Nevertheless, this hypothetical target population is
actually the one of interest, since the physician wishes to evaluate the treatment as if it
applies to patients with the same illness. We might use the following definitions: The
target population is the set of patients or observational units that one wishes to study;
the population is what we sample from; and the sample is a subset of the population
and is the set of patients or observational units that one actually does study.
Questions arise immediately. Why do we study a sample rather than the entire
population? If we desire information concerning an entire population, why gather
the information from just a sample? Often the population is so large that it would be
virtually impossible to study all of it; even where it is possible, doing so may be too costly
in time and money. If the target population consists of all possible patients suffering
from a certain illness and given a certain treatment, then no matter how many patients
are studied, they must still be considered to be a sample from a very much larger
population.