The Essentials of Biostatistics for Physicians, Nurses, and Clinicians


A subsequent analysis of their sampling method indicated that
the original mailing list of 10 million was based primarily on
telephone directories and motor vehicle registration lists. In modern
times, such a sampling method would be acceptable, since the
percentage of eligible voters who have telephones and driver's licenses is
nearly 100%.
But in 1936, the United States was recovering from the Great
Depression, and telephones and automobiles were luxuries. So a large
majority of the people with telephones and/or cars were affluent.
Affluent Americans tended to be Republicans, and were much more
likely to vote for Landon than were the Democrats, many of whom were
excluded by this sampling mechanism. Because poor and middle-income
Americans represented a much larger portion of American
society in 1936, and were more likely to vote for Roosevelt,
this created a large bias that was not recognized by those
at the Literary Digest who were conducting the survey. This shows that
a sample not chosen at random may appear on the surface to be like a
random sample, yet have a large enough bias to get the prediction
wrong. If a truly random sample of 2.3 million registered voters likely
to vote had been selected and the true proportion that would vote for
Roosevelt were 62%, then it would have been nearly impossible for the
survey to pick Landon.
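
The "nearly impossible" claim can be checked with a rough calculation. The sketch below is an illustration rather than the author's own computation. It assumes a simple random sample, a two-candidate contest in which the survey "picks Landon" whenever Roosevelt's sample share falls below 50%, and the usual normal approximation to the sample proportion. The function name and the 50% cutoff are assumptions introduced here.

```python
import math

def prob_sample_share_below_half(p_true: float, n: int):
    """Normal approximation to P(sample proportion < 0.5).

    p_true: assumed true proportion voting for the candidate
    n:      size of a simple random sample (assumption)
    Returns (standard_error, z_score, probability).
    """
    # Standard error of a sample proportion under simple random sampling
    se = math.sqrt(p_true * (1.0 - p_true) / n)
    # Number of standard errors between the 50% cutoff and the true proportion
    z = (0.5 - p_true) / se
    # Standard normal CDF at z, written with the complementary error function
    prob = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return se, z, prob

# Figures taken from the text: 2.3 million respondents, 62% for Roosevelt
se, z, prob = prob_sample_share_below_half(0.62, 2_300_000)
print(f"standard error = {se:.6f}, z = {z:.1f}, probability = {prob:.3g}")
```

Under these assumptions the standard error is roughly 0.0003, so a sample share below 50% would lie several hundred standard errors from the true value, and the probability is indistinguishable from zero. The calculation says nothing about a biased sample, which is the point of the Literary Digest example.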


2.1 DEFINITIONS OF POPULATIONS AND SAMPLES


At this stage, we have discussed populations and samples informally.
Now, as we get into the details of random samples and other
sampling methods, we will be more formal. The term population
refers to a collection of people, animals, or objects that we are
interested in studying. Usually, the members of the population share
some common characteristic that interests us. For example, the population
could be the set of all Americans who have type II diabetes. A sample
is a subset of this population that is used to draw inferences
about the population. In this example, we might have a drug like
metformin that we think will control blood sugar levels for these
patients. There may be millions of Americans who have type II
diabetes.
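
The distinction between a population and a sample can be made concrete with a small sketch. The population below is hypothetical: it is just a range of ID numbers standing in for patients, and the sizes, the seed, and the function name are illustrative assumptions, not figures from the text.

```python
import random

def draw_simple_random_sample(population_size: int, sample_size: int, seed: int):
    """Draw a simple random sample of IDs, without replacement.

    Every ID in range(population_size) is equally likely to be chosen.
    """
    rng = random.Random(seed)          # seeded so the draw is reproducible
    population = range(population_size)  # hypothetical patient IDs
    return rng.sample(population, sample_size)

# Hypothetical sizes: a population of 1,000,000 IDs and a sample of 500
sample_ids = draw_simple_random_sample(1_000_000, 500, seed=2024)
print(len(sample_ids), "IDs drawn; first five:", sample_ids[:5])
```

The sample is a subset of the population, and inferences about the whole population would be drawn from measurements taken only on the sampled members.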
