having a disease is the disease prevalence. For another example, suppose that
out of $N = 100{,}000$ persons of a certain target population, a total of 5500 are
positive reactors to a certain screening test; then the probability of being
positive, denoted by $\Pr(\text{positive})$, is
$$\Pr(\text{positive}) = \frac{5500}{100{,}000} = 0.055 \quad \text{or} \quad 5.5\%$$
A probability is thus a descriptive measure for a target population with
respect to a certain event of interest. It is a number between 0 and 1 (or between
0% and 100%); the larger the number, the larger the subpopulation. For the case of
continuous measurement, we have the probability of being within a certain
interval. For example, the probability of a serum cholesterol level between 180
and 210 (mg/100 mL) is the proportion of people in a certain target population
who have cholesterol levels falling between 180 and 210 (mg/100 mL). This is
measured, in the context of the histogram of Chapter 2, by the area of a
rectangular bar for the class (180–210). Now of critical importance in the
interpretation of probability is the concept of random sampling so as to associate
the concept of probability with uncertainty and chance.
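
As a small illustration of the two cases above, the following Python sketch computes a probability as a population proportion. The function names and the list of cholesterol values are hypothetical, invented only for this example, and the class 180–210 is assumed to include its lower limit but not its upper limit.

```python
def proportion(count, population_size):
    """Probability of an event as the proportion of the population having it."""
    return count / population_size


def interval_probability(values, low, high):
    """Proportion of measurements falling in the class [low, high).

    Assumption: the lower limit is included and the upper limit is excluded.
    """
    inside = sum(1 for v in values if low <= v < high)
    return inside / len(values)


# Screening example from the text: 5500 positive reactors out of N = 100,000.
pr_positive = proportion(5500, 100_000)
print(f"Pr(positive) = {pr_positive:.3f}")

# Hypothetical cholesterol levels (mg/100 mL) for a tiny made-up population.
cholesterol = [165, 182, 195, 204, 209, 210, 221, 188, 176, 199]
pr_interval = interval_probability(cholesterol, 180, 210)
print(f"Pr(180 <= cholesterol < 210) = {pr_interval:.2f}")
```

In both cases the probability is simply a count divided by the population size; only the definition of the event changes.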
Let the size of the target population be $N$ (usually a very large number). A
sample is any subset of the target population, say $n$ in number ($n < N$).
Simple random sampling from the target population is sampling so that every
possible sample of size $n$ has an equal chance of selection. For simple random
sampling:
- Each individual draw is uncertain with respect to any event or characteristic
under investigation (e.g., having a disease), but
- In repeated sampling from the population, the accumulated long-run relative
frequency with which the event occurs is the population relative
frequency of the event.
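
The two properties above can be seen in a short simulation. The sketch below is only an illustration: it reuses the screening figures from the earlier example (5500 positives among 100,000 persons), while the function name, the sample size of 100, the 200 repetitions, and the seed are all arbitrary assumptions.

```python
import random


def long_run_relative_frequency(population, n, repetitions, seed=0):
    """Accumulated relative frequency of the event over repeated simple random samples.

    `population` is a list of 0/1 indicators (1 = the event occurs).
    Returns the running relative frequency after each repetition.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    positives = 0
    drawn = 0
    history = []
    for _ in range(repetitions):
        # random.sample draws n distinct items; every subset of size n is equally likely
        sample = rng.sample(population, n)
        positives += sum(sample)
        drawn += n
        history.append(positives / drawn)
    return history


# 5500 positive reactors among N = 100,000 persons, as in the earlier example.
population = [1] * 5500 + [0] * 94500
history = long_run_relative_frequency(population, n=100, repetitions=200)

print("after 1 sample:   ", history[0])
print("after 200 samples:", history[-1])
print("population value: ", 5500 / 100_000)
```

The relative frequency from a single sample may be noticeably off, whereas the accumulated value should settle near the population relative frequency of 0.055 as the repetitions increase.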
The physical process of random sampling can be carried out as follows (or in
a fashion logically equivalent to the following steps).
- A list of all $N$ subjects in the population is obtained. Such a list is termed
a frame of the population. The subjects can thus be given an arbitrary
numbering (e.g., from 000 to $N = 999$). The frame is often based on a
directory (telephone, city, etc.) or on hospital records.
- A tag is prepared for each subject carrying a number $1, 2, \ldots, N$.
- The tags are placed in a receptacle (e.g., a box) and mixed thoroughly.
- A tag is drawn blindly. The number on the tag then identifies the subject
from the population; this subject becomes a member of the sample.
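
The steps above can be mimicked in a few lines of Python. This is a minimal sketch, not a prescribed procedure: the function name and the frame of ten labeled subjects are hypothetical, and repeating the blind draw $n$ times without returning a drawn tag to the box is an assumption made here to obtain a sample of size $n$.

```python
import random


def draw_simple_random_sample(frame, n, seed=None):
    """Imitate the tag-drawing procedure on a frame (a list of all N subjects)."""
    rng = random.Random(seed)
    # One tag per subject, carrying the numbers 1, 2, ..., N.
    tags = list(range(1, len(frame) + 1))
    # Place the tags in the box and mix thoroughly.
    rng.shuffle(tags)
    sample = []
    for _ in range(n):
        # Draw a tag blindly; it is not returned to the box (assumption).
        tag = tags.pop()
        # The number on the tag identifies the subject in the frame.
        sample.append(frame[tag - 1])
    return sample


# Hypothetical frame of N = 10 subjects.
frame = [f"subject-{i:03d}" for i in range(1, 11)]
sample = draw_simple_random_sample(frame, n=3, seed=1)
print(sample)
```

Because the tags are thoroughly mixed before drawing, every subset of three subjects has the same chance of ending up as the sample.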