Introductory Biostatistics

(Chris Devlin) #1

having a disease is the disease prevalence. For another example, suppose that
out of N = 100,000 persons of a certain target population, a total of 5500 are
positive reactors to a certain screening test; then the probability of being
positive, denoted by Pr(positive), is


$$\Pr(\text{positive}) = \frac{5500}{100{,}000} = 0.055 \quad \text{or } 5.5\%$$
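
The arithmetic above can be checked with a short sketch. The function name `probability` and its argument names are illustrative choices, not notation from the text; the counts are the ones quoted in the example.

```python
def probability(count: int, population_size: int) -> float:
    """Proportion of a target population that has the event of interest."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 <= count <= population_size:
        raise ValueError("count must lie between 0 and population_size")
    return count / population_size


# Counts from the screening-test example: 5500 positive reactors out of 100,000.
pr_positive = probability(5500, 100_000)
print(f"Pr(positive) = {pr_positive:.3f} or {pr_positive:.1%}")
# prints: Pr(positive) = 0.055 or 5.5%
```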

A probability is thus a descriptive measure for a target population with
respect to a certain event of interest. It is a number between 0 and 1 (or 0%
and 100%); the larger the number, the larger the subpopulation. For the case of
continuous measurement, we have the probability of being within a certain
interval. For example, the probability of a serum cholesterol level between 180
and 210 (mg/100 mL) is the proportion of people in a certain target population
who have cholesterol levels falling between 180 and 210 (mg/100 mL). This is
measured, in the context of the histogram of Chapter 2, by the area of a
rectangular bar for the class (180–210). Of critical importance in the
interpretation of probability is the concept of random sampling, which
associates the concept of probability with uncertainty and chance.
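
As a rough illustration of the interval case, the sketch below computes the proportion of measurements falling in a class interval. The cholesterol values are invented for illustration only, and treating the class as half-open (lower bound included, upper bound excluded) is an assumption; the text does not state a boundary convention.

```python
def interval_probability(values, low, high):
    """Proportion of values v with low <= v < high (half-open class, assumed)."""
    if not values:
        raise ValueError("values must not be empty")
    inside = sum(1 for v in values if low <= v < high)
    return inside / len(values)


# Hypothetical serum cholesterol levels (mg/100 mL), made up for this sketch.
levels = [165, 178, 182, 190, 195, 204, 209, 212, 230, 251]

p = interval_probability(levels, 180, 210)
print(p)  # 5 of the 10 values fall in [180, 210), so this prints 0.5
```

In a relative-frequency histogram scaled so that the bar areas sum to 1, this proportion is the area of the bar for the class 180–210.
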
Let the size of the target population be N (usually a very large number); a
sample is any subset, say n in number (n < N), of the target population.
Simple random sampling from the target population is sampling so that every
possible sample of size n has an equal chance of selection. For simple random
sampling:



  1. Each individual draw is uncertain with respect to any event or
    characteristic under investigation (e.g., having a disease), but

  2. In repeated sampling from the population, the accumulated long-run
    relative frequency with which the event occurs is the population relative
    frequency of the event.
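
The two properties above can be seen in a small simulation. This is a minimal sketch under stated assumptions: the population is hypothetical (it reuses the 5500-in-100,000 counts from the screening example), each draw picks one individual at random with replacement, and the function name `long_run_relative_frequency` is illustrative.

```python
import random


def long_run_relative_frequency(population, n_draws, seed=0):
    """Relative frequency of the event over n_draws independent random draws.

    population is a list of 0/1 flags (1 = the individual has the event).
    Draws are made with replacement (an assumption of this sketch).
    """
    if n_draws <= 0:
        raise ValueError("n_draws must be positive")
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(rng.choice(population) for _ in range(n_draws))
    return hits / n_draws


# Hypothetical population: 5500 individuals with the event out of 100,000.
population = [1] * 5500 + [0] * 94_500
population_frequency = sum(population) / len(population)

# Any single draw is uncertain, but the accumulated relative frequency
# settles near the population relative frequency as draws accumulate.
for n_draws in (100, 10_000, 200_000):
    estimate = long_run_relative_frequency(population, n_draws)
    print(f"{n_draws:>7} draws: {estimate:.4f} (population: {population_frequency:.4f})")
```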


The physical process of random sampling can be carried out as follows (or in
a fashion logically equivalent to the following steps).



  1. A list of all N subjects in the population is obtained. Such a list is termed
    a frame of the population. The subjects are thus available to an arbitrary
    numbering (e.g., from 000 to N = 999). The frame is often based on a
    directory (telephone, city, etc.) or on hospital records.

  2. A tag is prepared for each subject carrying a number 1, 2, ..., N.

  3. The tags are placed in a receptacle (e.g., a box) and mixed thoroughly.

  4. A tag is drawn blindly. The number on the tag then identifies the subject
    from the population; this subject becomes a member of the sample.
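
The tag-drawing steps listed so far can be mimicked in software. The sketch below is one logically equivalent version, not a prescribed procedure: the frame contents and the function name `draw_simple_random_sample` are hypothetical, and repeating the blind draw without replacement until n subjects are chosen is an assumption about how the remaining draws proceed.

```python
import random


def draw_simple_random_sample(frame, n, seed=None):
    """Draw n subjects from the frame by simulated blind tag drawing."""
    if not 0 <= n <= len(frame):
        raise ValueError("n must lie between 0 and the frame size")
    rng = random.Random(seed)
    tags = list(range(1, len(frame) + 1))   # step 2: one numbered tag per subject
    rng.shuffle(tags)                       # step 3: mix the tags thoroughly
    drawn = [tags.pop() for _ in range(n)]  # step 4, repeated: draw a tag blindly
    # Each drawn tag number identifies a subject in the frame.
    return [frame[tag - 1] for tag in drawn]


# Hypothetical frame of N = 1000 subjects (step 1).
frame = [f"subject-{i:04d}" for i in range(1, 1001)]

sample = draw_simple_random_sample(frame, n=10, seed=42)
print(sample)
```

Python's standard library offers `random.sample(frame, n)`, which also selects n distinct items at random; the explicit version is shown only to mirror the physical steps.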


