Basic Statistics

POPULATIONS AND SAMPLES

Table 2.1 Blood Cholesterol Measurements for 98 Men Aged 40-49 Years (mg/100 mL)

289  385  306  218  251  281  241  224  198  281
215  301  249  288  331  263  260  228  190  282
368  291  249  300  268  283  319  284  205  294
251  256  294  253  221  241  372  339  292  294
327  195  305  253  251  229  250  348  280  378
282  311  193  242  304  210  277  312  264  262
268  251  333  300  250  234  264  291  211  284
322  381  276  205  251  210  254  299  213  252
280  411  195  256  387  241  245  325  289  306
232  293  285  250  260  316  352  309

replacing) a tag from a box containing 10 tags, one marked with 0, one marked with
1, and so on, the last tag being marked with 9. Considerable effort goes into the
production of such a table, although one’s first thought might be that all that would be
necessary would be to sit down and write down digits as they entered one’s head. A
trial of this method shows quickly that such a list cannot be called random: A person
invariably has particular tendencies; he or she may write down many odd digits or
repeat certain sequences too often.
To illustrate the use of such a table of random digits, a sample of size 5 will be
drawn from the population of 98 blood cholesterols. First, the list of the population
is made; here it is convenient to number in rows, so that number 1 is 289, number 2
is 385, number 50 is 378, and number 98 is 309. Since there are 98 measurements in
the population, for a sample of size 5 we need five 2-digit numbers to designate the 5
measurements being chosen. With 2 digits we can sample a population whose size is
up to 99 observations. With 3 digits we could sample a population whose size is up
to 999 observations. To obtain five 2-digit numbers, we first select a starting digit in
Table A.1 and from the starting digit we write down the digits as they occur in normal
reading order. Any procedure for selecting a starting value and proceeding is correct
as long as we choose a procedure that is completely independent of the values we see
in Table A.1. For an example that is easy to follow, we could select the first page of
Table A.1 and select pairs of digits starting with the 2-digit number in the upper
left-hand corner, which has a value of 10. We proceed down the column of pairs of digits
and take 37, 8, 99, and 12. But we have only 98 observations in our population, so
the 99 cannot be used and the next number, 12, should be used instead for the fourth
observation. Since we need an additional number, we take the next number down,
66. The final list of 5 observations to be sampled is 10, 37, 8, 12, and 66.
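
The selection rule just described can be sketched in a few lines of Python. This is only an illustration, not the book's procedure verbatim: the function name `select_sample` is hypothetical, the input list holds the pairs of digits quoted in the text, and skipping repeated numbers is an assumption, since the passage does not say what to do with repeats.

```python
def select_sample(two_digit_numbers, population_size, sample_size):
    """Pick observation numbers from a stream of random-digit pairs.

    Pairs outside 1..population_size are skipped, as with 99 in the text.
    Repeated pairs are also skipped, which is an assumption made here
    (each observation used at most once); the text does not cover repeats.
    """
    chosen = []
    for number in two_digit_numbers:
        if number < 1 or number > population_size:
            continue  # e.g. 99 does not label any of the 98 observations
        if number in chosen:
            continue  # assumption: do not select the same observation twice
        chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


# Pairs of digits read down the column, as quoted in the text.
column = [10, 37, 8, 99, 12, 66]
sample = select_sample(column, population_size=98, sample_size=5)
print(sample)  # [10, 37, 8, 12, 66]
```

The 99 is passed over and 66 fills the fifth place, which matches the list given in the text.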
Referring back to Table 2.1, the cholesterol measurement corresponding to the
tenth observation is 287 since we are numbering by rows. In order, the remaining 4
observations are 372 for observation 37, 224 for observation 8, 301 for observation
12, and 234 for observation 66.
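
Numbering by rows means that an observation number can be turned into a row and column of the table by simple arithmetic. The sketch below assumes that Table 2.1 is printed with 10 values per row, which is consistent with the positions quoted in the text (number 50 is 378, number 98 is 309); the function name `position` and the constant `COLUMNS` are hypothetical.

```python
COLUMNS = 10  # assumption: Table 2.1 has 10 values in each full row


def position(observation_number, columns=COLUMNS):
    """Return (row, column), both counted from 1, for row-wise numbering."""
    row, col = divmod(observation_number - 1, columns)
    return row + 1, col + 1


# Locate the five sampled observations in the table.
for n in (10, 37, 8, 12, 66):
    row, col = position(n)
    print(f"observation {n}: row {row}, column {col}")
```

For example, observation 37 falls in row 4, column 7, and observation 98 falls in row 10, column 8, the last entry of the table.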
Statistical programs can also be used to generate random numbers. Here the
numbers will often range between 0 and 1, so the decimal point should be ignored
