Introductory Biostatistics


For example, a study was designed to test the question of whether cigarette
smoking is associated with reduced serum-testosterone levels. To carry out this
research objective, two samples, each of size 10, are selected independently. The
first sample consists of 10 nonsmokers who have never smoked, and the second
sample consists of 10 heavy smokers, defined as those who smoke 30 or more
cigarettes a day. To perform the Wilcoxon rank-sum test, we combine the two
samples into one large sample (of size 20), arrange the observations from
smallest to largest, and assign a rank, from 1 to 20, to each. If there are tied
observations, we assign an average rank to all measurements with the same
value. For example, if the two observations next to the third smallest are equal,
we assign an average rank of $(4 + 5)/2 = 4.5$ to each one. The next step is to
find the sum of the ranks corresponding to each of the original samples. Let $n_1$
and $n_2$ be the two sample sizes and $R$ be the sum of the ranks from the sample
with size $n_1$.
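
The ranking step can be sketched in Python as follows. This is a minimal illustration, not code from the text: the function names `average_ranks` and `rank_sum` and the seven data values are made up for the example, chosen so that the fourth and fifth smallest observations tie and each receives the average rank 4.5.

```python
def average_ranks(values):
    """Return 1-based ranks for `values`; tied values share their average rank."""
    # Indices of the observations, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the last sorted position holding the same value as position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Sorted positions i..j (0-based) correspond to ranks i+1..j+1.
        avg = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum(sample1, sample2):
    """Pool the two samples, rank them, and return the rank sum R of sample1."""
    combined = list(sample1) + list(sample2)
    ranks = average_ranks(combined)
    return sum(ranks[:len(sample1)])


# Hypothetical data (not from Table 7.4): the two values 4.0 tie for ranks 4 and 5.
sample1 = [3.1, 4.0, 5.2]
sample2 = [2.0, 2.5, 4.0, 6.3]
print(average_ranks(sample1 + sample2))
print(rank_sum(sample1, sample2))
```

In practice a library routine such as `scipy.stats.rankdata` (whose default method is average ranking) would do the same job; the hand-written version is shown only to make the tie rule explicit.
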
Under the null hypothesis that the two underlying populations have identical
medians, we would expect the averages of ranks to be approximately equal.
We test this hypothesis by calculating the statistic



$$z = \frac{R - \mu_R}{\sigma_R}$$

where

$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2}$$

is the mean and

$$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$

is the standard deviation of $R$. It does not make any difference which rank sum
we use. For relatively large values of $n_1$ and $n_2$ (say, both greater than or equal
to 10), the sampling distribution of this statistic is approximately standard
normal. The null hypothesis is rejected at the 5% level, against a two-sided
alternative, if


$$z < -1.96 \quad \text{or} \quad z > 1.96$$
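
As a rough sketch, the statistic and the two-sided 5% decision rule might be coded as below. The function names `rank_sum_z` and `reject_two_sided_5pct` and the rank sum of 130 are hypothetical, chosen only for illustration. The snippet also shows why the choice of rank sum does not matter: the ranks 1 to 20 always total 20(21)/2 = 210, so the other sample's rank sum is 210 minus the first, and its z value is the same number with the opposite sign.

```python
import math


def rank_sum_z(r, n1, n2):
    """Standardize the rank sum r of the sample with size n1 (other sample has size n2)."""
    mu_r = n1 * (n1 + n2 + 1) / 2                       # mean of R under the null
    sigma_r = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard deviation of R
    return (r - mu_r) / sigma_r


def reject_two_sided_5pct(z):
    """Two-sided rejection rule at the 5% level, using the normal approximation."""
    return z < -1.96 or z > 1.96


# Hypothetical rank sums for two samples of size 10 (not the data of Example 7.8).
n1 = n2 = 10
total = (n1 + n2) * (n1 + n2 + 1) // 2   # sum of the ranks 1..20 = 210
r1 = 130
r2 = total - r1                          # rank sum of the other sample = 80

z1 = rank_sum_z(r1, n1, n2)
z2 = rank_sum_z(r2, n2, n1)
print(round(z1, 4), round(z2, 4))        # equal magnitude, opposite sign
print(reject_two_sided_5pct(z1), reject_two_sided_5pct(z2))
```

Because the rejection region is symmetric about zero, both calls reach the same decision.
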

Example 7.8 For the study on cigarette smoking above, Table 7.4 shows the
raw data, where testosterone levels were measured in μg/dL and the ranks were
determined. The sum of the ranks for group 1 (nonsmokers) is


$$R = 143$$
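
As an arithmetic check, substituting $R = 143$ and $n_1 = n_2 = 10$ into the formulas for the mean, the standard deviation, and the test statistic gives the following (a worked sketch using only the values stated so far, with results rounded to two decimals):

```latex
\mu_R = \frac{10(10 + 10 + 1)}{2} = 105
\qquad
\sigma_R = \sqrt{\frac{(10)(10)(10 + 10 + 1)}{12}} = \sqrt{175} \approx 13.23
\qquad
z = \frac{143 - 105}{13.23} \approx 2.87
```

Since 2.87 exceeds 1.96, the two-sided rule stated earlier would reject the null hypothesis at the 5% level.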


258 COMPARISON OF POPULATION MEANS
