Basic Statistics

13.3 THE WILCOXON-MANN-WHITNEY TEST

Two rank tests were developed independently, one by Wilcoxon and one by Mann and
Whitney, to test the null hypothesis that two independent samples come from the same
distribution. The alternative hypothesis is either that one distribution tends to have
smaller values than the other (one-sided test) or that the two populations differ from
each other (two-sided test). Wilcoxon’s test is called the Wilcoxon rank sum test to
distinguish it from the Wilcoxon test for paired data, which is called the Wilcoxon
signed ranks test. The Mann-Whitney test, often called the Mann-Whitney U test, is
available alongside it in many software packages. The two tests are often referred to
together as the Wilcoxon-Mann-Whitney (WMW) test, as the results of the tests will be
the same unless there are ties. These tests are often used instead of Student’s t test
when the data are ordinal, or when the data are not normally distributed and it is
unclear how to choose a transformation that would achieve a normal distribution. We
will give formulas only for the Wilcoxon rank sum test since the computation is simpler
for this test (van Belle et al. [2004]). Formulas for the Mann-Whitney test can be found
in Conover [1999], Sprent and Smeeton [2007], Gibbons [1993], Daniel [1978], and
numerous other texts.


13.3.1 Wilcoxon Rank Sum Test for Large Samples

The assumptions made when performing the Wilcoxon rank sum test include:


  1. The data are random samples from the two distributions.

  2. The samples are independent.

  3. The data being tested are at least ordinal.

The null hypothesis being tested is that the two samples come from identical
distributions. The alternative hypothesis is that one distribution has larger or
smaller values than the other or that the population medians are different.

First, we illustrate the large-sample approximation and then note the information
needed to obtain results from a table when the sample sizes are small. Van Belle et
al. [2004] recommend that the sample sizes be at least 15 for both groups to use the
following normal approximation. The example that follows, with samples of 11 and 8,
does not meet this recommendation.

The first step in obtaining the large-sample approximation results is to rank both
samples together from small to large. The data in Table 13.2 are hypothetical
cholesterol levels from two samples of patients. The first sample is n = 11 males who
have taken part in their employer’s health promotion plan (HPP), and the second sample
is m = 8 male employees who did not take part in the plan. Note that n is used for the
larger sample and m for the smaller sample. The results for the 11 participating
employees are 126, 138, 158, 158, 165, 171, 176, 177, 180, 188, and 195 mg/dL, and the
results for the 8 nonparticipating employees, denoted by No, are 148, 152, 175, 189,
197, 200, 204, and 213 mg/dL.
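
The joint ranking described above can be sketched in a few lines of Python. This is a minimal illustration rather than the text's own procedure: the helper name midranks and the variable names are hypothetical, and assigning tied values the average of the ranks they occupy (here the two values of 158) is an assumed convention, since the handling of ties has not yet been stated.

```python
# Hypothetical cholesterol levels (mg/dL) quoted in the text.
hpp = [126, 138, 158, 158, 165, 171, 176, 177, 180, 188, 195]  # n = 11, took part in HPP
no = [148, 152, 175, 189, 197, 200, 204, 213]                  # m = 8, did not take part


def midranks(values):
    """Rank values from small to large (1 = smallest).

    Assumption: tied values share the average of the ranks they occupy.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the block [start, end] over all values equal to values[order[start]].
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # average of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


pooled = hpp + no                     # rank both samples together
ranks = midranks(pooled)
rank_sum_hpp = sum(ranks[:len(hpp)])  # sum of ranks for the HPP sample
rank_sum_no = sum(ranks[len(hpp):])   # sum of ranks for the No sample
total = len(pooled) * (len(pooled) + 1) / 2

print(rank_sum_hpp, rank_sum_no, total)  # 90.0 100.0 190.0
```

As a check on the ranking, the two rank sums must add up to N(N + 1)/2, where N = n + m = 19 is the combined sample size; here 90 + 100 = 190.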
