Statistical Analysis for Education and Psychology Researchers


particular if sample sizes are small, then the variance of the test statistic SR should be
corrected for ties,
The Wilcoxon M-W test is also useful for post hoc analysis following a nonparametric
one-way analysis of variance, although there are more sophisticated post hoc procedures
(see Keselman and Rogan, 1977; Marascuilo and Dagenais, 1982). A statistically
equivalent test to the Wilcoxon Mann-Whitney procedure is the Mann-Whitney U test.
This is illustrated in many introductory statistical textbooks. There is a perfect linear
relationship between the two test statistics, SR and U, where U is
the Mann-Whitney U test statistic.
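
The linear relationship can be checked numerically. The sketch below is an illustration rather than the textbook's own presentation. It assumes that n is the size of the sample whose ranks are summed and that U counts the pairs in which a score from that sample exceeds a score from the other sample (ties counting one half), in which case SR = U + n(n + 1)/2. The scores and the function names are hypothetical. Some textbooks define U the other way round, which gives the complementary form SR = n1n2 + n(n + 1)/2 - U.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum(sample, other):
    """Rank sum of `sample` after ranking both samples together."""
    ranks = midranks(list(sample) + list(other))
    return sum(ranks[: len(sample)])


def u_by_counting(sample, other):
    """U for `sample`: pairs in which its score is larger (ties count 1/2)."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in sample
        for b in other
    )


# Hypothetical scores, invented for illustration only.
group_a = [12, 15, 9, 20]      # smaller sample, so its rank sum plays the role of SR
group_b = [8, 11, 14, 7, 10]

n = len(group_a)
sr = rank_sum(group_a, group_b)
u = u_by_counting(group_a, group_b)

# Under the assumptions above, the two statistics differ only by a constant.
print(sr, u, sr == u + n * (n + 1) / 2)
```

Because n(n + 1)/2 depends only on the sample size, a test based on SR and a test based on U always lead to the same decision.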


Statistical Inference and Null Hypothesis

The null hypothesis tested is that the two random samples are from one population, that is,
there is no difference in the rank order values found in the two data distributions being
compared. Rejection of the null hypothesis is usually interpreted as meaning that the two
distributions represent two different populations which have different distributions. When
the shape and dispersion of the two distributions are similar, it is a test of difference in
population medians. The alternative hypothesis may be directional (a one-tailed test), for
example, that the majority of larger rank scores are found in one sample, so that this sample
would have a larger mean rank score. It may also be nondirectional (a two-tailed test), simply stating
that the two sample distributions of rank scores are different. The test statistic, SR, is the
rank sum for the sample (group) which has the smaller sample size. With small sample
sizes (<10), SR has an exact sampling distribution; however, SR rapidly
approaches a normal distribution as the sample size of one or both of the groups
increases (for sample sizes ≥20).
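
For larger samples, the normal approximation can be sketched as follows. This is a minimal illustration, not the textbook's worked procedure: it uses the standard mean n(N + 1)/2 and variance of a rank sum under the null hypothesis, with the usual reduction in variance for tied scores, and it omits any continuity correction. The function names, the parameter `tie_sizes` and the example figures are hypothetical.

```python
import math


def rank_sum_z(sr, n_r, n_other, tie_sizes=()):
    """z score for a rank sum SR under H0, with the variance corrected for ties.

    n_r is the size of the sample whose ranks were summed. tie_sizes lists
    the size of each group of tied scores in the combined sample (assumed
    to be supplied by the caller).
    """
    total = n_r + n_other
    mean = n_r * (total + 1) / 2
    # Each group of t tied scores reduces the variance slightly.
    tie_term = sum(t**3 - t for t in tie_sizes) / (total * (total - 1))
    variance = n_r * n_other / 12 * ((total + 1) - tie_term)
    return (sr - mean) / math.sqrt(variance)


def two_tailed_p(z):
    """Two-tailed probability of |Z| >= |z| under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical figures: groups of 20 and 25 with an observed rank sum of 380.
z = rank_sum_z(380, 20, 25)
p = two_tailed_p(z)
print(round(z, 3), round(p, 3))
```

For a one-tailed test in the direction predicted, the two-tailed probability would be halved.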
The Wilcoxon M-W test is based on the idea that if there are two populations and not
one (i.e., H0 is false), the rank order scores in one sample will generally be larger than the
rank scores in the other sample. This difference, that is, higher ranking scores found
mostly in one sample, could be detected by ranking all scores irrespective of what group
they belong to and then summing the rank scores according to group membership. If H0 is
true, we would expect the rank scores to be similarly represented in both samples
(groups) and the average ranks in each of the two groups to be about equal. We would then not
reject the null hypothesis and would conclude that there is no difference in the two distributions
being compared. If the two samples were different, that is, had come from two distinct
populations, then we would expect higher (or lower) rank sum totals (allowing for
differences in sample size) in one of the samples. The sampling distribution of the rank
sum SR is known, and hence so is the probability associated with extreme values of the test
statistic. Having regard to the sample sizes of the two groups, and to whether a one-tailed or
two-tailed test is used, the probability associated with an observed value of SR can be
determined.
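
For small samples without ties, the exact sampling distribution of SR can be obtained by listing every possible allocation of ranks to the smaller group, since under the null hypothesis each allocation is equally likely. The sketch below illustrates this idea with hypothetical scores and function names. Doubling the smaller tail to obtain a two-tailed probability is one common convention and is an assumption here, not something taken from the text.

```python
from itertools import combinations


def rank_sum_no_ties(sample, other):
    """Rank all scores together and sum the ranks of `sample` (no ties assumed)."""
    combined = sorted(list(sample) + list(other))
    rank_of = {value: position + 1 for position, value in enumerate(combined)}
    return sum(rank_of[value] for value in sample)


def exact_rank_sum_p(sr, n_r, n_other):
    """Exact lower-tail, upper-tail and two-tailed probabilities for SR under H0."""
    total = n_r + n_other
    # Every choice of n_r ranks out of 1..N is equally likely under H0.
    sums = [sum(c) for c in combinations(range(1, total + 1), n_r)]
    lower = sum(s <= sr for s in sums) / len(sums)
    upper = sum(s >= sr for s in sums) / len(sums)
    two_tailed = min(1.0, 2 * min(lower, upper))  # assumed convention
    return lower, upper, two_tailed


# Hypothetical scores, invented for illustration only.
group_a = [12, 15, 9, 20]      # smaller sample
group_b = [8, 11, 14, 7, 10]

sr = rank_sum_no_ties(group_a, group_b)
lower, upper, two_tailed = exact_rank_sum_p(sr, len(group_a), len(group_b))
print(sr, round(upper, 4), round(two_tailed, 4))
```

Enumeration is practical only for small groups, because the number of allocations grows very quickly with sample size; this is where published tables or the normal approximation take over.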


Test Assumptions

The test assumptions are as follows:

