Statistical Analysis for Education and Psychology Researchers


  • The observations compared have been selected at random from an underlying,
    theoretically continuous distribution, and measurement is at least at the ordinal level
    (ranks).

  • Observations are from two independent samples.

  • The two distributions compared should have similar variances.

  • Tied scores (after ranking) are given the average of the ranks they would have had if no
    ties had occurred. A small number of ties has little effect on the test statistic SR.
    However, when the proportion of ties is large, and in particular when sample sizes are
    small, the test based on SR tends to be conservative (p-values are inflated) and a
    correction for ties should be applied. Tied values reduce the standard error of the
    test statistic SR, so the correction leads to an overall increase in the value of Z.

  • With small sample sizes, n<20, a correction for continuity should be used. Most
    statistical analysis programmes apply a continuity correction automatically.
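
As a rough illustration of the tie and continuity corrections listed above, the following
Python sketch computes the rank sum SR and its normal approximation Z. The function names
and the scores are hypothetical, and the variance formula is the usual large-sample one with
a tie adjustment; it is assumed here rather than taken from this text.

```python
from collections import Counter
from math import erfc, sqrt


def average_ranks(values):
    """Rank values 1..N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_z(sample1, sample2, tie_correction=True, continuity=True):
    """Return (SR, Z, two-sided p) using the normal approximation."""
    n1, n2 = len(sample1), len(sample2)
    n = n1 + n2
    pooled = list(sample1) + list(sample2)
    ranks = average_ranks(pooled)
    sr = sum(ranks[:n1])              # rank sum for the first sample
    mean_sr = n1 * (n + 1) / 2        # expected SR under the null hypothesis

    # Tie adjustment: each group of t tied scores contributes t^3 - t.
    tie_term = 0.0
    if tie_correction:
        tie_term = sum(t**3 - t for t in Counter(pooled).values()) / (n * (n - 1))
    se = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))

    diff = sr - mean_sr
    if continuity:
        # Move SR half a unit towards its expected value.
        shrunk = max(abs(diff) - 0.5, 0.0)
        diff = shrunk if diff >= 0 else -shrunk

    z = diff / se
    p = erfc(abs(z) / sqrt(2))        # two-sided p-value from the normal curve
    return sr, z, p


# Hypothetical ordinal scores for two independent groups (not data from the study).
group_a = [1, 2, 2, 3]
group_b = [2, 4, 5, 5]
print(rank_sum_z(group_a, group_b))
print(rank_sum_z(group_a, group_b, tie_correction=False))
```

With these made-up scores the tie-corrected standard error is smaller than the uncorrected
one, so the absolute value of Z is larger once the correction is applied. With samples this
small the normal approximation is only indicative.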


Example from the Literature

Given that ranking methods are particularly useful in many educational settings, where
measurement can only reasonably be made at the ordinal level, it is surprising that the
Wilcoxon M-W test or the comparable Mann Whitney U test is not more widely used by
educational researchers.
In a study on the place of alcohol education and its (implied) effectiveness, Regis, Bish
and Balding (1994) compared the alcohol consumption (units of alcohol) of 10-year-old
self-declared drinkers for a number of independent groups of pupils. Four comparisons
amongst independent groups were made, of which two are illustrated: i) comparison of
alcohol consumption among pupils where alcohol education was delivered through
science vs. pupils where it was not delivered through science and ii) comparison of
alcohol consumption among pupils where alcohol education was delivered through
personal and social education (PSE) vs. pupils where it was not delivered through PSE.
Data for these comparisons amongst girls are shown below:

Comparison group       Alcohol education delivered   Alcohol education not delivered   p-value
(10-year-old girls)    Mean Rank Score               Mean Rank Score
Science                298.74                        253.77                            0.0009
PSE                    281.84                        260.51                            0.2237


The investigators reported that Mann Whitney U tests were used (directly comparable
with Wilcoxon M-W test) as the statistical test for detecting differences between groups.
They go on to say that a nonparametric test was used because it made fewest assumptions
about the underlying population distribution and the nature of the variables.
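
The sense in which the two tests are directly comparable can be sketched in Python: the
Mann Whitney U statistic for a sample equals its Wilcoxon rank sum SR minus n1(n1+1)/2,
where n1 is the size of that sample. The function names and the scores below are
hypothetical and serve only to check the identity on a small example.

```python
def rank_sum(sample1, sample2):
    """Sum of the ranks of sample1 in the pooled data (ties get average ranks)."""
    pooled = list(sample1) + list(sample2)

    def mid_rank(x):
        below = sum(v < x for v in pooled)
        equal = sum(v == x for v in pooled)
        return below + (equal + 1) / 2

    return sum(mid_rank(x) for x in sample1)


def mann_whitney_u(sample1, sample2):
    """U for sample1: each pair with x > y counts 1, each tied pair counts 0.5."""
    return sum((x > y) + 0.5 * (x == y) for x in sample1 for y in sample2)


# Hypothetical scores for two independent groups (not data from the study).
group_a = [3, 5, 5, 8]
group_b = [1, 2, 5, 4, 6]

n1 = len(group_a)
sr = rank_sum(group_a, group_b)
u = mann_whitney_u(group_a, group_b)

# The two statistics differ only by a constant that depends on n1.
assert u == sr - n1 * (n1 + 1) / 2
print(sr, u)
```

Because the two statistics differ only by a constant for fixed sample sizes, they lead to
the same p-value.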
Given the non-normal distribution of the data reported by the investigators (a skewed
distribution of alcohol consumption attributable to outlier observations) and the
comparisons among independent groups, the nonparametric equivalent to a t-test,
either a Wilcoxon M-W test or Mann Whitney U test, is appropriate provided the
distributions to be compared have similar dispersion. The investigators do not provide
any information about the dispersion of observations in the data set.
The null hypothesis tested by the investigators was that the distribution of units of
alcohol consumed by 10-year-old girls was the same amongst two groups of pupils; one

