Basic Statistics

NONPARAMETRIC STATISTICS

In Section 13.1 the sign test for large and small samples is given. The sign test is
often used with large samples when the user wants to make only a few assumptions.
The sign test is used when the data are paired. In Section 13.2 the Wilcoxon signed
ranks test is given. This test is also used for paired data. In Section 13.3 the Wilcoxon
sum of ranks test for two independent samples is given. The Wilcoxon sum of ranks
test and the Mann-Whitney test were developed independently. They can both be
used for testing two independent sets of data and the results of these two tests are the
same for large samples when there are no ties. In many texts the authors call these
tests the Wilcoxon-Mann-Whitney (WMW) test. The Wilcoxon sum of ranks test is
often used instead of Student’s t test when the data cannot be assumed to be normally
distributed. The Wilcoxon test can also be used with ordinal data, which is not true
for the t test. The data can be plotted using the graphical methods recommended
in Section 5.4.3, and this should be considered, especially for the Wilcoxon sum of
ranks test data. In Section 13.4, Spearman’s rank correlation, which can be used on
data after they are ranked, is described. Spearman’s rank correlation is less sensitive to
outliers than the usual correlation coefficient described in Section 12.3.
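
The claim about outliers can be illustrated with a minimal sketch in Python. The data below are hypothetical and chosen only for illustration, and the helper names (pearson, average_ranks, spearman) are not from the text. Spearman's coefficient is computed here as the usual correlation coefficient applied to the ranks, with tied values given their average rank.

```python
import math

def pearson(x, y):
    """Usual (product-moment) correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y_outlier equals y except that the last value is extreme.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]
y_outlier = [2, 1, 4, 3, 6, 5, 8, 70]

print("Pearson :", pearson(x, y), "->", pearson(x, y_outlier))
print("Spearman:", spearman(x, y), "->", spearman(x, y_outlier))
```

In this made-up example the single extreme value lowers the usual correlation coefficient considerably, while the rank correlation changes only a little, because the outlier's rank is nearly the same as that of the value it replaced.
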
The tests described in this chapter are also often used on interval and ratio data when
the data do not follow a normal distribution and there is no obvious transformation
to use to achieve normality. The nonparametric tests can also be used on normally
distributed data, but in that case they will tend to be slightly less powerful than a test
that assumes normality.


13.1 THE SIGN TEST

The sign test has been around for a long time and is a simple test to perform. When the
sample size is large, say greater than 20, the normal approximation to the binomial
given in Section 10.3.2 can be used on paired samples. In performing this test we
assume that the pairs are independent and the measurements are at least ordinal data.
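
As a rough sketch of the large-sample calculation, the following Python function applies a normal approximation to the binomial with success probability 1/2. The function name, its arguments, and the optional continuity correction are assumptions made for illustration; the version in Section 10.3.2 may differ in detail. The counts in the usage lines are those of the health program example in Section 13.1.1 (28 workers with fewer days off and 8 with more, after the ties are dropped).

```python
import math

def sign_test_large_sample(n_success, n_failure, continuity=True):
    """Two-sided large-sample sign test via the normal approximation.

    n_success and n_failure are the counts of pairs in each direction;
    tied pairs are assumed to have been dropped already.
    """
    n = n_success + n_failure          # effective sample size
    expected = n / 2                   # mean of binomial(n, 1/2)
    sd = math.sqrt(n) / 2              # sqrt(n * 1/2 * 1/2)
    diff = abs(n_success - expected)
    if continuity:
        # Optional continuity correction (an assumption of this sketch).
        diff = max(diff - 0.5, 0.0)
    z = diff / sd
    # Two-sided tail area of the standard normal: P(|Z| >= z).
    p_value = math.erfc(z / math.sqrt(2))
    return n, z, p_value

n, z, p_value = sign_test_large_sample(28, 8)
print(f"n = {n}, z = {z:.3f}, two-sided p = {p_value:.5f}")
```

Setting continuity=False gives the uncorrected statistic; for samples of this size the two versions usually lead to the same conclusion.
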

13.1.1 Sign Test for Large Samples

Suppose that company A introduces a health program to their 40 workers. They then
obtain the number of days taken off work for medical reasons in 2006, which was
prior to their health program, and also in 2008, after the program was in full swing.
The results Xi, Yi for the ith worker consist of paired data, where Xi is the days off
work in 2006 and Yi is the days off work in 2008. The data are paired since the days
off work are from the same employee. The results were that 28 workers had fewer
days off after the health plan went into effect, four workers had the same number of
days off work, and eight workers had more days off work in 2008 than they did in 2006.



  1. The four workers who had the same number of days off work before and after
    the health plan are called ties and their results are not used, so the sample size is now 36.

  2. If the health plan had no effect, we would expect about as many of the 36 workers
    to take fewer days off in 2008 than in 2006 as to take more.
    Let the workers who had fewer days off be considered successes, and those taking
    more days off after the health plan, failures. The null hypothesis is that the proportion
