There are 21 Bs, 14 Gs and 16 runs. As n₁ > 20, the large-sample approximation can be used.
The researcher chooses a two-tailed test and an alpha of 0.05. The null hypothesis is
tested using the normal approximation for the sampling distribution of U. When either n₁
or n₂ is ≥ 20, the probability associated with an observed U is evaluated using formula 7.2,
which gives a normal Z deviate:
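$$Z = \frac{U + j - \left(\dfrac{2 n_1 n_2}{N} + 1\right)}{\sqrt{\dfrac{2 n_1 n_2\,(2 n_1 n_2 - N)}{N^2 (N - 1)}}} \qquad (7.2)$$

Normal approximation for the sampling distribution of U (7.2)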
where N is the total sample size, U is the number of runs, n₁ and n₂ are the frequencies for
the two categories of the response variable, and j is an adjustment for continuity:
j = 0.5 if U < 2n₁n₂/N + 1, or j = −0.5 if U > 2n₁n₂/N + 1. Here 2n₁n₂/N + 1 = 17.8, so with
U = 16 the adjustment is j = 0.5 and the calculated value of Z is −0.465 (−1.3/2.7941).
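The arithmetic behind this Z value can be checked with a short script. The following is a minimal Python sketch, not the book's SAS programme: the function name `runs_test_z` is hypothetical, the continuity adjustment is taken relative to the expected number of runs, 2n₁n₂/N + 1 (the quantity that reproduces the −1.3 numerator), and setting j to 0 when U equals that expectation is an assumption, since the text does not cover that case.

```python
import math


def runs_test_z(u, n1, n2):
    """Z deviate for the runs test under the large-sample normal approximation."""
    n = n1 + n2                                  # total sample size N
    expected_runs = 2 * n1 * n2 / n + 1          # mean of U under randomness
    # Continuity adjustment j: pull U half a unit towards its expectation.
    if u < expected_runs:
        j = 0.5
    elif u > expected_runs:
        j = -0.5
    else:
        j = 0.0                                  # assumption: no adjustment on a tie
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return (u + j - expected_runs) / math.sqrt(variance)


# Worked example from the text: 21 Bs, 14 Gs, 16 runs.
z = runs_test_z(u=16, n1=21, n2=14)
print(f"Z = {z:.3f}")
```

With the values from the example the numerator is 16 + 0.5 − 17.8 = −1.3 and the denominator is about 2.7941, matching the figures quoted above.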
Interpretation
Since the absolute value of Z (0.465) does not exceed the two-tailed critical value of 1.96,
the null hypothesis of randomness cannot be rejected at the 5 per cent level. The associated
probability of obtaining a Z value this extreme when the null hypothesis is true is p=0.638.
As the normal distribution is symmetrical, a Z value of −0.465 has the same associated
probability as a Z value of +0.465. Table 1 of Appendix A4 gives the proportion of the
total area under the normal curve that lies beyond a positive Z score. The tabled value
of 0.319 (which corresponds to Z rounded to 0.47) is doubled for a two-tailed test,
giving 0.319×2=0.638.
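The table lookup can also be cross-checked numerically. The sketch below is an illustration in Python, not part of the book's SAS programme; the helper name `two_tailed_p` is hypothetical. It uses the standard-library complementary error function, for which erfc(|z|/√2) equals the combined area in both tails of the standard normal curve.

```python
import math


def two_tailed_p(z):
    """Two-tailed p-value for a standard normal deviate z."""
    # erfc(|z| / sqrt(2)) = 2 * (1 - Phi(|z|)): the area in both tails.
    return math.erfc(abs(z) / math.sqrt(2))


z = -1.3 / 2.7941                 # Z value from the worked example
p_exact = two_tailed_p(z)         # p-value at the unrounded Z
p_table = two_tailed_p(0.47)      # p-value at Z rounded to two decimals
reject = abs(z) > 1.96            # two-tailed test at alpha = 0.05

print(f"Z = {z:.3f}, p = {p_exact:.3f}, "
      f"p at Z = 0.47: {p_table:.3f}, reject H0: {reject}")
```

The unrounded Z gives a p-value of about 0.64, marginally above the tabled 0.638, because the table is entered at Z rounded to two decimal places. The conclusion is unchanged: the null hypothesis of randomness is not rejected.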
Computer Analysis
Evaluation of equation 7.2 can be easily accomplished using the SAS programme,
Runs.job. This is shown in Figure 9, Appendix A3. The following data values would be
entered into this programme: N=35, U=16, n₁=21, n₂=14. The relevant section of SAS
code is:
data a;
   ** Enter, after the cards statement, the values for   *;
   ** N, U, CAT1, CAT2, in this order. Each value should *;
   ** be separated by a space. In this example N=35,     *;
   ** U=16, CAT1=21, and CAT2=14                         *;
   input n u cat1 cat2;