Statistical Analysis for Education and Psychology Researchers

second cell along in the first row, the correlation, rs, between the variables FSME and
CAT is 0.890. This is significant at the 1 per cent level; the actual probability is
p=0.0005. Notice that the correlation of a variable with itself is always one.
The null hypothesis tested is that the population correlation, Rho, is zero. This is what
the heading on the output, |R| under H0: Rho=0, refers to. This is an approximate
one-tailed test: the probability printed in the SAS output is the one-tailed p-value
associated with the observed correlation in a predicted direction, in this example
positive. If a two-tailed test is required (that is, no assumption is made about the
direction of the relationship), the p-value should be doubled. In this example the
correlation for a two-tailed test would have an associated probability of 0.001; that is,
it would still be significant at the 1 per cent level.
The significant p-value means that the null hypothesis can be rejected; we conclude that
there is strong evidence that the true population correlation is non-zero. The upward trend
in the scatterplot reflects this.
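
The doubling of the one-tailed p-value described above can be checked with a few lines of Python. This is only an illustrative sketch: the function name two_tailed_p is hypothetical, and simple doubling assumes that the sampling distribution is symmetric and that the observed correlation lies in the predicted direction.

```python
def two_tailed_p(one_tailed_p: float) -> float:
    """Double a one-tailed p-value, capping the result at 1.

    Assumes a symmetric sampling distribution and an observed effect
    in the predicted direction (hypothetical helper, not SAS output).
    """
    if not 0.0 <= one_tailed_p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    return min(1.0, 2.0 * one_tailed_p)


p_one = 0.0005                 # one-tailed p-value quoted in the text
p_two = two_tailed_p(p_one)    # two-tailed equivalent

print(f"two-tailed p = {p_two:.4f}")
print("significant at the 1 per cent level:", p_two < 0.01)
```

With the value quoted in the text, the two-tailed probability is 0.001, which is still below 0.01.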


7.3 One-sample Runs Test for Randomness

When to Use

This test is used whenever we want to assess whether a sequence of binary events is
random. The inference underlying this test is that, if the sample is random, the order
(sequence) in which the observations were obtained is also random. Many statistical procedures are based on
the assumption of random sampling. The runs test enables a test of this assumption if
randomness of the sample is suspect. The test can be used as part of initial data analysis.
For example, in a regression analysis it is often necessary to examine the distribution of
residuals (the difference between an observed value of the response variable and the
value fitted by the regression model). Residuals are either positive or negative and the
signs of the residuals are lined up in the sequence in which they occur. A run is a
sequence of identical events (here + or −) that is preceded and followed by a different
event or by no event at all (at the beginning and end of a sequence). A lack of randomness
in the pattern of residuals is shown by either too few or too many runs, and this would
indicate that one or more of the assumptions underlying the regression analysis has been violated.
The runs test uses information about the order of events, unlike nominal test procedures
such as the Chi-square test, which use information about the frequency of events.
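
The definition of a run given above can be illustrated with a short Python sketch. The residual values and the function names signs_of and count_runs are made up for illustration; residuals that are exactly zero are dropped here, which is one common convention and an assumption of this sketch rather than something stated in the text.

```python
from itertools import groupby


def signs_of(residuals):
    """Convert residuals to a sequence of '+' and '-' signs.

    Residuals equal to zero are dropped (an assumption of this sketch).
    """
    return ["+" if r > 0 else "-" for r in residuals if r != 0]


def count_runs(events):
    """Return the number of runs: maximal blocks of identical adjacent events."""
    return sum(1 for _ in groupby(events))


# Hypothetical residuals, listed in the order in which they occurred.
residuals = [1.2, 0.4, -0.3, -1.1, -0.2, 0.8, -0.5, -0.9, -0.1, -0.7, 0.6, 0.3]

signs = signs_of(residuals)
print("".join(signs))       # ++---+----++
print(count_runs(signs))    # 5
```

Here the twelve signs fall into five runs (++, ---, +, ----, ++). Note that the count depends only on the order of the signs: rearranging the same five plus signs and seven minus signs would leave the frequencies unchanged but could give anything from 2 to 11 runs.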


Statistical Inference and Null Hypothesis

The inference on which the test is based is that the total number of runs in a sample of
observations provides an indication of the randomness of the sample. The null hypothesis
is that the pattern of events is determined by a random process. There are two one-sided
alternative hypotheses: the pattern is not random because there are either too few or too
many runs to be attributed to chance. The two-sided alternative hypothesis is simply that
the pattern of runs is not random. The test statistic is U, the number of runs. The exact
sampling distribution of U is known. For samples where the frequency of events in either
of the binary categories is >20 a large sample approximation to the sampling distribution

