chosen by the researcher, before the statistical hypothesis is tested, is compared with the
p-value derived from the statistical test. If the obtained p-value for the statistical test is
less than or equal to the chosen alpha level then the null hypothesis is rejected and the
results are said to be significant at the chosen alpha level. You should remember that
even when we say a result is statistically significant at the 1 per cent level, there remains
a possibility that the result is a chance result: if the null hypothesis were true, a result at
least this extreme would still be expected in about 1 sample in 100.
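The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than a worked example from the text: the counts (60 successes in 100 observations), the null value of 0.5, and the function names are hypothetical, and the p-value is based on a normal approximation to the binomial distribution.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_test(successes, n, pi0):
    """Two-tailed z-test of H0: population proportion = pi0 (normal approximation)."""
    p_hat = successes / n                      # sample proportion
    se = sqrt(pi0 * (1 - pi0) / n)             # standard error assuming H0 is true
    z = (p_hat - pi0) / se                     # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def decide(p_value, alpha):
    """Reject H0 when the obtained p-value is less than or equal to alpha."""
    return "reject H0" if p_value <= alpha else "do not reject H0"


# Hypothetical data (an assumption for illustration only).
z, p_value = one_sample_proportion_test(successes=60, n=100, pi0=0.5)

# The alpha level is chosen before the test is carried out.
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: z = {z:.2f}, p = {p_value:.4f} -> {decide(p_value, alpha)}")
```

With these hypothetical figures the same p-value leads to rejection of the null hypothesis at the 5 per cent level but not at the 1 per cent level, which is why the alpha level should be fixed in advance.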
It is this author’s view that too much emphasis is placed on the use of p-values when
testing hypotheses and publishing results. Statistical significance does not equate with
educational or clinical significance. Moreover, the magnitude of any differences (or
effect size—this is referred to in Chapter 5) is likely to be more informative than whether
results are significant or not significant. An alternative strategy to simply reporting
p-values as significant or not significant is to use and report confidence intervals alongside
p-values.
A confidence interval provides a range of plausible values within which the parameter of
interest is likely to lie. Just as we can calculate a CI0.95 for a single parameter (recall that
in Example 4.6 we estimated a CI0.95 for the proportion of age 11 pupils who cannot correctly
interpret graphs with complex scales), so it is possible to calculate a CI0.95 for the
difference between two proportions. The difference between the two sample proportions is the
best estimate of the population difference in proportions. Just as the 5 per cent level of
significance is generally used, so the CI0.95 is commonly used, although alternative confidence
intervals can be constructed, for example, CI0.99. When reporting results of hypothesis tests using confidence intervals the
following should be included: sample estimates, confidence intervals, test statistics and
associated degrees of freedom, and associated p-values. If a confidence interval for a
difference excludes zero then this is evidence of a significant difference at the corresponding
alpha level (the 5 per cent level for a CI0.95) and will coincide with a significant p-value.
The advantage of reporting a confidence interval is that it
conveys a range of values for the population difference although the actual population
difference is likely to be near the centre of the confidence interval. For an informative
introduction to testing hypotheses using confidence intervals with worked examples, the
reader is referred to Gardner and Altman’s (1989) text Statistics with Confidence.
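As a rough sketch of how a confidence interval for a difference between two proportions relates to the p-value, consider the Python fragment below. The group counts and the function name are hypothetical, and the calculation assumes a large-sample normal approximation in which the same (unpooled) standard error is used for both the interval and the test, so that the two agree; other texts and software may use a pooled standard error for the test.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_ci(x1, n1, x2, n2, level=0.95):
    """Normal-approximation CI and two-tailed test for p1 - p2 (unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2                                   # sample estimate of the difference
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for level = 0.95
    ci = (diff - z_crit * se, diff + z_crit * se)
    z = diff / se                                    # test statistic for H0: difference = 0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, ci, z, p_value


# Hypothetical counts (assumptions for illustration only):
# 45 of 120 in group 1 and 30 of 130 in group 2.
diff, (lower, upper), z, p_value = diff_proportions_ci(45, 120, 30, 130)

print(f"difference = {diff:.3f}")
print(f"CI0.95 = ({lower:.3f}, {upper:.3f})")
print(f"z = {z:.2f}, p = {p_value:.4f}")
print("interval excludes zero:", lower > 0 or upper < 0)
```

Reporting the estimate, the interval, the test statistic and the p-value together follows the reporting recommendation given above; for this normal-approximation test there are no degrees of freedom to report.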
You may wonder why we discuss hypothesis testing and p-values if confidence intervals
are more appropriate. The answer is simple: reporting of p-values is so common that you
need to grasp the fundamental idea to be able to evaluate reports and papers. Throughout
the remaining chapters on inferential statistical procedures, confidence intervals will be
used in addition to p-values wherever appropriate.
One-tailed and Two-tailed Significance Tests
The null hypothesis is always contrasted with an alternative frame of reference, called the
alternative hypothesis (sometimes called the research hypothesis). This has the special
notation H1. Statistical significance tests can be either one-tailed or two-tailed depending
on the nature of the alternative hypothesis. Consider the proportion of referrals that result
in statements (Example 4.10). The null hypothesis is:
H0: P = π = 0.130 (null hypothesis; P is the sample proportion)