choose alpha, your cut-off significance level, at p ≤ 0.05. Even if there were in reality no
difference between two treatment groups, a hypothesis test based on formal statistical
inference will yield a significant difference by chance 5 per cent of the time. If you were to
conduct 5 t-tests on your sample and in reality there were no differences between
treatments, you would detect a significant difference (a spurious treatment effect) with
probability of about 0.23 (1−(0.95)^5); that is, you would have a 23 per cent chance of
detecting a difference somewhere. If you conduct 10 statistical tests, then you will have
about a 40 per cent chance of detecting a significant difference even if none exists.
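This arithmetic can be checked in a few lines of Python (an illustrative sketch, not part of the text); the function name is our own:

```python
def familywise_error(alpha, k):
    """Probability of at least one spurious 'significant' result
    across k independent tests, each conducted at level alpha."""
    return 1 - (1 - alpha) ** k

# Five tests at alpha = 0.05: about a 23 per cent chance overall
print(round(familywise_error(0.05, 5), 2))   # 0.23
# Ten tests: about a 40 per cent chance
print(round(familywise_error(0.05, 10), 2))  # 0.4
```

Note that this calculation assumes the tests are independent; with correlated tests the inflation is smaller, but the overall error rate still exceeds 5 per cent.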
Additional hypotheses may be explored but multiple significance tests should
generally be avoided or adjusted for. If we deem it necessary to make several
comparisons, then we should reduce the significance level for each comparison to
make the overall experiment error rate equal to 5 per cent. For example, if we
want alpha to be 5 per cent and we make five comparisons, the cut-off p value for each
test should be 0.01 (0.05/5). So p ≤ 0.01 is the cut-off point for attainment of
statistical significance at the 5 per cent level. There are special procedures for
post hoc t-tests following a significant F-test (see Chapter 8).
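The adjustment described above (dividing alpha by the number of comparisons, usually called the Bonferroni correction) can be sketched as follows; the function name is our own:

```python
def bonferroni_alpha(alpha, k):
    """Per-comparison significance cut-off that keeps the
    overall (familywise) error rate at approximately alpha."""
    return alpha / k

# Five comparisons at an overall 5 per cent level:
print(round(bonferroni_alpha(0.05, 5), 3))  # 0.01
```

A comparison is then declared significant only if its p value falls at or below this adjusted cut-off.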
- Do not confuse statistical significance with educational or clinical significance.
5.4 Statistical Power
Whenever we conduct a statistical test of a null hypothesis we run the risk of making
either a Type I error, α (the probability of attaining statistical significance falsely), or a Type
II error, β (the probability of not detecting a population difference when one exists). For an
explanation of Type I and Type II errors see Chapter 4. In this section we will consider
how to influence, indirectly, the probability of making a Type II error and in so doing
control the statistical power of a test. The power of a statistical test, 1−β, is the
probability that statistical significance will be attained (we reject the null hypothesis) given
that there is a real difference or relationship to detect (that is, H0 is false). Put simply,
statistical power is the ability to detect a relationship or a real difference should one exist.
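As a rough numerical illustration of power (our own sketch, not a procedure from the text), the power of a two-sided, two-sample test can be approximated with the normal distribution, assuming a 5 per cent significance level:

```python
from math import erf, sqrt

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def power_two_sample(d, n):
    """Approximate power of a two-sided, two-sample test at alpha = 0.05:
    d is the standardised effect size (mean difference / sd),
    n is the number of participants per group.
    Normal approximation to the t-test (illustrative only)."""
    z_crit = 1.96                      # two-sided critical value, alpha = 0.05
    noncentrality = d * sqrt(n / 2)    # expected z under the alternative
    return (1 - phi(z_crit - noncentrality)) + phi(-z_crit - noncentrality)

# A medium effect (d = 0.5) with 64 per group gives roughly 80 per cent power
print(round(power_two_sample(0.5, 64), 2))  # 0.81
```

Exact power calculations use the noncentral t distribution, but the normal approximation shown here is close for moderate sample sizes.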
Sensitivity and Precision
When planning a study we usually refer to the sensitivity of an experimental design or the
precision of a survey design. Sensitivity refers to the likelihood that a real treatment
effect, if present, will be detected. When we refer to a significant difference or a
significant treatment effect, we mean that the experimental design is sufficiently sensitive to
detect a statistically significant and meaningful difference between treatments. In survey
design, precision refers to the probable accuracy of a sample estimate. The precision of a
sample estimator (a method for estimating a population parameter from sample
data, for example, a sample mean) is influenced by the sample size and the variability in
the population.
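The dependence of precision on sample size and population variability can be seen in the standard error of a sample mean (a small illustrative sketch; the population standard deviation of 15 is a hypothetical value):

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of a sample mean: precision improves (SE shrinks)
    as the sample size n grows and as population variability sd falls."""
    return sd / sqrt(n)

# Hypothetical population sd of 15 (an IQ-style scale)
print(round(standard_error(15, 25), 1))   # 3.0
print(round(standard_error(15, 100), 1))  # 1.5
```

Quadrupling the sample size halves the standard error, which is why large precision gains become increasingly expensive to obtain.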
Attainment of statistical significance, effect size or treatment effect (that is, the
magnitude of any detectable difference), and statistical power are closely related.
Generally larger treatment effects are easier to detect than smaller treatment effects, other
things being equal. Statistical power analysis is an important part of research planning.
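The claim that larger effects are easier to detect, other things being equal, can be illustrated with a simple normal-approximation power sketch (our own illustration, assuming a two-sided test at the 5 per cent level):

```python
from math import erf, sqrt

def approx_power(d, n):
    """Normal-approximation power of a two-sided, two-sample test
    at alpha = 0.05, for effect size d and n per group.
    Ignores the negligible opposite-tail rejection probability."""
    z = d * sqrt(n / 2) - 1.96
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

# With n = 50 per group, power rises steeply with effect size:
for d in (0.2, 0.5, 0.8):
    print(d, round(approx_power(d, 50), 2))
```

With 50 participants per group a small effect (d = 0.2) yields power well under 50 per cent, while a large effect (d = 0.8) is detected almost every time; this is why a power analysis at the planning stage needs a realistic estimate of the expected effect size.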