Statistical Analysis for Education and Psychology Researchers

Using Confidence Intervals for Significance Testing

In a typical between-subjects design to see whether there is any difference in mean test
anxiety scores between boys and girls, the null and alternative hypotheses might be:
H₀: μ₁ = μ₂
H₁: μ₁ ≠ μ₂


If the 95 per cent confidence interval for the difference in means, μ₁ − μ₂, does not include
zero, then we reject the null hypothesis at the 5 per cent level and conclude that there is a
significant difference in mean test anxiety scores between boys and girls. As Gardner and
Altman (1990) state,


The excessive use of hypothesis testing at the expense of more
informative approaches to data interpretation is an unsatisfactory way of
assessing and presenting statistical findings... We prefer the use of
confidence intervals, which present the results directly on the scale of data
measurement (pp. 15–16).

Gardner and Altman’s book, Statistics with Confidence, although written with medical
researchers in mind, has much to offer the social science researcher. The authors present
in a very readable fashion worked examples for calculating confidence intervals with
parametric and nonparametric data.
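
As an illustration of the confidence interval approach described above, the following is a minimal sketch in Python rather than a worked example from Gardner and Altman. It assumes two independent groups with roughly equal variances (a pooled-variance interval), and the scores, the function name diff_ci, and the variable names are all hypothetical. The critical value of t is read from a t table for the appropriate degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def diff_ci(group1, group2, t_crit):
    """Return (difference in means, lower limit, upper limit).

    Assumes independent groups with equal population variances,
    so a pooled variance estimate is used.
    """
    n1, n2 = len(group1), len(group2)
    # Pooled sample variance, weighted by each group's degrees of freedom
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    # Standard error of the difference between the two means
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean(group1) - mean(group2)
    return diff, diff - t_crit * se, diff + t_crit * se


# Hypothetical test anxiety scores, invented for illustration only
boys = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
girls = [16, 18, 15, 17, 19, 16, 18, 17, 15, 19]

# Two-tailed 5 per cent critical value of t for df = 10 + 10 - 2 = 18
T_CRIT = 2.101

diff, lower, upper = diff_ci(boys, girls, T_CRIT)

# Reject the null hypothesis only if the interval excludes zero
reject_null = not (lower <= 0 <= upper)

print(f"Difference in means: {diff:.2f}")
print(f"95 per cent CI: ({lower:.2f}, {upper:.2f})")
print(f"Reject null hypothesis: {reject_null}")
```

With these invented scores the whole interval lies below zero, so the null hypothesis of no difference would be rejected at the 5 per cent level. The interval also shows the plausible size of the difference on the original measurement scale.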


Significance Tests—Some Caveats:


  • Remember that a 5 per cent significance level is a statement about conditional
    probability. It means that, given the null hypothesis is true, significant results (and
    consequent rejection of H₀) would occur only 5 times out of every 100 tests of a true
    null hypothesis; that is, such results would be unlikely.

  • Report both confidence intervals and p-values.

  • Beware of outliers (a few outliers can produce significant results).

  • Lack of statistical significance may be important—do not ignore it.

  • When interpreting treatment effectiveness research you should report the attained
    statistical power of your test (applicable for parametric procedures).

  • There should be a probability model for the data if formal statistical inference is used.

  • Statistical inference as referred to in this chapter should not be used when data is
    collected haphazardly or is biased.

  • Generally avoid fishing expeditions; that is, do not go searching for statistical
    significance. Decide on your hypotheses at the design stage. Set a level of
    significance in advance, such as p≤0.05, but use this as a guide to satisfactory
    evidence (i.e., can I reject the null hypothesis?) rather than as an absolute decision
    rule for the outcome of a statistical test. Only when many studies and statistical tests
    have been completed on independent samples is there really sufficient evidence in
    favour of a decision. (This is the specialist topic of meta-analysis, the synthesis of
    results of many tests of significance.)

  • Once data has been collected it is easy to conduct many statistical tests without thinking
    about underlying assumptions and in particular your (one) sample. Suppose you

