probability. Probability theory is central to statistical inference and statistical tests
because it enables the effect of chance variation to be accounted for in our decision
making. Put simply, tests of significance are methods for assessing the strength of evidence
against the null hypothesis, and the strength of this evidence is given by the obtained p-value for
the statistical test. The statistic that should be used in a hypothesis test or when estimating
confidence intervals is the statistic that estimates the parameter of interest stated or
implied in the null hypothesis.
Any particular parametric statistic will have a known sampling distribution. Possible
values of the test statistic and associated probability of occurrence under the null
hypothesis are usually tabulated in statistical tables. When data are analyzed and the
computations yield a particular test statistic value (sometimes referred to as the observed
test statistic), this will have an associated probability of occurrence. This observed
probability is compared with a pre-selected probability or alpha level, commonly called
the level of significance of a test (generally 5%, p≤.05, or 1%, p≤.01).
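As a minimal sketch of this comparison, the Python fragment below uses the standard normal distribution in place of a statistical table. It assumes a one-sample z-test with a known population standard deviation, which is only one of many possible tests. The figures (sample mean 103, hypothesised mean 100, standard deviation 15, n=100) and the function name two_sided_p_value are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability, under the null hypothesis, of a z statistic at least this extreme."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical figures, chosen only for illustration.
sample_mean = 103.0
null_mean = 100.0
population_sd = 15.0  # assumed known, which is rarely true in practice
n = 100

standard_error = population_sd / sqrt(n)
z_observed = (sample_mean - null_mean) / standard_error  # observed test statistic
p_observed = two_sided_p_value(z_observed)               # observed probability

# Compare the observed probability with two common levels of significance.
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: p = {p_observed:.4f}, p <= alpha is {p_observed <= alpha}")
```

With these invented figures the observed test statistic is z=2.0 and the observed probability is roughly .046, which falls below the 5% level but not below the 1% level.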
Usually, if the observed probability, p, is less than or equal to the selected alpha level
(the probability of making a Type I error, see Chapter 4), then the null hypothesis is rejected
and statistical significance is attained. It is good practice to state confidence intervals,
such as confidence intervals of a difference for a t-test, as well as the observed test
statistic and associated probability level, and if appropriate, degrees of freedom.
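The reporting practice described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a recommended analysis: the two groups of scores are hypothetical, the helper pooled_t_test is an invented name, the pooled-variance form of the t-test is assumed, and the critical value 2.306 is the tabulated two-sided 5% value of t for 8 degrees of freedom. Comparing the observed t with the tabulated critical value is equivalent to comparing the observed probability with alpha.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_test(a, b):
    """Return (difference in means, standard error, t statistic, degrees of freedom)."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (assumes similar population variances).
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = mean(a) - mean(b)
    return diff, se, diff / se, df


# Hypothetical scores for two independent groups.
group_a = [12.0, 15.0, 11.0, 14.0, 13.0]
group_b = [9.0, 11.0, 10.0, 8.0, 12.0]

diff, se, t_observed, df = pooled_t_test(group_a, group_b)

# Two-sided critical value of t for alpha = .05 and df = 8, read from a t table.
T_CRITICAL_05_DF8 = 2.306

reject_null = abs(t_observed) >= T_CRITICAL_05_DF8
ci_lower = diff - T_CRITICAL_05_DF8 * se
ci_upper = diff + T_CRITICAL_05_DF8 * se

print(f"t({df}) = {t_observed:.2f}, reject null at 5% level: {reject_null}")
print(f"95% CI of the difference: [{ci_lower:.2f}, {ci_upper:.2f}]")
```

For these hypothetical scores the difference in means is 3, the observed statistic is t(8)=3.00, and the 95% confidence interval of the difference runs from about 0.69 to 5.31. The interval excludes zero, which agrees with rejection of the null hypothesis at the 5% level.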
As the formal process of inference is based on the sampling distribution of a chosen
statistic, there should be an underlying statistical or probability model for the statistical
test. For example, the normal probability distribution is a common statistical model that
describes probability distributions of variables and is the basic probability model
underlying a number of statistical tests. Statistical tests whose inferences are based on the
normal distribution are called parametric statistical procedures. Inferences using
parametric statistical procedures are only likely to be valid when four conditions are met:
- observations are independent;
- they are drawn randomly from a population;
- they have continuous levels of measurement (at least in theory);
- and the random errors associated with observations or measures have a known
  distribution (usually normal).
The manner of sampling and the level of measurement of variables in an empirical
investigation therefore influence the validity of the underlying statistical model and
hence the choice of statistical test. The four conditions listed above are a
consequence of the central limit theorem (see Chapter 4).
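To make the reference to the central limit theorem more concrete, the simulation sketch below draws repeated random samples from a population that is not normal and examines the sampling distribution of the mean. Everything in it is an assumption made for illustration: the uniform(0, 1) population, the sample size of 30, the 5000 replicates, the fixed seed, and the function name simulate_sample_means.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_sample_means(n, replicates, seed=1):
    """Means of `replicates` independent random samples of size n from uniform(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.random() for _ in range(n)) / n for _ in range(replicates)]


N = 30
means = simulate_sample_means(n=N, replicates=5000)

expected_mean = 0.5                   # population mean of uniform(0, 1)
expected_se = sqrt(1 / 12) / sqrt(N)  # population SD divided by sqrt(n)

observed_mean = mean(means)
observed_se = stdev(means)

# Share of simulated means within 1.96 standard errors of the population mean.
# A normal sampling distribution would place about 95% of them there.
within = sum(abs(m - expected_mean) <= 1.96 * expected_se for m in means) / len(means)

print(f"mean of sample means: {observed_mean:.3f} (theory {expected_mean})")
print(f"SD of sample means:   {observed_se:.4f} (theory {expected_se:.4f})")
print(f"share within 1.96 SE: {within:.3f}")
```

Although the individual observations are uniformly distributed, the simulated sample means should cluster around the population mean with a spread close to the theoretical standard error, and roughly 95% of them should lie within 1.96 standard errors, as a normal sampling distribution would predict.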
Some statistical tests require that additional assumptions be met; these assumptions
vary in number and degree. Moreover, there is much debate amongst statisticians as to
the conditions under which particular assumptions are important. For example, amongst
the most powerful statistical tests are the t- and F-tests. The t-test (for testing a hypothesis
about the difference between two sample means) requires, in addition to the assumptions
underlying the general parametric model, the condition that the populations from which
the two samples are drawn should have similar variances (the homogeneity of variances
assumption; see the t-test in Chapter 8). Different statistical tests require different assumptions
and the practical implications of these assumptions for research design and analysis are
discussed in later chapters when each statistical test is introduced. Generally, the more