column 1 and a random sample for column 2, but all from one population. In this case the
null hypothesis would have to be true because the population proportions who were at
achievement status would be equal, i.e. π1(1) = π1(2). The extent of any difference between
sample proportions P1(1) and P1(2) would be attributable to sampling variability. The χ^2
statistic could be computed from sample data arranged in a two-way table similar to that
shown in Table 4.1. If the sampling was repeated an infinite number of times, under the
same conditions, we could plot all the values of the obtained χ^2 statistic. This would give
the sampling distribution for χ^2 when H0 is true for a fixed sample size. We could now
select a random sample of data, of the same size as before, compute a χ^2 value, and
compare this with what we would expect from the sampling distribution. We could reject
the null hypothesis if our χ^2 value was larger than we would reasonably expect when the
null hypothesis is true, that is, if it fell in the extreme upper tail of the sampling
distribution.
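The repeated-sampling argument above can be sketched as a small simulation. This is only an illustration under stated assumptions: an infinite number of repetitions is replaced by 5,000, the population proportion (0.4), the column sample size (50) and the 'observed' table are all hypothetical, and the function names are invented for this sketch.

```python
import random

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0  # a degenerate table carries no evidence against H0
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

def simulate_null(n_per_column, pi, n_reps, rng):
    """Draw both columns from ONE population (so H0 is true) and collect chi-square values."""
    values = []
    for _ in range(n_reps):
        # Number at achievement status in each column, same population proportion pi
        s1 = sum(rng.random() < pi for _ in range(n_per_column))
        s2 = sum(rng.random() < pi for _ in range(n_per_column))
        values.append(chi_square_2x2(s1, s2, n_per_column - s1, n_per_column - s2))
    return values

rng = random.Random(1)  # fixed seed so the sketch is repeatable
null_values = simulate_null(n_per_column=50, pi=0.4, n_reps=5000, rng=rng)

# Hypothetical observed table: 30 of 50 in column 1 and 15 of 50 in column 2
observed = chi_square_2x2(30, 15, 20, 35)

# Share of simulated values at least as large as the observed one
p_value = sum(v >= observed for v in null_values) / len(null_values)
print(f"observed chi-square = {observed:.2f}, simulated p = {p_value:.4f}")
```

The list `null_values` plays the role of the sampling distribution of χ^2 when H0 is true; the observed value is judged by how far into the upper tail of that list it falls.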
One other point worth noting is that there is not one χ^2 distribution, but a whole
family of χ^2 distributions which are described by a single parameter, the degrees of
freedom (see p. 70 for an explanation of this concept). Every time we change the degrees
of freedom we have to use a different sampling distribution. Fortunately, theoretical
sampling distributions have been evaluated for all reasonable degrees of freedom, and
these are the χ^2 tables presented in the appendices of many statistical texts.
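To see how the sampling distribution changes with the degrees of freedom, the following sketch estimates the 5 per cent upper critical value for three members of the χ^2 family. It relies on the standard result that a χ^2 variable with k degrees of freedom is the sum of k squared standard normal variables; the helper names, the seed and the number of draws are assumptions made for illustration, and the simulated values only approximate the tabulated ones (3.84, 5.99 and 11.07 for 1, 2 and 5 degrees of freedom).

```python
import random

def simulate_chi_square(df, n_reps, rng):
    """Each draw is a sum of df squared standard normal values."""
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(n_reps)]

def upper_critical_value(values, alpha=0.05):
    """Empirical value that cuts off roughly the top alpha share of the draws."""
    ordered = sorted(values)
    return ordered[int((1 - alpha) * len(ordered))]

rng = random.Random(2)  # fixed seed so the sketch is repeatable
critical = {}
for df in (1, 2, 5):
    draws = simulate_chi_square(df, n_reps=20000, rng=rng)
    critical[df] = upper_critical_value(draws)
    print(f"df = {df}: simulated 5% critical value is about {critical[df]:.2f}")
```

The critical value grows as the degrees of freedom increase, which is why a χ^2 statistic must always be referred to the distribution with the matching degrees of freedom.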
4.4 Discrete Random Variables
The way in which we assign probability to all possible outcomes of a random variable
depends upon whether the random variable is discrete or continuous. This distinction is
important because it will influence the choice of an underlying statistical model for the
data.
A discrete random variable is one whose possible values can be counted,
for example, the number of girls in a year four class, the
number of questionnaires returned in a survey, the number of experimental tasks you
write into your research submission. It would not make sense to count half a person, a
proportion of a questionnaire (unless this was part of the study design) or some fraction
of an experimental task. The distinction between discrete and continuous is not always
clear cut in practice. For example, someone with an average IQ may have a score of
about 100 or 101 but not 101.5. IQ is therefore discrete in a measurement sense. It is,
however, almost always treated as a continuous measure. The reason is that IQ is
supposed to measure an underlying and theoretically continuous dimension of
intelligence. Many readers will be aware that the meaning of intelligence, and what
IQ-like tests measure, has been, and still is, an issue of considerable debate.
The probability distribution of a discrete random variable (unlike a continuous
probability distribution; see ‘Continuous random variables’, pp. 104–08) has a
probability attached to each and every possible outcome. If we plot a probability
distribution for a discrete random variable it is similar to a relative frequency bar chart.
We met this in the previous chapter when describing distributions of variables. The only
difference is that we replace the relative frequency of an outcome with a probability
value.
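As a concrete illustration, the sketch below lists a probability for every possible outcome of one discrete random variable and draws a crude text bar chart. The scenario is hypothetical: five questionnaires are sent out and each is assumed to be returned independently with probability 0.6, so the number returned is modelled as binomial. That model, and the function name, are assumptions made for the example only.

```python
from math import comb

def binomial_pmf(n, p):
    """Probability of each possible count 0..n of 'successes' in n independent trials."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

# Hypothetical example: 5 questionnaires sent, each returned with probability 0.6
distribution = binomial_pmf(n=5, p=0.6)

for k, prob in distribution.items():
    bar = "#" * round(prob * 40)  # one bar per outcome, length proportional to probability
    print(f"{k} returned: {prob:.4f} {bar}")

# Every possible outcome has a probability attached, and together they sum to 1
total = sum(distribution.values())
print(f"sum of probabilities = {total:.4f}")
```

Each bar corresponds to one countable outcome (0 to 5 questionnaires returned), just as each bar of a relative frequency chart corresponds to one observed value.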
Probability and inference 93