How the sample average is distributed
The Central Limit Theorem is one reason why the normal curve is important
to statisticians. It states that the distribution of the sample averages
approximates to a normal curve whatever the distribution of x, provided the
sample is large enough. What does this mean? In Georgina’s case, x represents
the distance from the workplace and x̄ is the average of a sample. The
distribution of x in Georgina’s histogram is nothing like a bell-shaped curve,
but the distribution of x̄ is, and it is centred on μ = 20.
This is why we can use the average of a sample, x̄, as an estimate of the
population average μ. The variability of the sample averages is an added
bonus. If the variability of the x values is the standard deviation σ, the
variability of x̄ is σ/√n, where n is the size of the sample we select. The
larger the sample size, the narrower the normal curve will be, and the better
the estimate of μ.
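The σ/√n rule can be checked with a small simulation. The sketch below is only an illustration: Georgina’s actual data are not reproduced here, so it assumes a made-up skewed population (an exponential distribution with average 20, whose standard deviation is also 20). The sample size of 25 and the 5000 repeats are arbitrary choices.

```python
import random
import statistics

# Assumption: a hypothetical skewed population standing in for the
# distances in Georgina's histogram. An exponential distribution with
# average 20 also has standard deviation 20.
POPULATION_MEAN = 20.0
POPULATION_SD = 20.0


def sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n and return the average of each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(trials):
        sample = [rng.expovariate(1.0 / POPULATION_MEAN) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


n = 25
means = sample_means(n=n, trials=5000)

centre = statistics.fmean(means)      # should be close to mu = 20
spread = statistics.stdev(means)      # should be close to sigma / sqrt(n)
predicted = POPULATION_SD / n ** 0.5  # 20 / 5 = 4

print(f"centre of sample averages: {centre:.2f}")
print(f"spread of sample averages: {spread:.2f} (predicted {predicted:.2f})")
```

Increasing n should shrink the spread in line with σ/√n, which is the narrowing of the normal curve described above.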
Other normal curves
Let’s do a simple experiment. We’ll toss a coin four times. The chance of
throwing a head each time is p = ½. The result for the four throws can be
recorded using H for heads and T for tails, arranged in the order in which they
occur. Altogether there are 16 possible outcomes. For example, we might obtain
three heads in the outcome THHH. There are in fact four possible outcomes
giving three heads (the others are HTHH, HHTH, HHHT) so the probability of
three heads is 4/16 = 0.25.
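The counting above can be confirmed by listing every outcome. This is a minimal sketch using only Python’s standard library; the variable names are chosen purely for illustration.

```python
from collections import Counter
from itertools import product

# Every possible sequence of four throws, e.g. "THHH".
outcomes = ["".join(throws) for throws in product("HT", repeat=4)]

# Tally how many outcomes give 0, 1, 2, 3 or 4 heads.
heads_count = Counter(outcome.count("H") for outcome in outcomes)

# The outcomes with exactly three heads, and their probability.
three_heads = [outcome for outcome in outcomes if outcome.count("H") == 3]
prob_three_heads = len(three_heads) / len(outcomes)

print(len(outcomes))      # 16
print(three_heads)        # ['HHHT', 'HHTH', 'HTHH', 'THHH']
print(prob_three_heads)   # 0.25
```

Because each throw is equally likely to be a head or a tail, all 16 outcomes are equally likely, so a probability is simply the number of matching outcomes divided by 16.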
With a small number of throws, the probabilities are easily calculated and
placed in a table, and we can also calculate how the probabilities are distributed.
The row giving the number of combinations can be found from Pascal’s triangle
(see page 52):