Whitty & Wilson | 393
figure 36.2 Achieving a normal distribution by summing non-normally distributed data. [Two histograms of frequency against spend (£), in bands 0–40, 40–80, 80–120, 120–160, 160–200, and >200: on the left, the spend of individual customers; on the right, the average spend per 10 customers.]
zero mark and its observed standard deviation is reduced to 1
(there are textbook equations that tell you how to do this), and I then look up in my table to find
the area under the curve beyond whatever 2 m has been reduced to, and read off the value—
perhaps it is 0.0001 (one woman in every 10,000).
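The table lookup can be reproduced directly: standardize the measurement to a z-score and take the tail area of the standard normal curve, which the complementary error function supplies in place of printed tables. The mean and standard deviation below are hypothetical figures, chosen only so that the answer lands near the 0.0001 quoted above.

```python
import math

def normal_tail(x, mean, sd):
    """Probability that a normal variate exceeds x.

    Standardize x to a z-score, then compute the area under the
    standard normal curve beyond z via the complementary error
    function (the job the printed tables used to do).
    """
    z = (x - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical figures: mean height 1.62 m, standard deviation 0.102 m.
p = normal_tail(2.0, 1.62, 0.102)
print(p)  # roughly 0.0001: about one woman in every 10,000
```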
The question that Turing addressed was: how does one make predictions about data that
are not normally distributed, and for which no standardized tables have been drawn up? The
answer is that, while an individual measurement may not be predictable, a sum of n measure-
ments (or an arithmetic mean) is: when n is large enough, summations and means fit a normal
distribution. The idea is illustrated by the example depicted in Fig. 36.2: a ladies’ outfitter can
forecast sales of various sizes of clothes because sizes are normally distributed. The spend per
customer is likely to be more erratically distributed, as shown on the left of the figure; however,
the average spend for every ten customers approximates a normal distribution, as shown on the
right. So, after suitably standardizing the data, our merchant can turn to those well-thumbed
tables of probabilities and the bank manager can be soothed with reliable estimates based on
well-behaved average spending patterns.
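The outfitter's situation can be simulated in a few lines. Assuming (purely for illustration) that individual spends follow a heavily right-skewed exponential distribution, like the left panel of Fig. 36.2, the averages over blocks of ten customers come out much more symmetric, as the right panel suggests:

```python
import random
import statistics

def skewness(xs):
    """Rough sample skewness: the third standardized moment."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

random.seed(1)
# Hypothetical spend distribution: most customers spend little,
# a few spend a great deal (mean £60, strongly right-skewed).
spends = [random.expovariate(1 / 60) for _ in range(10_000)]

# Average spend per block of 10 customers, as in the right panel.
means = [statistics.fmean(spends[i:i + 10]) for i in range(0, 10_000, 10)]

print(round(skewness(spends), 2))  # markedly positive: lopsided
print(round(skewness(means), 2))   # much nearer zero: close to normal
```

Raising the block size from 10 pushes the skewness of the averages still closer to zero, which is the central limit theorem at work.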
This example of finding order in chaos has a long history; a version for coin tossing dates
back to Abraham de Moivre in 1738. But in 1930s Cambridge the general phenomenon was
regarded as what is known in the trade as a ‘folk theorem’. Recent developments in probabil-
ity theory in mainland Europe and Russia were apparently unknown in Cambridge, or were
disregarded. So Turing set himself the task of understanding, from first principles, why the
phenomenon worked, writing in the preface to his dissertation:
My paper originated as an attempt to make rigorous the ‘popular’ proof mentioned in Appendix
B. I first met this proof in a course of lectures by Prof. Eddington.
Eddington’s lectures took place in the autumn of 1933; Turing’s research was concluded,
according to a classic 1995 article by Zabell,^4 ‘no later than February 1934’. In that short space
of time Turing had fast-forwarded through 200 years of the history of probability! He derived
something almost as good as the 1922 version of the central limit theorem due to the Finnish
mathematician Jarl Waldemar Lindeberg (who was himself initially working in ignorance of
earlier Russian work). Lindeberg’s theorem is very general: the n measurements that are being
summed need not even come from the same distribution, so our ladies’ outfitter need not worry