21 Probability and statistics
21.1 Concepts
The practice of statistics, the collection and statistical analysis of data, has become a
ubiquitous activity of everyday life, and the subject of a vast literature. Three broad
areas of statistical activity may be distinguished:
(i) the practical problems of the design of experiments and of the choice and
collection of suitable samples of data (‘good experimental practice’);
(ii) the description and presentation of the data and its analysis in terms of
appropriate theoretical models;
(iii) the use of the analysis to draw conclusions about the nature of the system under
investigation, the quality of the experiment, and the method of collection of data.
Our main concern in this chapter is an introduction to probability theory, the
mathematical theory of statistics.¹ Probability theory provides the theoretical models
and analytical tools for the organization, interpretation, and analysis of statistical
data. Probability theory has additional importance in chemistry in, for example, the
description of the collective behaviour of very large numbers of particles in statistical
mechanics, the quantum mechanical descriptions of changes of state and of rate
processes, the physical interpretation of wave functions, and the enumeration of the
ways of assembling basic chemical units to form large molecules, as in the construction
of polypeptides from amino acid residues. Conventional (‘practical’) statistics is
represented by brief discussions of descriptive statistics in Section 21.2 and the method
of least squares in Section 21.10. There exist several specialist texts on the use of
statistical methods in the physical sciences, and the reader should consult one of these
for a more comprehensive discussion, and for a proper appreciation of the power and
wide range of applications of statistics in the sciences, and in many other fields of study.
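As a small illustration of the enumeration problem mentioned above (counting the ways of building a polypeptide from amino acid residues), the following Python sketch counts the ordered sequences of a given length. It assumes the 20 standard amino acids, written as one-letter codes, and allows any residue at any position; the function name count_sequences is hypothetical and chosen only for this example. Under those assumptions the count for a chain of n residues is 20 to the power n, which the sketch cross-checks by direct enumeration for short chains.

```python
from itertools import product

# Assumption: the 20 standard amino acids, as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_sequences(length: int, alphabet_size: int = len(AMINO_ACIDS)) -> int:
    """Number of ordered sequences of `length` residues, repetition allowed."""
    return alphabet_size ** length


# Cross-check the formula by brute-force enumeration for short chains.
for n in (1, 2, 3):
    enumerated = sum(1 for _ in product(AMINO_ACIDS, repeat=n))
    assert enumerated == count_sequences(n)

print(count_sequences(2))  # dipeptides: 400
print(count_sequences(3))  # tripeptides: 8000
```

The count grows exponentially with chain length, so direct enumeration is practical only for very short chains; the closed-form count is what is used in practice.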
21.2 Descriptive statistics
Table 21.1 shows a set of 50 numbers that represent the results of an experiment that
involves the counting of events.
¹ Probability theory has its origins in ancient times in divination and in games of chance, and has continued to
be used for these purposes to the present day. Oresme used a probability argument to conclude that astrology must
be false. The modern theory arose from the correspondence of Pascal and Fermat in 1654 following the questions
on the results of throwing dice put to Pascal by the gambler de Méré. Pascal’s ideas were included in his Treatise
on the arithmetic triangle. The throwing of dice had been discussed earlier by Cardano in his Liber de ludo aleae
(Book on games of chance) in 1526, but the earliest systematic account was by Huygens in the De ratiociniis in
ludo aleae (Calculations in games of chance) of 1657. The first substantial treatment of mathematical probability
theory was Jakob Bernoulli’s Ars conjectandi (The art of conjecture), published in 1713, in which he also proposed
the application of probabilities in the social sciences. De Moivre discussed the law of errors and the normal
distribution in the second edition of the Doctrine of chances (1718, 1738, 1756). Euler and d’Alembert wrote on
problems of life expectancy, insurance, lotteries, and others, and the extensive application of statistical methods,
particularly in the social sciences, followed the publication of Laplace’s definitive Théorie analytique des probabilités
in 1812 (‘at the bottom, the theory of probabilities is only common sense in numbers’).