PROBABILITY 521
Except for using the letter c where we now use e to denote the base of natural loga-
rithms, he had what we now call the central limit theorem for independent uniformly
distributed random variables.
1.8. Legendre. In a treatise on ways of determining the orbits of comets, pub-
lished in 1805, Legendre dealt with the problem that frequently results when obser-
vation meets theory. Theory prescribes a certain number of equations of a certain
form to be satisfied by the observed quantities. These equations involve certain
theoretical parameters that are not observed, but are to be determined by fitting
observations to the theoretical model. Observation provides a large number of em-
pirical, approximate solutions to these equations, and thus normally provides a
number of equations far in excess of the number of parameters to be chosen. If the
law is supposed to be represented by a straight line, for example, only two constants
are to be chosen. But the observed data will normally not lie on the line; instead,
they may cluster around a line. How is the observer to choose canonical values for
the parameters from the observed values of each of the quantities?
Legendre's solution to this problem is now a familiar technique. If the theoretical
equation is y = f(x), where f(x) involves parameters α, β, ..., and one has
data points (x_k, y_k), k = 1, ..., n, sum the squares of the "errors" f(x_k) − y_k to get
an expression in the parameters
\[
E(\alpha, \beta, \dots) = \sum_{k=1}^{n} \bigl(f(x_k) - y_k\bigr)^2,
\]
and then choose the parameters so as to minimize E. For fitting with a straight
line y = ax + b, for example, one needs to choose E(a, b) given by
\[
E(a, b) = \sum_{k=1}^{n} (a x_k + b - y_k)^2
\]
so that
\[
\frac{\partial E}{\partial a} = 0 = \frac{\partial E}{\partial b}.
\]
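For the straight-line case, setting the two partial derivatives to zero gives two linear equations in a and b with a well-known closed-form solution. The following sketch (plain Python; the function name is our own, not Legendre's) carries the prescription out:

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b to points (x_k, y_k).

    Setting dE/da = 0 and dE/db = 0 yields two linear ("normal")
    equations in a and b; solving them gives the closed forms below.
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Points lying exactly on y = 2x + 1 are recovered exactly.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

With noisy data the fitted line no longer passes through the points; it is the line for which E(a, b) is smallest.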
1.9. Gauss. Legendre was not the first to tackle the problem of determining the
most likely value of a quantity x using the results of repeated measurements of it,
say x_k, k = 1, ..., n. In 1799 Laplace had tried the technique of taking the value x
that minimizes the sum of the absolute errors^12 Σ_k |x − x_k|. But still earlier, in 1794
as shown by his diary and correspondence, the teenager Gauss had hit on the least-
squares technique for the same purpose. However, as Reich (1977, p. 56) points
out, Gauss did not consider this discovery very important and did not publish it
until 1809. In 1816 Gauss published a paper on observational errors, in which he
discussed the most probable value of a variable based on a number of observations
of it. His discussion was much more modern in its notation than those that had
gone before, and also much more rigorous. He found the likelihood of an error of
size x to be
\[
\varphi(x) = \frac{h}{\sqrt{\pi}}\, e^{-h^2 x^2},
\]
where h was what he called the measure of precision. He showed how to estimate
this parameter by inverse-probability methods. In modern terms, 1/(√2 h) is the
(^12) This method has the disadvantage that one large error and many small errors count equally.
The least-squares technique avoids that problem.