the probability of its doing so were really only 50% because there are precisely
two possible outcomes, most of us would not bother to buy an automobile. Surely
Poisson was assuming some kind of symmetry that would allow the imagination to
assign equal likelihoods to the outcomes, and intending the theory to be applied
only in those cases. Still, in the presence of ignorance of causes, equal probabilities
seem to be a reasonable starting point. The law of entropy in thermodynamics, for
example, can be deduced as a tendency for an isolated system to evolve to a state
of maximum probability, and maximum probability means the maximum number
of equally likely states for each particle.
1.11. Large numbers and limit theorems. The idea of the law of large num-
bers was stated imprecisely by Cardano and with more precision by Jakob Bernoulli.
To better carry out the computations involved in using it, de Moivre was led to
approximate the binomial distribution with what we now realize was the normal
distribution. He, Laplace, and Gauss all grasped with different degrees of clar-
ity the principle (central limit theorem) that when independent measurements are
averaged, they tend to shape themselves into the bell-shaped curve.
The law of large numbers was given its name in the 1837 work of Poisson just
mentioned. Poisson discovered an approximation to the probability of getting at
most k successes in n trials, valid when n is large and the probability p is small.
He thereby introduced what is now known as the Poisson distribution, in which the
probability of k successes is given by
$$ e^{-\lambda} \frac{\lambda^k}{k!}, $$
where \lambda = np is the expected number of successes.
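As a rough numerical illustration of how close the approximation is, the following sketch compares the exact binomial probability of k successes in n trials with the Poisson value e^{-λ} λ^k / k! for λ = np. The function names and the sample values n = 1000, p = 0.003 are assumptions chosen for illustration; they do not come from Poisson's work.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability e^{-lam} * lam^k / k! of exactly k successes."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_abs_difference(n: int, p: float, k_max: int) -> float:
    """Largest gap between the two probabilities for k = 0, ..., k_max."""
    lam = n * p  # the Poisson parameter is the expected number of successes
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )


if __name__ == "__main__":
    n, p = 1000, 0.003  # illustrative values: many trials, small probability
    for k in range(7):
        print(k, binomial_pmf(k, n, p), poisson_pmf(k, n * p))
    print("largest gap:", max_abs_difference(n, p, 10))
```

Raising p while holding n fixed should make the gap visibly larger, which is one way to see why the approximation is stated only for small p.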
The Russian mathematician Chebyshev introduced the concept of a random
variable and its mathematical expectation. He is best known for his 1846 proof of
the weak law of large numbers for repeated independent trials. That is, he showed
that the probability that the actual proportion of successes will differ from the
expected proportion by less than any specified ε > 0 tends to 1 as the number
of trials increases. In 1867 he proved what is now called Chebyshev's inequality:
The probability that a random variable will assume a value more than [what is now
called] k standard deviations from its mean is at most 1/k^2. This inequality was
published by Chebyshev's friend and translator Irénée-Jules Bienaymé (1796-1878)
and is sometimes called the Chebyshev-Bienaymé inequality (see Heyde and Seneta,
1977). This inequality implies the weak law of large numbers. In 1887 Chebyshev
also gave an explicit statement of the central limit theorem for independent random
variables.
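The step from Chebyshev's inequality to the weak law of large numbers for repeated independent trials can be sketched in a few lines. The notation is an assumption introduced here, not taken from Chebyshev: S_n is the number of successes in n independent trials, each with success probability p (with 0 < p < 1), and ε > 0 is the specified tolerance.

```latex
% Assumed notation: S_n = number of successes in n independent trials,
% each with success probability p; \varepsilon > 0 is the tolerance.
\begin{align*}
\operatorname{E}\!\left[\frac{S_n}{n}\right] &= p, \qquad
\sigma^2 = \operatorname{Var}\!\left[\frac{S_n}{n}\right] = \frac{p(1-p)}{n}, \\
P\!\left(\left|\frac{S_n}{n} - p\right| \ge k\sigma\right) &\le \frac{1}{k^2}
  \quad\text{(Chebyshev's inequality)}, \\
P\!\left(\left|\frac{S_n}{n} - p\right| \ge \varepsilon\right)
  &\le \frac{\sigma^2}{\varepsilon^2}
   = \frac{p(1-p)}{n\varepsilon^2}
   \le \frac{1}{4n\varepsilon^2}
   \quad\text{(taking } k = \varepsilon/\sigma\text{)}.
\end{align*}
```

The right-hand side tends to 0 as n increases, so the probability that the observed proportion differs from p by less than ε tends to 1, which is the weak law as stated above.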
The extension of the law of large numbers to dependent trials was achieved
by Chebyshev's student Andrei Andreevich Markov (1856-1922). The subject of
dependent trials—known as Markov chains—remains an object of current research.
In its simplest form it applies to a system in one of a number of states {S_1, ..., S_n}
which at specified times may change from one state to another. If the probability
of a transition from S_i to S_j is p_{ij}, the matrix
$$ P = \begin{pmatrix} p_{11} & \cdots & p_{1n} \\ \vdots & \ddots & \vdots \\ p_{n1} & \cdots & p_{nn} \end{pmatrix} $$