598 Chapter 21 Probability and statistics
measurements of a physical quantity have random errors, with positive and negative
errors equally likely and with large errors less probable than small errors. The ‘S’ shape
of the cumulative frequency graph is typical of such symmetrical distributions.
Exercise 2
The pictorial and graphical representations of data provide qualitative information
about the centre of a distribution, its width, and its shape. These properties are readily
quantified in terms of a small number of computed distribution statistics such as the
mean, the standard deviation, and the skewness.
We consider an experiment in which the possible results (outcomes) are represented
by the discrete variable x whose values form a set of k values $\{x_1, x_2, \ldots, x_k\}$
(the population or sample space). If the possible results are any values in a continuous
range then the set represents k class intervals. In a particular experiment, let the
results of N measurements of x (a sample of the population) consist of $n_1$ values
of $x_1$, $n_2$ of $x_2$, $\ldots$, $n_k$ of $x_k$.
The total number of measurements (the sample size) is then given by equation (21.1).
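The bookkeeping behind equation (21.1) can be illustrated with a short Python sketch. The ten data values below are made up for illustration and are not taken from the text; the sketch tallies the frequencies of each distinct value and confirms that they add up to the sample size.

```python
from collections import Counter

# Hypothetical sample (illustrative values only): N = 10 measurements
# of a discrete variable x.
sample = [2, 3, 3, 1, 2, 3, 4, 2, 3, 1]

# Frequencies n_i: the number of times each distinct value x_i occurs.
frequencies = Counter(sample)

# Sample size N obtained by summing the frequencies over the k distinct values.
N = sum(frequencies.values())

for x_i in sorted(frequencies):
    print(f"x = {x_i}: n = {frequencies[x_i]}")
print("N =", N)
```

Here the sum of the frequencies necessarily equals `len(sample)`, because every measurement is counted exactly once.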
In general, different samples of the same (parent) population have different frequency
distributions; that is, they have different sets of frequencies $\{n_1, n_2, \ldots, n_k\}$. Each
sample distribution is an approximation to the ‘true’ distribution of the parent. Small
samples differ more than large samples, and a fundamental principle underlying
statistics is that the differences between sample distributions are expected to become
small when the sample sizes become large enough, and that sample distributions tend
to the distribution of the parent population as $N \to \infty$. This is sometimes called the
law of large numbers.
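The tendency of sample distributions towards the parent distribution can be explored numerically. The sketch below is only an illustration under stated assumptions: the parent population is taken to be a fair six-sided die (not an example from the text), and the helper `max_deviation` is a hypothetical name introduced here. It measures how far the relative frequencies $n_i/N$ of one random sample lie from the parent probabilities.

```python
import random
from collections import Counter

# Assumed parent population: a fair six-sided die, so each outcome
# has the same "true" probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
parent = {x: 1 / 6 for x in outcomes}

def max_deviation(N, rng):
    """Largest |n_i/N - p_i| over all outcomes for one sample of size N."""
    sample = rng.choices(outcomes, k=N)
    counts = Counter(sample)  # missing outcomes count as zero
    return max(abs(counts[x] / N - parent[x]) for x in outcomes)

rng = random.Random(0)  # fixed seed so that a run is repeatable
for N in (100, 1_000, 10_000, 100_000):
    print(f"N = {N:>7}: max deviation = {max_deviation(N, rng):.4f}")
```

For any single run the deviation need not decrease at every step, but it is expected to shrink roughly as $1/\sqrt{N}$ on average.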
The problems associated with finite sample sizes will be touched upon in Section
21.11. We assume in the meantime that N is large enough for these problems to be
unimportant.
Mean, mode, and median
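The most generally useful measure of the average value of x is the arithmetic mean,
given by equation (21.2), in which the sum is over all the N data values, or equivalently
by equation (21.3), in which the sum is over the k distinct values $x_i$ weighted by their
frequencies $n_i$.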
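$$\sum_{i=1}^{k} n_i = N \qquad (21.1)$$

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (21.2)$$

$$\bar{x} = \frac{1}{N}\left(n_1 x_1 + n_2 x_2 + \cdots + n_k x_k\right) = \frac{1}{N} \sum_{i=1}^{k} n_i x_i \qquad (21.3)$$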
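The two forms of the arithmetic mean, equations (21.2) and (21.3), can be checked against each other with a minimal Python sketch. The data values are hypothetical and chosen only for illustration; the first form sums over all N data values, the second over the distinct values weighted by their frequencies.

```python
from collections import Counter
from math import isclose

# Hypothetical sample (illustrative values only).
sample = [2, 3, 3, 1, 2, 3, 4, 2, 3, 1]
N = len(sample)

# Form (21.2): sum over all N individual data values.
mean_direct = sum(sample) / N

# Form (21.3): sum over the k distinct values x_i, each weighted
# by its frequency n_i.
frequencies = Counter(sample)
mean_weighted = sum(n_i * x_i for x_i, n_i in frequencies.items()) / N

print("mean from all data values:   ", mean_direct)
print("mean from frequency weights: ", mean_weighted)
print("forms agree:", isclose(mean_direct, mean_weighted))
```

The frequency-weighted form is the convenient one when the data arrive already grouped into a frequency table or into class intervals.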