is the probability density for the error \(\varepsilon_i = y_i - f(x_i)\). If the errors are independent, the
total probability density for the \(N\) values \(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_N\) is the product given by equation (21.61).
When the true function f(x) is replaced by an approximate function, the expression
(21.61) is a measure of the likelihood of the fit, and the likelihood is maximized when
\(\chi^2\) is as small as possible.
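The link between likelihood and \(\chi^2\) can be checked numerically. The following is a minimal sketch, not a fitting routine: it assumes normally distributed, independent errors and takes \(\chi^2\) to be the sum of the squared errors divided by their variances. The data values, the standard deviations and the two trial lines `good_fit` and `poor_fit` are hypothetical, chosen only for illustration.

```python
import math

def gaussian_density(eps, sigma):
    """Normal probability density of an error eps with standard deviation sigma."""
    return math.exp(-eps**2 / (2.0 * sigma**2)) / math.sqrt(2.0 * math.pi * sigma**2)

def chi_squared(xs, ys, sigmas, f):
    """Sum of (error / sigma)^2, where each error is y_i - f(x_i)."""
    return sum(((y - f(x)) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))

def likelihood(xs, ys, sigmas, f):
    """Product of the individual error densities (errors assumed independent)."""
    return math.prod(gaussian_density(y - f(x), s) for x, y, s in zip(xs, ys, sigmas))

def prefactor(sigmas):
    """Product of the normalization factors 1 / sqrt(2 pi sigma_i^2)."""
    return math.prod(1.0 / math.sqrt(2.0 * math.pi * s**2) for s in sigmas)

# Hypothetical data scattered about the line y = 2x + 1 (illustrative values only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
sigmas = [0.2, 0.2, 0.3, 0.3, 0.2]

def good_fit(x):
    """Trial function close to the data."""
    return 2.0 * x + 1.0

def poor_fit(x):
    """Trial function further from the data."""
    return 1.5 * x + 2.0

for name, f in [("good fit", good_fit), ("poor fit", poor_fit)]:
    chi2 = chi_squared(xs, ys, sigmas, f)
    like = likelihood(xs, ys, sigmas, f)
    # The product of densities should agree with prefactor * exp(-chi2 / 2).
    print(name, "chi^2 =", chi2, "likelihood =", like,
          "prefactor*exp(-chi^2/2) =", prefactor(sigmas) * math.exp(-chi2 / 2.0))
```

Because the prefactor depends only on the standard deviations and not on the trial function, the function with the smaller \(\chi^2\) has the larger likelihood.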
Regression
The fitting of data points to functions is often called regression by statisticians;
the curves about which points are clustered are called ‘regression curves’, and their
equations are ‘regression equations’. For a straight line, the regression is called linear.
The name has its origin in Galton’s studies on heredity in which he found that, on
average, fathers whose heights deviate from the mean height of all fathers have sons
whose heights deviate from the mean height of all sons by lesser amounts.⁶ He called
this ‘regression to mediocrity’, and the word is widely used, particularly in the social
sciences. The type of data of interest in the physical sciences, and the purpose of
its statistical analysis, are of a quite different nature. The word regression is entirely
inappropriate as a description of the use of statistics in chemistry.
21.11 Sample statistics
In any practical experiment the number of measurements, the sample size, is
necessarily finite so that only estimates of the parent population or distribution are
obtained from the statistics of the sample. Thus, the sample mean defined in equation
(21.2) gives an estimate of the population mean,
\[
\mu \approx \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \tag{21.62}
\]
and the variance and standard deviation defined in equations (21.5) and (21.6) are
estimates of the variance and standard deviation of the population,
\[
\sigma^2 \approx s^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2 \tag{21.63}
\]
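\[
\rho(\varepsilon_1)\,\rho(\varepsilon_2) \cdots \rho(\varepsilon_N)
= \left[ \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_i^2}} \right]
  \exp\left[ -\frac{1}{2} \sum_{i=1}^{N} \left( \frac{\varepsilon_i}{\sigma_i} \right)^2 \right]
= \left[ \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_i^2}} \right]
  \exp\left( -\tfrac{1}{2} \chi^2 \right)
\tag{21.61}
\]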
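The sample estimates of the population mean and variance can be illustrated with a short calculation. This is a sketch under stated assumptions: the variance uses the divisor N, as the sample variance is defined here, and the population parameters `MU` and `SIGMA`, the random seed and the sample sizes are hypothetical values chosen for illustration.

```python
import math
import random

def sample_mean(xs):
    """Sample mean: (1/N) * sum of x_i, used as an estimate of the population mean."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance with divisor N: (1/N) * sum of (x_i - mean)^2."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def sample_std(xs):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

# Hypothetical normal population (illustrative parameters only).
MU, SIGMA = 10.0, 2.0
rng = random.Random(0)  # fixed seed so that the run is reproducible

for n in (10, 100, 10000):
    sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
    print("N =", n, "mean =", sample_mean(sample), "std =", sample_std(sample))
```

For a finite sample the printed values only approximate `MU` and `SIGMA`; the estimates are expected to lie closer to the population values as the sample size grows.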
⁶ Francis Galton (1822–1911), born in Birmingham. He discovered the importance of anti-cyclones in weather
systems and was influential in the establishment of the Meteorological Office and the National Physical
Laboratory. Best known for his work on heredity, he coined the word ‘eugenics’ and advocated the application of
scientific breeding to human populations.