The first three are for the performance of a class of first-year chemistry students.
Figure (a) shows the total chemistry marks (vertically) against the initial letter of the
surnames of the students (horizontally). No correlation is expected, and none is found,
with $\rho_{x,y} = 0.03$. There is some correlation between the total chemistry marks and
the maths marks, with $\rho_{x,y} = 0.52$, and greater correlation ($\rho_{x,y} = 0.76$) between the
inorganic and organic chemistry marks. Figure (d) is typical of the results obtained
when a linear functional relation is expected to exist between the variables (a high
correlation coefficient may also be obtained even when no such direct functional
relation exists).
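
As a minimal sketch of how a correlation coefficient such as those quoted above might be evaluated, the following Python snippet computes $\rho_{x,y}$ as the covariance divided by the product of the standard deviations. The function name `correlation` and the data values are made up for illustration; they are not the students' marks.

```python
import numpy as np

def correlation(x, y):
    """Sample correlation coefficient: covariance / (std_x * std_y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Illustrative values only (hypothetical data)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_linear = [2.1, 3.9, 6.2, 7.8, 10.1]   # close to a straight line in x
y_curved = [1.0, 4.0, 9.0, 16.0, 25.0]  # y = x**2, not a straight line

print(correlation(x, y_linear))
print(correlation(x, y_curved))
```

Both data sets give a coefficient close to 1, although only the first is close to a straight line, so a large value on its own should be read with some care.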
21.10 Least squares
Many experiments in the sciences are performed to measure pairs of physical quantities
that are known, or are suspected, to be functionally related; that is, there exists some
function $y = f(x)$. One aim of the experiment is then to confirm, or determine, the
relation. Such experiments are often designed in such a way that one of the quantities,
$x$ say, can be measured precisely (with negligible error), with the errors confined to $y$;
for example, $x$ might be the time in a kinetics experiment and $y$ a concentration.
We consider a sample of $N$ data points, $(x_i, y_i)$, $i = 1, 2, \ldots, N$, in which the values
of $x$ are assumed to be precise and with each value of $y$ is associated a measure of
precision, $\sigma_i$ for $y_i$. If the errors in $y$ are random then $\sigma_i$ is the standard deviation (or
an estimate of the standard deviation) of the normal distribution to which $y_i$ belongs.
Then, $y_i \pm \sigma_i$ means that about 68% of measurements of $y$, when $x = x_i$, lie within $\sigma_i$
of the mean $F_i$. Figure 21.10 shows a typical plot of such a set of data points.
Each point is plotted with appropriate error bar, and the dashed line indicates that the
points can be fitted to a straight line
$$y = y(x; m, c) = mx + c \tag{21.48}$$
where $m$ and $c$ are parameters that determine the slope and intercept of the line. More
generally, the problem is to fit the data points to a (simple) model function
$$y = y(x; a_1, a_2, \ldots, a_k) = y(x; \mathbf{a}) \tag{21.49}$$
with $k$ adjustable parameters $\mathbf{a} = (a_1, a_2, \ldots, a_k)$ to be determined by some criteria of
‘best fit’.
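
To make the idea of adjustable parameters concrete, here is a minimal Python sketch of fitting the straight line of equation (21.48) to data with error bars. It assumes one widely used criterion of best fit, the minimum of the weighted sum of squared residuals $\sum_i [(y_i - y(x_i; m, c))/\sigma_i]^2$; the text above has not yet specified a criterion, so this choice is an assumption. The names `line` and `chi_squared` and the data values are hypothetical.

```python
import numpy as np

def line(x, m, c):
    """Straight-line model y(x; m, c) = m*x + c, as in equation (21.48)."""
    return m * x + c

def chi_squared(params, x, y, sigma, model=line):
    """Weighted sum of squared residuals (an assumed best-fit criterion)."""
    return float(np.sum(((y - model(x, *params)) / sigma) ** 2))

# Made-up data for illustration: roughly y = 2x + 1 with small offsets
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
sigma = np.array([0.1, 0.1, 0.2, 0.2, 0.1])  # error bar on each y value

# np.polyfit minimises sum((w_i * residual_i)**2), so the weights are 1/sigma
m, c = np.polyfit(x, y, 1, w=1.0 / sigma)

print("slope m =", m)
print("intercept c =", c)
print("weighted sum of squares =", chi_squared((m, c), x, y, sigma))
```

Points with smaller $\sigma_i$ carry more weight in the sum, so the fitted line is pulled more strongly towards the more precise measurements.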
[Figure 21.10: plot of y against x, showing data points with error bars and a dashed straight line through them]
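
The meaning of an error bar $y_i \pm \sigma_i$ can be checked numerically. The sketch below simulates repeated measurements of $y$ at a single $x_i$, assuming normally distributed errors, and counts the fraction lying within $\sigma_i$ of the mean. The values of `F_i` and `sigma_i` are hypothetical, and the fixed seed is there only to make the run reproducible.

```python
import math
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed, for reproducibility only

F_i = 5.0      # hypothetical mean value of y at x = x_i
sigma_i = 0.3  # hypothetical standard deviation of the errors in y

# Simulate many repeated measurements of y at the same x_i
y = rng.normal(loc=F_i, scale=sigma_i, size=100_000)

# Fraction of simulated measurements within one sigma_i of the mean
fraction = float(np.mean(np.abs(y - F_i) <= sigma_i))

# Exact probability for a normal distribution: erf(1/sqrt(2))
exact = math.erf(1.0 / math.sqrt(2.0))

print(f"simulated fraction within one sigma: {fraction:.3f}")
print(f"exact normal probability:            {exact:.4f}")
```

The simulated fraction should lie close to the exact normal probability of about 0.68, which is the origin of the "about 68%" statement.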