11—Numerical Analysis 284
[Figure: axes x and y]
Do this in two dimensions, fitting the given data to a straight line. To describe the line I’ll use vector notation, where the line is $\vec u + \alpha\vec v$ and the parameter $\alpha$ varies over the reals. First I need to answer the simple question: what is the distance from a point to a line? The perpendicular distance from $\vec w$ to this line requires that
$$ d^2 = \bigl(\vec w - \vec u - \alpha\vec v\bigr)^2 $$
be a minimum. Differentiate this with respect to $\alpha$ and you have
$$ \bigl(\vec w - \vec u - \alpha\vec v\bigr)\cdot\bigl(-\vec v\bigr) = 0 \quad\text{implying}\quad \alpha v^2 = \bigl(\vec w - \vec u\bigr)\cdot\vec v $$
For this value of $\alpha$ what is $d^2$?
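$$ d^2 = \bigl(\vec w - \vec u\bigr)^2 + \alpha^2 v^2 - 2\alpha\,\vec v\cdot\bigl(\vec w - \vec u\bigr) = \bigl(\vec w - \vec u\bigr)^2 - \frac{1}{v^2}\Bigl[\bigl(\vec w - \vec u\bigr)\cdot\vec v\Bigr]^2 \tag{11.54} $$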
Is this plausible? (1) It’s independent of the size of $\vec v$, depending on its direction only. (2) It depends on only the difference vector between $\vec w$ and $\vec u$, not on any other aspect of the vectors. (3) If I add any multiple of $\vec v$ to $\vec u$, the result is unchanged. See problem 11.37. Also, can you find an easier way to get the result? Perhaps one that simply requires some geometric insight?
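As a quick numerical check of Eq. (11.54) and the three plausibility observations, here is a minimal Python sketch, assuming NumPy is available. The function name `dist2_to_line` and the sample vectors are illustrative choices of mine, not anything from the text.

```python
import numpy as np

def dist2_to_line(w, u, v):
    """Squared perpendicular distance from point w to the line u + alpha*v,
    evaluated with the closed form of Eq. (11.54)."""
    w, u, v = (np.asarray(a, dtype=float) for a in (w, u, v))
    diff = w - u
    return diff @ diff - (diff @ v) ** 2 / (v @ v)

# Arbitrary sample vectors (assumed for illustration only)
w = np.array([3.0, 4.0])
u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])

d2 = dist2_to_line(w, u, v)

# Independent check: plug the minimizing alpha back into (w - u - alpha*v)^2
alpha = (w - u) @ v / (v @ v)
r = w - u - alpha * v
d2_direct = r @ r

# The three observations from the text
d2_scaled = dist2_to_line(w, u, 5.0 * v)            # (1) size of v is irrelevant
shift = np.array([10.0, -7.0])
d2_shifted = dist2_to_line(w + shift, u + shift, v)  # (2) only w - u matters
d2_slid = dist2_to_line(w, u + 2.5 * v, v)           # (3) sliding u along v

print(d2, d2_direct, d2_scaled, d2_shifted, d2_slid)
```

All five printed values should agree to rounding error, which is what the formula predicts for this choice of vectors.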
The data that I’m trying to fit will be described by a set of vectors $\vec w_i$, and the sum of the distances squared to the line is
$$ D^2 = \sum_{i=1}^{N} \bigl(\vec w_i - \vec u\bigr)^2 - \sum_{i=1}^{N} \frac{1}{v^2}\Bigl[\bigl(\vec w_i - \vec u\bigr)\cdot\vec v\Bigr]^2 $$
Now to minimize this among all $\vec u$ and $\vec v$ I’ll first take advantage of some of the observations from the preceding paragraph. Because the magnitude of $\vec v$ does not matter, I’ll make it a unit vector.
$$ D^2 = \sum \bigl(\vec w_i - \vec u\bigr)^2 - \sum \Bigl[\bigl(\vec w_i - \vec u\bigr)\cdot\hat v\Bigr]^2 \tag{11.55} $$
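To make Eq. (11.55) concrete, the following is a small Python sketch, assuming NumPy, that evaluates $D^2$ for a given trial line. The name `sum_dist2` and the four data points are hypothetical, chosen only so the answer is easy to verify by hand.

```python
import numpy as np

def sum_dist2(points, u, v):
    """Sum of squared perpendicular distances from the rows of `points`
    to the line u + alpha*v, following Eq. (11.55).
    v is normalized internally, so any nonzero direction vector works."""
    points = np.asarray(points, dtype=float)
    u = np.asarray(u, dtype=float)
    v_hat = np.asarray(v, dtype=float)
    v_hat = v_hat / np.linalg.norm(v_hat)   # unit vector along the line
    diff = points - u                       # row i is w_i - u
    along = diff @ v_hat                    # (w_i - u) . v_hat for each i
    return np.sum(diff * diff) - np.sum(along ** 2)

# Made-up data: three points on the line y = x and one point off it
pts = np.array([[0.0, 0.0],
                [1.0, 1.0],
                [2.0, 2.0],
                [1.0, 0.0]])

D2 = sum_dist2(pts, u=[0.0, 0.0], v=[1.0, 1.0])
print(D2)
```

For the trial line $y = x$ the three collinear points contribute nothing, and the point $(1, 0)$ lies a distance $1/\sqrt2$ from the line, so $D^2$ should come out as $1/2$ up to rounding. This only evaluates $D^2$; minimizing it over $\vec u$ and $\hat v$ is a separate step.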
Now to figure out $\vec u$: Note that I expect the best fit line to go somewhere through the middle of the set of data points, so move the origin to the “center of mass” of the points.
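$$ \vec w_{\text{mean}} = \sum \vec w_i / N \quad\text{and let}\quad \vec w_i{}' = \vec w_i - \vec w_{\text{mean}} \quad\text{and}\quad \vec u\,' = \vec u - \vec w_{\text{mean}} $$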
then the sum