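are satisfied simultaneously. Hence, the constants $\beta_0$ and $\beta_1$ must be selected to
satisfy the simultaneous equations
$$
\begin{aligned}
\frac{\partial E}{\partial \beta_0} &= 2\sum_{i=1}^{n} (\beta_0 + \beta_1 x_i - y_i)\,(1) = 0 \\
\frac{\partial E}{\partial \beta_1} &= 2\sum_{i=1}^{n} (\beta_0 + \beta_1 x_i - y_i)\,(x_i) = 0.
\end{aligned}
\tag{11.89}
$$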
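The equations (11.89) simplify to the $2 \times 2$ linear system of equations
$$
\begin{aligned}
n\,\beta_0 + \left(\sum_{i=1}^{n} x_i\right)\beta_1 &= \sum_{i=1}^{n} y_i \\
\left(\sum_{i=1}^{n} x_i\right)\beta_0 + \left(\sum_{i=1}^{n} x_i^2\right)\beta_1 &= \sum_{i=1}^{n} x_i y_i
\end{aligned}
\tag{11.90}
$$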
which can then be solved for the coefficients $\beta_0$ and $\beta_1$. This gives the "best" least
squares straight line $y = y(x) = \beta_0 + \beta_1 x$.
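The solution of the system (11.90) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_line` and the sample data are hypothetical and not part of the text, and the $2 \times 2$ system is solved by Cramer's rule, which is one of several reasonable choices.

```python
def fit_line(xs, ys):
    """Solve the 2x2 system (11.90) for (beta0, beta1) using Cramer's rule.

    fit_line is a hypothetical helper name chosen for this sketch.
    """
    n = len(xs)
    sx = sum(xs)                                # sum of x_i
    sy = sum(ys)                                # sum of y_i
    sxx = sum(x * x for x in xs)                # sum of x_i^2
    sxy = sum(x * y for x, y in zip(xs, ys))    # sum of x_i * y_i

    # Determinant of the coefficient matrix [[n, sx], [sx, sxx]]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not uniquely determined")

    beta0 = (sy * sxx - sx * sxy) / det
    beta1 = (n * sxy - sx * sy) / det
    return beta0, beta1


# Made-up data lying exactly on the line y = 1 + 2x (an assumption for illustration)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
beta0, beta1 = fit_line(xs, ys)
print(beta0, beta1)  # prints 1.0 2.0
```

Because the sample points are collinear, the fitted line reproduces them exactly. With scattered data the same function returns the coefficients that minimize the sum of squares error.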
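Alternatively, set all of the equations (11.86) equal to zero, to obtain a system
of equations having the matrix form
$$
A\beta = y, \qquad
\begin{bmatrix}
1 & x_1 \\
1 & x_2 \\
1 & x_3 \\
\vdots & \vdots \\
1 & x_n
\end{bmatrix}
\begin{bmatrix}
\beta_0 \\
\beta_1
\end{bmatrix}
=
\begin{bmatrix}
y_1 \\
y_2 \\
y_3 \\
\vdots \\
y_n
\end{bmatrix}.
\tag{11.91}
$$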
In the matrix form (11.91), the set of errors, calculated as the differences between the data
$y$ values and the straight line $y$ values, is represented as an overdetermined system of
equations for determining the constants $\beta_0$ and $\beta_1$. That is, there are more equations
than there are unknowns, and so the unknowns $\beta_0$, $\beta_1$ are selected to minimize the sum
of squares error associated with the overdetermined system of equations. Observe
that left multiplying both sides of equation (11.91) by the transpose matrix $A^T$ gives
the new set of equations $A^T A\beta = A^T y$, or
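$$
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
x_1 & x_2 & x_3 & \cdots & x_n
\end{bmatrix}
\begin{bmatrix}
1 & x_1 \\
1 & x_2 \\
1 & x_3 \\
\vdots & \vdots \\
1 & x_n
\end{bmatrix}
\begin{bmatrix}
\beta_0 \\
\beta_1
\end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
x_1 & x_2 & x_3 & \cdots & x_n
\end{bmatrix}
\begin{bmatrix}
y_1 \\
y_2 \\
y_3 \\
\vdots \\
y_n
\end{bmatrix}
$$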