Explicitly,
$$\frac{\partial E}{\partial w_0} = w_1 + w_2 - 2w_0 + 1 = 0 \qquad (1)$$
$$\frac{\partial E}{\partial w_1} = 2w_1 + w_2 - 2w_0 = 0 \qquad (2)$$
$$\frac{\partial E}{\partial w_2} = w_1 + 2w_2 - 2w_0 = 0 \qquad (3)$$
Here, the solution of this system of linear equations is easy to obtain. Indeed, subtracting (3) from (1) yields $w_2 = 1$. With $w_2 = 1$, (2) and (3) yield
$$2w_0 = 2w_1 + 1 = w_1 + 2$$
which implies $w_1 = 1$. Then from (3), we get
$$w_0 = \frac{w_1 + 2w_2}{2} = \frac{3}{2}$$
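As a quick numerical check (not part of the original text), the $3\times 3$ system (1)-(3) can also be solved directly. The sketch below assumes Python with NumPy; the constant term of (1) has been moved to the right-hand side.

```python
import numpy as np

# Coefficient matrix for the unknowns (w0, w1, w2) in equations (1)-(3).
A = np.array([[-2.0, 1.0, 1.0],    # (1): -2*w0 +   w1 +   w2 = -1
              [-2.0, 2.0, 1.0],    # (2): -2*w0 + 2*w1 +   w2 =  0
              [-2.0, 1.0, 2.0]])   # (3): -2*w0 +   w1 + 2*w2 =  0
b = np.array([-1.0, 0.0, 0.0])

w0, w1, w2 = np.linalg.solve(A, b)
print(w0, w1, w2)  # expected: 1.5 1.0 1.0
```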
Note that in the delta rule, as well as in the generalized delta rule that
we will consider next, we need to calculate the derivatives of the activation
functions involved. The choice of smooth activation functions is dictated by the
ranges of output variables. For example, if the range of the output of a neuron, for some specific application, is the open interval $(0,1)$, then an appropriate differentiable activation function could be the sigmoid function shown in Figure 5.7. If the output variable has range $(-1,1)$, then an appropriate activation function should have the same range, for example $f(x) = \dfrac{2}{1+e^{-x}} - 1$.

Figure 5.7. $f(x) = \dfrac{1}{1+e^{-x}}$
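Because the delta rule and the generalized delta rule both require the derivatives of the activations, the following sketch (Python with NumPy, not from the text) collects the two activation functions just mentioned together with their derivatives, using the standard identities $f' = f(1-f)$ for the sigmoid and $f' = (1-f^2)/2$ for the bipolar version.

```python
import numpy as np

def sigmoid(x):
    """f(x) = 1 / (1 + e^(-x)); range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    """Derivative of the sigmoid: f'(x) = f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def bipolar_sigmoid(x):
    """f(x) = 2 / (1 + e^(-x)) - 1; range (-1, 1)."""
    return 2.0 / (1.0 + np.exp(-x)) - 1.0

def bipolar_sigmoid_prime(x):
    """Derivative: f'(x) = (1 - f(x)^2) / 2."""
    s = bipolar_sigmoid(x)
    return 0.5 * (1.0 - s * s)
```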
5.5 The backpropagation algorithm
As we have seen from the simple example of implementing the exclusive-or Boolean function XOR, we need to consider multi-layer neural networks in order to approximate general relationships. We therefore need learning algorithms for these more complicated neural networks.
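For concreteness, the following sketch (not from the text; the weights are illustrative and chosen by hand rather than taken from the earlier XOR example) shows one way a two-layer network of threshold units can realize XOR, something no single-layer perceptron can do.

```python
def step(x):
    """Heaviside threshold unit: 1 if the net input is nonnegative, else 0."""
    return 1.0 if x >= 0 else 0.0

def xor_net(x1, x2):
    """Two-layer threshold network computing XOR (illustrative weights)."""
    h1 = step(x1 + x2 - 0.5)    # hidden unit: x1 OR x2
    h2 = step(x1 + x2 - 1.5)    # hidden unit: x1 AND x2
    return step(h1 - h2 - 0.5)  # output: OR but not AND, i.e. XOR

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))  # prints the XOR truth table
```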