The following popular learning algorithm, referred to as the backpropagation algorithm, is a generalization of the delta rule [42]. If we look closely at the delta rule for single-layer neural networks, we realize that to update a weight $w_{ik}$ when the learning input pattern $x^q$ is presented, we need
\[
\Delta w^q_{ik} = -\eta\,\delta^q_i\, x^q_k .
\]
That is, we need to compute the function
\[
\delta^q_i = \left(o^q_i - y^q_i\right) f'\!\left(\sum_{j=0}^{n} w_{ij}\, x^q_j\right),
\]
and this is possible since we have at our disposal the value $y^q_i$, which is known to us as the target output for output node $i$.
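To make the update concrete, the following is a minimal sketch of one delta-rule step for a single-layer network, assuming a logistic activation for $f$ and a bias input $x^q_0 = 1$; the function names and the choice of activation are illustrative assumptions, not taken from the text.

\begin{verbatim}
import numpy as np

def f(net):
    # Logistic activation (an assumed choice of differentiable f)
    return 1.0 / (1.0 + np.exp(-net))

def f_prime(net):
    # Derivative of the logistic activation
    s = f(net)
    return s * (1.0 - s)

def delta_rule_update(W, x_q, y_q, eta=0.1):
    # One delta-rule step for a single-layer network.
    #   W   : (p, n+1) weight matrix; column 0 holds the bias weights
    #   x_q : (n+1,) input pattern with x_q[0] = 1 (bias input)
    #   y_q : (p,) target output pattern for this input
    net = W @ x_q                          # net_i = sum_{j=0..n} w_ij x_j^q
    o_q = f(net)                           # actual outputs o_i^q
    delta = (o_q - y_q) * f_prime(net)     # delta_i^q = (o_i^q - y_i^q) f'(net_i)
    return W - eta * np.outer(delta, x_q)  # w_ik + Delta w_ik^q
\end{verbatim}

Applied repeatedly over the training patterns, this is the single-layer update that the backpropagation algorithm generalizes.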
Consider, for simplicity, the two-layer $n$-$m$-$p$ neural network depicted in Figure 5.8.

Figure 5.8. Two-layer neural network

It seems that we cannot update a weight like $w_{ik}$ on the link connecting the $k$th input node in the input layer to the $i$th hidden neuron in the hidden layer, since we are not given a target pattern for the $i$th hidden neuron when the input $x^q$ is presented. Note that the patterns $y^q$ are target patterns for neurons in the output layer, not for the hidden layer. Thus, when we look at the errors $\left(o^q_j - y^q_j\right)^2$ at the output layer, we cannot detect which hidden neurons are responsible for them.
It turns out that to update these weights, it suffices to be able to compute
∂E
∂oi, the partial derivative of the global errorE with respect to the output
oi,i =1,...,p.Fordifferentiable activation functions, the gradient descent
strategy can still be applied tofind the networkís weight configurationw∗that
minimizes the errorE.
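As a rough illustration of this point, the sketch below evaluates the global error $E = \sum_{j}\left(o^q_j - y^q_j\right)^2$ of the output layer together with its partial derivatives $\partial E/\partial o_i = 2\left(o^q_i - y^q_i\right)$, and verifies them against a finite-difference estimate; the numbers and helper names are assumed for illustration only.

\begin{verbatim}
import numpy as np

def global_error(o, y):
    # E = sum_j (o_j - y_j)^2 over the output layer
    return np.sum((o - y) ** 2)

def dE_do(o, y):
    # Partial derivatives dE/do_i = 2 (o_i - y_i), i = 1, ..., p
    return 2.0 * (o - y)

o = np.array([0.8, 0.3, 0.5])   # actual outputs o_i of the output layer
y = np.array([1.0, 0.0, 0.5])   # target outputs y_i
grad = dE_do(o, y)

# Check each dE/do_i against a central finite difference
eps = 1e-6
for i in range(len(o)):
    o_plus, o_minus = o.copy(), o.copy()
    o_plus[i] += eps
    o_minus[i] -= eps
    numeric = (global_error(o_plus, y) - global_error(o_minus, y)) / (2 * eps)
    assert abs(numeric - grad[i]) < 1e-6
\end{verbatim}

Once $\partial E/\partial o_i$ is available, the chain rule can carry this error information back through the differentiable activation functions to the hidden-layer weights, which is what the backpropagation algorithm exploits.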