Pattern Recognition and Machine Learning

5.4. The Hessian Matrix 255

usual by summing over the contributions from each of the patterns separately. For
the two-layer network, the forward-propagation equations are given by

$$a_j = \sum_i w_{ji} x_i \tag{5.98}$$

$$z_j = h(a_j) \tag{5.99}$$

$$y_k = \sum_j w_{kj} z_j. \tag{5.100}$$
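As a concrete illustration, the forward-propagation equations (5.98)–(5.100) can be sketched in a few lines of NumPy. This assumes $h = \tanh$; the names `W1`, `W2`, `forward` and the layer sizes are illustrative, not part of the text.

```python
import numpy as np

def forward(x, W1, W2):
    """Two-layer forward pass of (5.98)-(5.100), assuming h = tanh."""
    a = W1 @ x          # a_j = sum_i w_ji x_i                 (5.98)
    z = np.tanh(a)      # z_j = h(a_j)                         (5.99)
    y = W2 @ z          # y_k = sum_j w_kj z_j                 (5.100)
    return a, z, y

# Illustrative sizes: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
x = rng.standard_normal(3)
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))
a, z, y = forward(x, W1, W2)
```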

We now act on these equations using the $\mathcal{R}\{\cdot\}$ operator to obtain a set of forward-propagation equations in the form

$$\mathcal{R}\{a_j\} = \sum_i v_{ji} x_i \tag{5.101}$$

$$\mathcal{R}\{z_j\} = h'(a_j)\,\mathcal{R}\{a_j\} \tag{5.102}$$

$$\mathcal{R}\{y_k\} = \sum_j w_{kj}\,\mathcal{R}\{z_j\} + \sum_j v_{kj} z_j \tag{5.103}$$

where $v_{ji}$ is the element of the vector $\mathbf{v}$ that corresponds to the weight $w_{ji}$. Quantities of the form $\mathcal{R}\{z_j\}$, $\mathcal{R}\{a_j\}$ and $\mathcal{R}\{y_k\}$ are to be regarded as new variables whose values are found using the above equations.
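The $\mathcal{R}\{\cdot\}$ forward pass (5.101)–(5.103) can be sketched as follows, again assuming $h = \tanh$ so that $h'(a) = 1 - \tanh^2(a)$. Here `V1`, `V2` hold the elements $v_{ji}$, $v_{kj}$ of the vector $\mathbf{v}$; all names are illustrative.

```python
import numpy as np

def r_forward(x, W1, W2, V1, V2):
    """R-forward pass of (5.101)-(5.103), assuming h = tanh."""
    a = W1 @ x                 # ordinary forward quantities
    z = np.tanh(a)
    Ra = V1 @ x                # R{a_j} = sum_i v_ji x_i                      (5.101)
    Rz = (1.0 - z**2) * Ra     # R{z_j} = h'(a_j) R{a_j}                      (5.102)
    Ry = W2 @ Rz + V2 @ z      # R{y_k} = sum_j w_kj R{z_j} + sum_j v_kj z_j  (5.103)
    return Ra, Rz, Ry

rng = np.random.default_rng(1)
x = rng.standard_normal(3)
W1, V1 = rng.standard_normal((4, 3)), rng.standard_normal((4, 3))
W2, V2 = rng.standard_normal((2, 4)), rng.standard_normal((2, 4))
Ra, Rz, Ry = r_forward(x, W1, W2, V1, V2)
```

Since $\mathcal{R}\{y_k\}$ is the directional derivative of the network output with respect to the weights in the direction $\mathbf{v}$, it can be checked numerically against a finite difference of the ordinary forward pass.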
Because we are considering a sum-of-squares error function, we have the following standard backpropagation expressions:

$$\delta_k = y_k - t_k \tag{5.104}$$

$$\delta_j = h'(a_j) \sum_k w_{kj} \delta_k. \tag{5.105}$$
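These backpropagation expressions can be sketched for the sum-of-squares error, again taking $h = \tanh$; the names `deltas`, `W1`, `W2` are illustrative.

```python
import numpy as np

def deltas(x, t, W1, W2):
    """Standard backpropagation deltas (5.104)-(5.105), sum-of-squares error."""
    a = W1 @ x
    z = np.tanh(a)
    y = W2 @ z
    dk = y - t                        # delta_k = y_k - t_k                  (5.104)
    dj = (1.0 - z**2) * (W2.T @ dk)   # delta_j = h'(a_j) sum_k w_kj delta_k (5.105)
    return dk, dj

rng = np.random.default_rng(2)
x, t = rng.standard_normal(3), rng.standard_normal(2)
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))
dk, dj = deltas(x, t, W1, W2)
```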

Again, we act on these equations with the $\mathcal{R}\{\cdot\}$ operator to obtain a set of backpropagation equations in the form

$$\mathcal{R}\{\delta_k\} = \mathcal{R}\{y_k\} \tag{5.106}$$

$$\mathcal{R}\{\delta_j\} = h''(a_j)\,\mathcal{R}\{a_j\} \sum_k w_{kj}\delta_k + h'(a_j) \sum_k v_{kj}\delta_k + h'(a_j) \sum_k w_{kj}\,\mathcal{R}\{\delta_k\}. \tag{5.107}$$
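The $\mathcal{R}\{\cdot\}$ backward pass (5.106)–(5.107) can be sketched in the same style. With $h = \tanh$ we have $h'(a) = 1 - z^2$ and $h''(a) = -2z(1 - z^2)$ where $z = \tanh(a)$; the names `W1`, `W2` (weights) and `V1`, `V2` (elements of $\mathbf{v}$) are illustrative.

```python
import numpy as np

def r_deltas(x, t, W1, W2, V1, V2):
    """R-backpropagation (5.106)-(5.107), sum-of-squares error, h = tanh."""
    a = W1 @ x
    z = np.tanh(a)
    dk = W2 @ z - t                    # delta_k = y_k - t_k           (5.104)
    hp = 1.0 - z**2                    # h'(a_j)
    hpp = -2.0 * z * hp                # h''(a_j)
    Ra = V1 @ x                        # R{a_j}                        (5.101)
    Rz = hp * Ra                       # R{z_j}                        (5.102)
    Rdk = W2 @ Rz + V2 @ z             # R{delta_k} = R{y_k}           (5.106)
    Rdj = (hpp * Ra) * (W2.T @ dk) \
        + hp * (V2.T @ dk) \
        + hp * (W2.T @ Rdk)            # the three terms of            (5.107)
    return Rdk, Rdj

rng = np.random.default_rng(3)
x, t = rng.standard_normal(3), rng.standard_normal(2)
W1, V1 = rng.standard_normal((4, 3)), rng.standard_normal((4, 3))
W2, V2 = rng.standard_normal((2, 4)), rng.standard_normal((2, 4))
Rdk, Rdj = r_deltas(x, t, W1, W2, V1, V2)
```

Because $\mathcal{R}\{\delta_j\}$ is the directional derivative of $\delta_j$ with respect to the weights in the direction $\mathbf{v}$, a finite-difference perturbation of the ordinary deltas provides a numerical check.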

Finally, we have the usual equations for the first derivatives of the error

$$\frac{\partial E}{\partial w_{kj}} = \delta_k z_j \tag{5.108}$$

$$\frac{\partial E}{\partial w_{ji}} = \delta_j x_i \tag{5.109}$$
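Putting (5.104)–(5.109) together, the first derivatives are simply outer products of the deltas with the inputs to each layer. A minimal sketch, again assuming $h = \tanh$ and using illustrative names:

```python
import numpy as np

def error_gradients(x, t, W1, W2):
    """First derivatives (5.108)-(5.109) of the sum-of-squares error."""
    z = np.tanh(W1 @ x)
    dk = W2 @ z - t                       # (5.104)
    dj = (1.0 - z**2) * (W2.T @ dk)       # (5.105)
    dE_dW2 = np.outer(dk, z)              # dE/dw_kj = delta_k z_j  (5.108)
    dE_dW1 = np.outer(dj, x)              # dE/dw_ji = delta_j x_i  (5.109)
    return dE_dW1, dE_dW2

rng = np.random.default_rng(4)
x, t = rng.standard_normal(3), rng.standard_normal(2)
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))
g1, g2 = error_gradients(x, t, W1, W2)
```

Both gradient matrices can be validated entry by entry against a finite difference of $E = \frac{1}{2}\sum_k (y_k - t_k)^2$.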