
usual by summing over the contributions from each of the patterns separately. For
the two-layer network, the forward-propagation equations are given by


$$a_j = \sum_i w_{ji} x_i \tag{5.98}$$
$$z_j = h(a_j) \tag{5.99}$$
$$y_k = \sum_j w_{kj} z_j \tag{5.100}$$
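As an illustration, this forward pass can be written in a few lines of NumPy. The following is a minimal sketch for a single input pattern, assuming tanh hidden units, linear outputs and no bias parameters; the names x, W1 (elements $w_{ji}$) and W2 (elements $w_{kj}$) are introduced here purely for illustration.

```python
import numpy as np

def forward(x, W1, W2):
    """Forward propagation (5.98)-(5.100) for a single input pattern."""
    a = W1 @ x          # a_j = sum_i w_ji x_i             (5.98)
    z = np.tanh(a)      # z_j = h(a_j), assuming h = tanh  (5.99)
    y = W2 @ z          # y_k = sum_j w_kj z_j             (5.100)
    return a, z, y
```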

We now act on these equations using the $\mathcal{R}\{\cdot\}$ operator to obtain a set of forward-propagation equations in the form


$$\mathcal{R}\{a_j\} = \sum_i v_{ji} x_i \tag{5.101}$$
$$\mathcal{R}\{z_j\} = h'(a_j)\,\mathcal{R}\{a_j\} \tag{5.102}$$
$$\mathcal{R}\{y_k\} = \sum_j w_{kj}\,\mathcal{R}\{z_j\} + \sum_j v_{kj} z_j \tag{5.103}$$

where $v_{ji}$ is the element of the vector $\mathbf{v}$ that corresponds to the weight $w_{ji}$. Quantities of the form $\mathcal{R}\{z_j\}$, $\mathcal{R}\{a_j\}$ and $\mathcal{R}\{y_k\}$ are to be regarded as new variables whose values are found using the above equations.
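Continuing the same sketch, the $\mathcal{R}\{\cdot\}$ forward pass follows the same pattern. Here V1 and V2 denote the components of $\mathbf{v}$ corresponding to W1 and W2 (these matrix names are assumptions of the sketch), and $h'(a) = 1 - \tanh^2(a)$ under the tanh assumption.

```python
def r_forward(x, a, z, W2, V1, V2):
    """R{.} forward propagation (5.101)-(5.103)."""
    Ra = V1 @ x                        # R{a_j} = sum_i v_ji x_i
    Rz = (1.0 - np.tanh(a)**2) * Ra    # R{z_j} = h'(a_j) R{a_j}
    Ry = W2 @ Rz + V2 @ z              # R{y_k} = sum_j w_kj R{z_j} + sum_j v_kj z_j
    return Ra, Rz, Ry
```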
Because we are considering a sum-of-squares error function, we have the fol-
lowing standard backpropagation expressions:


$$\delta_k = y_k - t_k \tag{5.104}$$
$$\delta_j = h'(a_j) \sum_k w_{kj} \delta_k \tag{5.105}$$
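In the same illustrative sketch, these deltas for a single pattern with target vector t become:

```python
def backprop(a, y, t, W2):
    """Standard backpropagation deltas (5.104)-(5.105), sum-of-squares error."""
    delta_k = y - t                         # delta_k = y_k - t_k
    h_prime = 1.0 - np.tanh(a)**2           # h'(a_j), assuming h = tanh
    delta_j = h_prime * (W2.T @ delta_k)    # delta_j = h'(a_j) sum_k w_kj delta_k
    return delta_k, delta_j
```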

Again, we act on these equations with the $\mathcal{R}\{\cdot\}$ operator to obtain a set of backpropagation equations in the form


$$\mathcal{R}\{\delta_k\} = \mathcal{R}\{y_k\} \tag{5.106}$$
$$\mathcal{R}\{\delta_j\} = h''(a_j)\,\mathcal{R}\{a_j\} \sum_k w_{kj}\delta_k + h'(a_j) \sum_k v_{kj}\delta_k + h'(a_j) \sum_k w_{kj}\,\mathcal{R}\{\delta_k\} \tag{5.107}$$
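A corresponding sketch of the $\mathcal{R}\{\cdot\}$ backward pass is shown below; for the assumed $h = \tanh$ we have $h''(a) = -2\tanh(a)\,(1 - \tanh^2(a))$.

```python
def r_backprop(a, Ra, Ry, delta_k, W2, V2):
    """R{.} backpropagation (5.106)-(5.107)."""
    Rdelta_k = Ry                               # R{delta_k} = R{y_k}
    h_p = 1.0 - np.tanh(a)**2                   # h'(a_j)
    h_pp = -2.0 * np.tanh(a) * h_p              # h''(a_j), assuming h = tanh
    Rdelta_j = (h_pp * Ra * (W2.T @ delta_k)    # h''(a_j) R{a_j} sum_k w_kj delta_k
                + h_p * (V2.T @ delta_k)        #  + h'(a_j) sum_k v_kj delta_k
                + h_p * (W2.T @ Rdelta_k))      #  + h'(a_j) sum_k w_kj R{delta_k}
    return Rdelta_k, Rdelta_j
```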

Finally, we have the usual equations for the first derivatives of the error


$$\frac{\partial E}{\partial w_{kj}} = \delta_k z_j \tag{5.108}$$
$$\frac{\partial E}{\partial w_{ji}} = \delta_j x_i \tag{5.109}$$
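In the sketch these first derivatives are simply outer products of the deltas with the corresponding layer inputs:

```python
def first_derivatives(x, z, delta_k, delta_j):
    """First derivatives of the error (5.108)-(5.109)."""
    dE_dW2 = np.outer(delta_k, z)   # dE/dw_kj = delta_k z_j
    dE_dW1 = np.outer(delta_j, x)   # dE/dw_ji = delta_j x_i
    return dE_dW1, dE_dW2
```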