So far so good. But all this assumes that there is no hidden layer. With a hidden layer, things get a little trickier. Suppose f(x_i) is the output of the ith hidden unit, w_ij is the weight of the connection from input j to the ith hidden unit, and w_i is the weight of the connection from the ith hidden unit to the output unit. The situation is depicted in Figure 6.13. As before, f(x) is the output of the single unit in the output layer. The update rule for the weights w_i is essentially the same as above, except that a_i is replaced by the output of the ith hidden unit:

\frac{dE}{dw_i} = -(y - f(x))\, f'(x)\, f(x_i).
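To make the formula concrete, here is a minimal numerical sketch (not from the text): it assumes a sigmoid activation f(t) = 1/(1 + e^{-t}) and the squared-error loss E = (1/2)(y - f(x))^2, builds a tiny network with two inputs and two hidden units, computes dE/dw_i from the formula above, and checks it against a finite-difference estimate. All weights and values are arbitrary illustrations.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def d_sigmoid(t):
    s = sigmoid(t)
    return s * (1.0 - s)

# Tiny illustrative network: 2 inputs, 2 hidden units, 1 output unit
# (no bias weights, to stay close to the formulas in the text).
a = [0.5, -1.0]                       # inputs a_j
w_hidden = [[0.1, 0.4], [-0.3, 0.2]]  # w_ij: weight from input j to hidden unit i
w_out = [0.7, -0.5]                   # w_i: weight from hidden unit i to the output unit
y = 1.0                               # target value

def forward(w_out):
    # f(x_i): output of hidden unit i, computed from its weighted input x_i
    f_hidden = [sigmoid(sum(w_hidden[i][j] * a[j] for j in range(len(a))))
                for i in range(len(w_hidden))]
    x = sum(w_out[i] * f_hidden[i] for i in range(len(w_out)))  # input to the output unit
    return x, f_hidden

def error(w_out):
    x, _ = forward(w_out)
    return 0.5 * (y - sigmoid(x)) ** 2   # E = 1/2 (y - f(x))^2

# Analytic gradient from the formula above: dE/dw_i = -(y - f(x)) f'(x) f(x_i)
x, f_hidden = forward(w_out)
i = 0
grad_analytic = -(y - sigmoid(x)) * d_sigmoid(x) * f_hidden[i]

# Central finite-difference check on the same weight
eps = 1e-6
w_plus, w_minus = list(w_out), list(w_out)
w_plus[i] += eps
w_minus[i] -= eps
grad_numeric = (error(w_plus) - error(w_minus)) / (2 * eps)

print(grad_analytic, grad_numeric)   # the two estimates should agree closely
```

The finite-difference comparison is only a sanity check; in practice the analytic gradient is used directly in the weight update.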
However, to update the weights w_ij the corresponding derivatives must be calculated. Applying the chain rule gives

\frac{dE}{dw_{ij}} = \frac{dE}{dx}\,\frac{dx}{dw_{ij}} = -(y - f(x))\, f'(x)\, \frac{dx}{dw_{ij}}.

The first two factors are the same as in the previous equation. To compute the third factor, differentiate further. Because
Figure 6.13  Multilayer perceptron with a hidden layer.
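The chain-rule factorization above can also be checked numerically before the third factor is expanded. The sketch below is again an illustration with assumed values rather than anything from the text; it reuses the same assumed sigmoid activation and squared-error loss, estimates dx/dw_ij by finite differences, multiplies it by the first two factors, and compares the product with a direct finite-difference estimate of dE/dw_ij.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def d_sigmoid(t):
    s = sigmoid(t)
    return s * (1.0 - s)

a = [0.5, -1.0]                       # inputs a_j
w_hidden = [[0.1, 0.4], [-0.3, 0.2]]  # w_ij: weight from input j to hidden unit i
w_out = [0.7, -0.5]                   # w_i: weight from hidden unit i to the output unit
y = 1.0
eps = 1e-6

def net_input(w_hidden):
    """x, the weighted input reaching the output unit, as a function of the w_ij."""
    f_hidden = [sigmoid(sum(w_hidden[i][j] * a[j] for j in range(len(a))))
                for i in range(len(w_hidden))]
    return sum(w_out[i] * f_hidden[i] for i in range(len(w_out)))

def error(w_hidden):
    return 0.5 * (y - sigmoid(net_input(w_hidden))) ** 2

i, j = 1, 0   # differentiate with respect to this particular w_ij

def perturbed(delta):
    w = [row[:] for row in w_hidden]   # copy the weights before perturbing one of them
    w[i][j] += delta
    return w

# Third factor dx/dw_ij, estimated numerically (the text goes on to derive
# its closed form by differentiating further).
dx_dwij = (net_input(perturbed(eps)) - net_input(perturbed(-eps))) / (2 * eps)

# Chain-rule product -(y - f(x)) f'(x) dx/dw_ij versus a direct estimate of dE/dw_ij
x = net_input(w_hidden)
grad_chain = -(y - sigmoid(x)) * d_sigmoid(x) * dx_dwij
grad_direct = (error(perturbed(eps)) - error(perturbed(-eps))) / (2 * eps)

print(grad_chain, grad_direct)   # should agree closely
```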