176 CHAPTER 5. NEURAL NETWORKS FOR CONTROL
Figure 5.5. Network with weights $w_{ij}$
The goal is to minimize $E$ with respect to the weights $w_{ij}$ of the network.
Let $w_{ij}$ denote the weight from node $j$ to the output neuron $i$. To use the
gradient descent method for optimization, we need $E$ to be differentiable with
respect to the $w_{ij}$'s. This boils down to requiring
$$o_i^q = f_i\left(\sum_{j=0}^{n} w_{ij} x_j^q\right)$$
to be differentiable. That is, the activation function $f_i$ of the $i$th neuron should
be chosen to be a differentiable function. Note that step functions such as
$$f(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}$$
are not differentiable.
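
As an illustration only, the following minimal Python sketch evaluates the output of a single neuron, $o_i^q = f_i\left(\sum_j w_{ij} x_j^q\right)$, with the step function and with the sigmoid $f(x) = 1/(1 + e^{-x})$, and then compares symmetric difference quotients of the two activations at $x = 0$. The weight and input values and the helper names (`neuron_output`, `central_difference`) are hypothetical, chosen for the example and not taken from the text.

```python
import math


def step(x):
    """Step activation: 1 if x >= 0, else 0."""
    return 1.0 if x >= 0 else 0.0


def sigmoid(x):
    """Sigmoid activation f(x) = 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(f, weights, inputs):
    """Output o_i = f(sum_j w_ij * x_j) of one neuron for one input pattern."""
    return f(sum(w * x for w, x in zip(weights, inputs)))


def central_difference(f, x, h):
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Hypothetical weights w_i0..w_i2 and inputs x_0..x_2 (x_0 = 1 acts as a bias input).
w = [0.5, -1.0, 2.0]
x = [1.0, 0.3, 0.1]

o_step = neuron_output(step, w, x)
o_sigmoid = neuron_output(sigmoid, w, x)

# Difference quotients at x = 0 for shrinking step sizes h.
hs = (1e-1, 1e-2, 1e-3)
step_slopes = [central_difference(step, 0.0, h) for h in hs]
sigmoid_slopes = [central_difference(sigmoid, 0.0, h) for h in hs]

print("step output:", o_step, "sigmoid output:", o_sigmoid)
print("step difference quotients at 0:", step_slopes)
print("sigmoid difference quotients at 0:", sigmoid_slopes)
```

For the step function the quotient at $0$ equals $1/(2h)$, so it grows without bound as $h$ shrinks and no derivative exists there. For the sigmoid the quotients settle near $0.25$, which is the value of its derivative at $0$.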
The sigmoid activation function, shown in Figure 5.6, is differentiable. Note
Figure 5.6. $f(x) = \frac{1}{1 + e^{-x}}$, plotted for $x$ from $-10$ to $10$