Therefore,
$$
\Delta w_{jk} = \sum_{q=1}^{N} \Delta_q w_{jk} = -\eta \sum_{q=1}^{N} \frac{\partial E_q}{\partial w_{jk}} = \sum_{q=1}^{N} -\eta\, \delta_{qj}\, x_{qk}
$$
The delta rule leads to the search for a weight vector $w^*$ such that the gradient of $E$ at $w^*$ is zero. For some simple single-layer neural networks, $w^*$ is the unique absolute minimum of $E(w)$.
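To make the batch update concrete, here is a minimal Python sketch of the delta rule for a single linear unit, assuming the common convention $E_q = \frac{1}{2}(y^q - o^q)^2$, so that $\Delta_q w_k = \eta\,(y^q - o^q)\,x^q_k$; the function name and parameters are illustrative, not from the text.

```python
import numpy as np

def batch_delta_rule(X, y, eta=0.1, epochs=100):
    """Batch delta rule for one linear unit, assuming E_q = 0.5*(y^q - o^q)^2.

    Each epoch accumulates the per-pattern updates
    Delta_q w_k = eta * (y^q - o^q) * x^q_k over all q before applying them.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        o = X @ w                 # linear outputs o^q for every pattern
        w += eta * X.T @ (y - o)  # summed update: -eta * dE/dw
    return w
```

For a small enough learning rate $\eta$, the iterates descend the quadratic error surface toward the minimizer $w^*$.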
Example 5.3 In this example, we implement the logical function AND by minimizing the error $E$. Consider the training set $T$ consisting of binary-inputs/bipolar-targets:
$$
T = \{ (x^q, y^q),\; q = 1, 2, 3, 4 \}
$$
with $x^q \in \{0,1\}^2$ and $y^q \in \{-1,1\}$. Specifically,
$$
\begin{aligned}
x^1 &= \left( x^1_1, x^1_2 \right) = (1,1) & y^1 &= 1 \\
x^2 &= \left( x^2_1, x^2_2 \right) = (1,0) & y^2 &= -1 \\
x^3 &= \left( x^3_1, x^3_2 \right) = (0,1) & y^3 &= -1 \\
x^4 &= \left( x^4_1, x^4_2 \right) = (0,0) & y^4 &= -1
\end{aligned}
$$
An appropriate neural network architecture for this problem is a single-layer network with inputs $x_1, x_2$, weights $w_1, w_2$, bias $w_0$, and linear activation function $f(x) = x$.
We have
$$
E = \sum_{q=1}^{4} \left( x^q_1 w_1 + x^q_2 w_2 - w_0 - y^q \right)^2
$$
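Substituting the four training pairs gives the explicit sum of squares
$$
E = (w_1 + w_2 - w_0 - 1)^2 + (w_1 - w_0 + 1)^2 + (w_2 - w_0 + 1)^2 + (w_0 - 1)^2.
$$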
The weight vector $(w_0^*, w_1^*, w_2^*)$ that minimizes $E$ is the solution of the system of equations
$$
\frac{\partial E}{\partial w_j} = 0, \qquad j = 0, 1, 2
$$
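Because the network output is linear in the weights, this system is just the set of least-squares normal equations, so it can be solved directly. The following NumPy sketch (array names are illustrative) builds the design matrix, with a $-1$ column for the bias $w_0$, and solves for $(w_1, w_2, w_0)$:

```python
import numpy as np

# Training set of Example 5.3: binary inputs, bipolar targets.
X = np.array([[1, 1],
              [1, 0],
              [0, 1],
              [0, 0]], dtype=float)
y = np.array([1, -1, -1, -1], dtype=float)

# The network output is x1*w1 + x2*w2 - w0, so append a -1 column for
# the bias; the unknown vector is then (w1, w2, w0).
A = np.hstack([X, -np.ones((4, 1))])

# Solving dE/dw_j = 0 for j = 0, 1, 2 is equivalent to the least-squares
# problem min ||A w - y||^2, handled here by lstsq.
w, *_ = np.linalg.lstsq(A, y, rcond=None)
w1, w2, w0 = w
print(f"w1 = {w1:.3f}, w2 = {w2:.3f}, w0 = {w0:.3f}")
print("outputs:", A @ w)  # signs should match the bipolar targets
```

For this training set the minimizer is $w_1^* = w_2^* = 1$, $w_0^* = \tfrac{3}{2}$; the resulting output $x_1 + x_2 - \tfrac{3}{2}$ is positive only for the input $(1,1)$, so thresholding its sign realizes AND.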