A First Course in FUZZY and NEURAL CONTROL

192 CHAPTER 5. NEURAL NETWORKS FOR CONTROL

5.8 Practical issues in training


There is no prescribed methodology that predetermines a neural network archi-
tecture for a given problem. Some trial and error is required to determine a
sufficiently suitable model. The following is a list of several aspects to keep in
mind when selecting an appropriate neural network structure.



  1. More than one hidden layer may be beneficial for some applications, but
    in general, one hidden layer is sufficient.

  2. The learning rate η should be chosen in the open interval (0, 1): a
     large η might result in more rapid convergence, but a small η avoids
     overshooting the solution.

  3. Training a neural network means creating a general model of an input-
     output relationship from samples. This model can then be applied to new
     data sets for the problem; that is, it can generalize to new data.

  4. Overfitting means poor generalization to new data. This happens when
     the number of parameters (weights) is greater than the number of
     constraints (training samples).

  5. There are no general conclusions about how many neurons should be in-
    cluded in the hidden layer.

  6. The choice of initial weights will influence whether the neural network
     reaches a global or a local minimum of the error E, and if so, how
     quickly it converges (a property of the gradient descent method). The
     update of the weights w_ik depends on both f'_k of the upper layer and
     the output of the neuron i in the lower layer. For this reason, it is
     important to avoid choices of the initial weights that would make it
     likely that either of these quantities is zero.
     Initial weights must not be too large, or the initial input signals
     will be likely to fall into the region where the derivative of the
     activation function has a very small value (the saturation region). On
     the other hand, if the initial weights are too small, the net input to
     a hidden neuron or output neuron will be close to zero, which also
     causes extremely slow learning. As a common procedure, the initial
     weights are chosen at random, either between −1 and 1 or in some other
     appropriate interval.


  7. How long do we need to train a neural network? One could divide a
     training set into two disjoint subsets: I and II. Use I to train the
     neural network and use II for testing. During the training, one could
     compute the errors from II. If these errors decrease, then continue
     the training. If they increase, then stop the training, because the
     neural network is starting to memorize the set I too specifically and
     consequently is losing its ability to generalize.
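Points 2 and 6 above can be sketched in a few lines. The fragment below is a hypothetical illustration (the layer sizes, input vector, and error term are placeholders, not from the text): weights are drawn at random in (−1, 1), and one gradient-descent step scales the weight update w_ik by a learning rate η chosen in (0, 1).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes, for illustration only.
n_in, n_hidden = 3, 5

# Point 6: initialize weights at random in (-1, 1) -- large enough that
# net inputs are not all near zero, small enough that the sigmoid is not
# driven into its saturation region, where f'(z) is very small.
W = rng.uniform(-1.0, 1.0, size=(n_hidden, n_in))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Point 2: a learning rate eta chosen in the open interval (0, 1).
eta = 0.1

# One gradient-descent step on a single (made-up) sample. The update of
# w_ik involves both f'(z_k) = f(z_k) * (1 - f(z_k)) of the upper layer
# and the output x_i of the lower layer, so neither should start at zero.
x = np.array([0.5, -0.2, 0.8])
z = W @ x
y = sigmoid(z)
delta = (np.ones(n_hidden) - y) * y * (1.0 - y)  # error term times f'(z)
W += eta * np.outer(delta, x)                    # w_ik update rule
```

With η = 0.1 the step changes each weight only slightly, which is the trade-off the text describes: faster convergence for large η versus less risk of overshooting for small η.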
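The stopping rule of point 7 can also be sketched concretely. Everything below is a hypothetical setup (the regression data, network size, and 5% tolerance are assumptions for illustration): subset I is used for the weight updates, subset II only for monitoring the error, and training stops once the subset-II error starts to rise.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1-D regression data, split into two disjoint subsets:
# I (training) and II (testing), as described in point 7.
x = rng.uniform(-1, 1, size=(200, 1))
t = np.sin(3 * x) + 0.1 * rng.normal(size=(200, 1))
x_I, t_I = x[:150], t[:150]      # subset I: used for weight updates
x_II, t_II = x[150:], t[150:]    # subset II: used only to monitor error

# A tiny one-hidden-layer network trained by gradient descent.
n_hidden, eta = 10, 0.05
W1 = rng.uniform(-1, 1, size=(1, n_hidden))
W2 = rng.uniform(-1, 1, size=(n_hidden, 1))

def forward(x, W1, W2):
    h = np.tanh(x @ W1)
    return h, h @ W2

best_err, best = np.inf, (W1.copy(), W2.copy())
for epoch in range(2000):
    # One full-batch gradient step on subset I.
    h, y = forward(x_I, W1, W2)
    e = y - t_I
    W2 -= eta * h.T @ e / len(x_I)
    W1 -= eta * x_I.T @ ((e @ W2.T) * (1 - h**2)) / len(x_I)

    # Compute the error on subset II after each epoch.
    _, y_II = forward(x_II, W1, W2)
    err_II = np.mean((y_II - t_II) ** 2)
    if err_II < best_err:
        best_err, best = err_II, (W1.copy(), W2.copy())
    elif err_II > 1.05 * best_err:   # error on II rising: stop training
        break

W1, W2 = best  # keep the weights that generalized best to subset II
```

Keeping the weights from the epoch with the lowest subset-II error is what prevents the network from memorizing set I too specifically.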