10.4 LEARNING ALGORITHMS 413
there are predefined values that can be used instead of integers: i is the number
of attributes, o the number of class values, a the average of the two, and t their
sum. The default, a, was used to generate Figure 10.20(a).
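As a minimal sketch of how these letters might be resolved into layer sizes, the function below maps each one to a number of hidden units. The function name and the rounding of the average down to an integer are assumptions made for illustration; they are not taken from Weka's source.

```python
def hidden_layer_size(spec, num_attributes, num_classes):
    """Resolve one hidden-layer entry: an integer string or one of i, o, a, t.

    Hypothetical helper for illustration only.
    """
    wildcards = {
        "i": num_attributes,                       # number of attributes
        "o": num_classes,                          # number of class values
        "a": (num_attributes + num_classes) // 2,  # average (assumed rounded down)
        "t": num_attributes + num_classes,         # sum of the two
    }
    if spec in wildcards:
        return wildcards[spec]
    return int(spec)  # a plain integer is used as given


# Example: 4 attributes and 3 class values
sizes = [hidden_layer_size(s, 4, 3) for s in ("i", "o", "a", "t", "5")]
print(sizes)  # [4, 3, 3, 7, 5]
```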
The parameters learningRate and momentum set values for these variables,
which can be overridden in the graphical interface. A decay parameter causes
the learning rate to decrease with time: it divides the starting value by the epoch
number to obtain the current rate. This sometimes improves performance and
may stop the network from diverging. The reset parameter automatically resets
the network with a lower learning rate and begins training again if it is diverging
from the answer (this option is only available if the graphical user interface
is not used).
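The decay rule described above, in which the starting rate is divided by the epoch number, can be sketched as follows. The function name and the starting value of 0.3 are illustrative assumptions, and epochs are assumed to be counted from 1.

```python
def decayed_rate(initial_rate, epoch):
    """Learning rate for a given epoch under the decay rule.

    Hypothetical helper; epochs are assumed to start at 1.
    """
    if epoch < 1:
        raise ValueError("epoch numbers are assumed to start at 1")
    return initial_rate / epoch


# The rate shrinks quickly at first and then ever more slowly.
rates = [decayed_rate(0.3, epoch) for epoch in range(1, 6)]
for epoch, rate in enumerate(rates, start=1):
    print(f"epoch {epoch}: rate {rate:.3f}")
```

Because the rate falls as 1/epoch, most of the reduction happens in the first few epochs, which is one reason decay can help a network that would otherwise diverge early in training.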
The trainingTime parameter sets the number of training epochs. Alternatively,
a percentage of the data can be set aside for validation (using validationSetSize):
then training continues until performance on the validation set starts
to deteriorate consistently, or until the specified number of epochs is reached.
If the percentage is set to zero, no validation set is used. The validationThreshold
parameter determines how many consecutive times the validation set error
can deteriorate before training is stopped.
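The stopping rule can be illustrated with a short sketch that works on a precomputed list of validation errors, one per epoch. The function and argument names are hypothetical, and the test used here (stop once the error has risen in the given number of consecutive epochs) is an assumption about the exact comparison; Weka's implementation may differ in detail.

```python
def epochs_until_stop(validation_errors, training_time, validation_threshold):
    """Return the number of epochs run before training stops.

    validation_errors    -- validation error after each epoch (illustrative data)
    training_time        -- maximum number of epochs
    validation_threshold -- consecutive deteriorations tolerated (assumed rule)
    """
    worse_in_a_row = 0
    previous = float("inf")
    for epoch, error in enumerate(validation_errors[:training_time], start=1):
        if error > previous:
            worse_in_a_row += 1
            if worse_in_a_row >= validation_threshold:
                return epoch  # stopped early on the validation set
        else:
            worse_in_a_row = 0  # any improvement resets the count
        previous = error
    return min(len(validation_errors), training_time)  # epoch limit reached


errors = [0.50, 0.40, 0.45, 0.43, 0.44, 0.46, 0.47, 0.30]
print(epochs_until_stop(errors, training_time=100, validation_threshold=3))  # 7
print(epochs_until_stop(errors, training_time=5, validation_threshold=3))    # 5
```

In the first call the single rise at epoch 3 is forgiven because the error falls again at epoch 4; training stops only after three rises in a row. In the second call the epoch limit is reached first.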
The nominalToBinaryFilter filter is specified by default in the MultilayerPerceptron
object editor; turning it off may improve performance on data in which
the nominal attributes are really ordinal. The attributes can be normalized (with
normalizeAttributes), and a numeric class can be normalized too (with
normalizeNumericClass): both may improve performance.
Lazy classifiers
Lazy learners store the training instances and do no real work until classification
time. IB1 is a basic instance-based learner (Section 4.7) which finds the
training instance closest in Euclidean distance to the given test instance and predicts
the same class as this training instance. If several instances qualify as the
closest, the first one found is used. IBk is a k-nearest-neighbor classifier that uses
the same distance metric. The number of nearest neighbors (default k = 1) can
be specified explicitly in the object editor or determined automatically using
leave-one-out cross-validation, subject to an upper limit given by the specified
value. Predictions from more than one neighbor can be weighted according to
their distance from the test instance, and two different formulas are implemented
for converting the distance into a weight. The number of training
instances kept by the classifier can be restricted by setting the window size
option. As new training instances are added, the oldest ones are removed to
maintain the number of training instances at this size. KStar is a nearest-neighbor
method with a generalized distance function based on transformations
(Section 6.4, pages 241–242).
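The nearest-neighbor ideas above (Euclidean distance, a choice of k, distance weighting, and a window that discards the oldest instances) can be sketched in a few lines. This is not Weka's code: the function names are hypothetical, and the two weighting formulas shown, 1/distance and 1 - distance, are assumed here to be the two the text alludes to. The second only makes sense when distances are scaled to lie between 0 and 1.

```python
import math
from collections import deque


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(training, query, k=1, weighting=None):
    """Predict a class from the k nearest stored instances.

    training  -- iterable of (features, label) pairs
    weighting -- None, "inverse" (1/d) or "similarity" (1 - d); assumed formulas
    """
    # sorted() is stable, so among equally close instances the first stored wins.
    neighbors = sorted(training, key=lambda inst: euclidean(inst[0], query))[:k]
    votes = {}
    for features, label in neighbors:
        d = euclidean(features, query)
        if weighting == "inverse":
            weight = 1.0 / (d + 1e-9)  # small constant avoids division by zero
        elif weighting == "similarity":
            weight = 1.0 - d           # assumes distances scaled to [0, 1]
        else:
            weight = 1.0               # plain majority vote
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)


# A window of size 4: adding a fifth instance drops the oldest one.
window = deque(maxlen=4)
for instance in [((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((1.0, 1.0), "b"),
                 ((0.9, 1.0), "b"), ((0.8, 0.9), "b")]:
    window.append(instance)

query = (0.2, 0.1)
print(knn_predict(window, query, k=1))                       # a
print(knn_predict(window, query, k=3))                       # b
print(knn_predict(window, query, k=3, weighting="inverse"))  # a
```

With k = 3 the single close "a" instance is outvoted by two distant "b" instances, whereas inverse-distance weighting lets the much closer neighbor dominate.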