Data Mining: Practical Machine Learning Tools and Techniques, Second Edition



There is a supervised version of the NominalToBinary filter that transforms
all multivalued nominal attributes to binary ones. In this version, the
transformation depends on whether the class is nominal or numeric. If nominal, the
same method as before is used: an attribute with k values is transformed into k
binary attributes. If the class is numeric, however, the method described in
Section 6.5 (page 246) is applied. In either case the class itself is not transformed.
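
The nominal-class case can be illustrated with a minimal sketch. It is not Weka's implementation: the function name nominal_to_binary and the sample attribute values are assumptions made for illustration, and the numeric-class method of Section 6.5 is not shown.

```python
def nominal_to_binary(values, categories):
    """Turn one k-valued nominal attribute into k 0/1 indicator attributes.

    `categories` lists the k possible values in the order the new
    binary attributes should appear (a hypothetical helper, not Weka code).
    """
    return [[1 if value == category else 0 for category in categories]
            for value in values]


# Illustrative data: a three-valued attribute becomes three binary ones.
outlook = ["sunny", "overcast", "rainy", "sunny"]
encoded = nominal_to_binary(outlook, ["sunny", "overcast", "rainy"])
```

Each instance ends up with exactly one of the k binary attributes set to 1, and the class column is left out of the transformation entirely.
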
ClassOrder changes the ordering of the class values. The user determines
whether the new ordering is random or in ascending or descending order of
class frequency. This filter must not be used with the FilteredClassifier
metalearning scheme! AttributeSelection can be used for automatic attribute
selection and provides the same functionality as the Explorer’s Select attributes panel
(described later).
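
The three orderings that ClassOrder offers can be sketched as follows. This is an illustration under stated assumptions, not Weka's code: the function name order_class_values, the mode strings, the seed parameter, and the alphabetical tie-breaking rule are all hypothetical.

```python
import random
from collections import Counter


def order_class_values(labels, mode="ascending", seed=1):
    """Return the distinct class values in a new order.

    mode is "ascending" or "descending" (by class frequency) or "random".
    Ties are broken alphabetically, which is an assumption of this sketch.
    """
    counts = Counter(labels)
    values = sorted(counts)  # deterministic starting order
    if mode == "random":
        random.Random(seed).shuffle(values)
    elif mode == "ascending":
        values.sort(key=lambda v: counts[v])
    elif mode == "descending":
        values.sort(key=lambda v: -counts[v])
    else:
        raise ValueError("unknown mode: " + mode)
    return values


# Illustrative two-class data: 9 "yes" instances and 5 "no" instances.
labels = ["yes"] * 9 + ["no"] * 5
rarest_first = order_class_values(labels, "ascending")
commonest_first = order_class_values(labels, "descending")
```

Only the order of the class values changes; the instances and their labels are untouched.
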

Supervised instance filters
There are three supervised instance filters. Resample is like the eponymous
unsupervised instance filter except that it maintains the class distribution in the
subsample. Alternatively, it can be configured to bias the class distribution
towards a uniform one. SpreadSubsample also produces a random subsample,
but the frequency difference between the rarest and the most common class can
be controlled; for example, you can specify at most a 2:1 difference in class
frequencies. Like the unsupervised instance filter RemoveFolds,
StratifiedRemoveFolds outputs a specified cross-validation fold for the dataset, except
that this time the fold is stratified.
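
The idea of capping the spread between class frequencies, as SpreadSubsample does, can be sketched like this. It is a simplified illustration rather than Weka's algorithm: the function name spread_subsample, the max_spread and seed parameters, and the rule of capping every class at max_spread times the size of the rarest class are assumptions.

```python
import random
from collections import Counter, defaultdict


def spread_subsample(instances, labels, max_spread=2.0, seed=1):
    """Randomly subsample so that no class has more than max_spread
    times as many instances as the rarest class (hypothetical helper)."""
    by_class = defaultdict(list)
    for instance, label in zip(instances, labels):
        by_class[label].append(instance)

    rarest = min(len(members) for members in by_class.values())
    cap = int(rarest * max_spread)  # largest class size allowed

    rng = random.Random(seed)
    sample = []
    for label, members in by_class.items():
        # Keep small classes whole; draw without replacement from large ones.
        chosen = members if len(members) <= cap else rng.sample(members, cap)
        sample.extend((instance, label) for instance in chosen)
    return sample


# Illustrative data: 10 instances of class "a", 3 of class "b".
instances = list(range(13))
labels = ["a"] * 10 + ["b"] * 3
subsample = spread_subsample(instances, labels, max_spread=2.0)
class_counts = Counter(label for _, label in subsample)
```

With a 2:1 limit and three instances in the rarest class, the common class is cut to six instances while the rare class is kept whole.
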

10.4 Learning algorithms


On the Classify panel, when you select a learning algorithm using the Choose
button, the command-line version of the classifier appears in the line beside the
button, including the parameters specified with minus signs. To change them,
click that line to get an appropriate object editor. Table 10.5 lists Weka’s
classifiers. They are divided into Bayesian classifiers, trees, rules, functions, lazy
classifiers, and a final miscellaneous category. We describe them briefly here, along
with their parameters. To learn more, choose one in the Weka Explorer
interface and examine its object editor. A further kind of classifier, the Metalearner,
is described in the next section.
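
To make the shape of such a command-line string concrete, the sketch below splits one into a scheme name and its minus-sign parameters. The example string (the J48 tree learner with options -C 0.25 -M 2) is given only as an illustration, and the function split_scheme is a hypothetical helper, not part of Weka.

```python
import shlex


def split_scheme(command):
    """Split a classifier command line into its name and its options.

    A flag followed by a token that does not start with "-" takes that
    token as its value; otherwise the flag is treated as a switch.
    Negative numeric values are therefore not handled by this sketch.
    """
    tokens = shlex.split(command)
    name, rest = tokens[0], tokens[1:]
    options = {}
    i = 0
    while i < len(rest):
        flag = rest[i]
        if i + 1 < len(rest) and not rest[i + 1].startswith("-"):
            options[flag] = rest[i + 1]
            i += 2
        else:
            options[flag] = True
            i += 1
    return name, options


# Illustrative command line of the kind shown beside the Choose button.
name, options = split_scheme("weka.classifiers.trees.J48 -C 0.25 -M 2")
```

In the Explorer itself there is no need to parse this string by hand, because clicking the line opens the object editor with the same parameters.
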

Bayesian classifiers


NaiveBayes implements the probabilistic Naïve Bayes classifier (Section
4.2). NaiveBayesSimple uses the normal distribution to model numeric
attributes. NaiveBayes can use kernel density estimators, which improves
performance if the normality assumption is grossly incorrect; it can also handle numeric