Combining classifiers
Vote provides a baseline method for combining classifiers by averaging their
probability estimates (classification) or numeric predictions (regression).
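The averaging step can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Weka's implementation: the function names and the sample probability estimates are assumptions made for the example, and each classifier is assumed to output one probability per class in the same class order.

```python
def average_vote(distributions):
    """Average the class-probability estimates of several classifiers.

    `distributions` holds one list of class probabilities per classifier
    (hypothetical inputs; all lists must use the same class order).
    """
    n_models = len(distributions)
    n_classes = len(distributions[0])
    return [sum(d[c] for d in distributions) / n_models
            for c in range(n_classes)]


def predict(distributions):
    """Return the index of the class with the highest averaged probability."""
    avg = average_vote(distributions)
    return max(range(len(avg)), key=avg.__getitem__)


# Four hypothetical classifiers, two classes.
estimates = [[0.5, 0.5], [0.25, 0.75], [0.75, 0.25], [0.25, 0.75]]
combined = average_vote(estimates)   # [0.4375, 0.5625]
chosen = predict(estimates)          # class index 1
```

For regression the same averaging would apply to a single numeric prediction per model instead of a probability vector.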
MultiScheme selects the best classifier from a set of candidates using
cross-validation of percentage accuracy (classification) or mean-squared error
(regression). The number of folds is a parameter. Alternatively, performance on
the training data can be used instead of cross-validation.
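Selection by cross-validation can be sketched as follows. This is a minimal illustration under stated assumptions, not Weka's code: the two toy learners, the one-dimensional data set, and the deterministic fold assignment (every `folds`-th instance) are all invented for the example.

```python
def majority_learner(train):
    """Toy learner: always predicts the most frequent training label."""
    labels = [y for _, y in train]
    top = max(sorted(set(labels)), key=labels.count)
    return lambda x: top


def nearest_neighbor_learner(train):
    """Toy learner: predicts the label of the closest training point."""
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]


def cv_accuracy(learner, data, folds):
    """Cross-validated accuracy; fold i tests on instances i, i+folds, ..."""
    correct = 0
    for i in range(folds):
        test = data[i::folds]
        train = [p for j, p in enumerate(data) if j % folds != i]
        model = learner(train)
        correct += sum(model(x) == y for x, y in test)
    return correct / len(data)


def select_best(learners, data, folds=3):
    """Return the name of the candidate with the highest CV accuracy."""
    scores = {name: cv_accuracy(l, data, folds) for name, l in learners.items()}
    return max(scores, key=scores.get), scores


data = [(0.0, "a"), (1.0, "a"), (2.0, "a"), (10.0, "b"), (11.0, "b"), (12.0, "b")]
candidates = {"majority": majority_learner,
              "nearest_neighbor": nearest_neighbor_learner}
best, scores = select_best(candidates, data, folds=3)
```

On this toy data the nearest-neighbor candidate is selected; for regression the score would be mean-squared error, minimized rather than maximized.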
Stacking combines classifiers using stacking (Section 7.5, page 332) for both
classification and regression problems. You specify the base classifiers, the
metalearner, and the number of cross-validation folds. StackingC implements a
more efficient variant for which the metalearner must be a numeric prediction
scheme (Seewald 2002). In Grading, the inputs to the metalearner are base-level
predictions that have been marked (i.e., “graded”) as correct or incorrect. For
each base classifier, a metalearner is learned that predicts when the base
classifier will err. Just as stacking may be viewed as a generalization of
voting, grading generalizes selection by cross-validation (Seewald and
Fürnkranz 2001).
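The marking step in grading can be illustrated with a short sketch. It assumes that cross-validated predictions from one base classifier are already available; the function names and the sample labels are hypothetical, and the sketch stops before training the metalearner.

```python
def grade(true_labels, cv_predictions):
    """Mark each base-level prediction as 'correct' or 'incorrect'."""
    return ["correct" if p == y else "incorrect"
            for y, p in zip(true_labels, cv_predictions)]


def graded_training_set(instances, true_labels, cv_predictions):
    """Build (instance, prediction, grade) triples for one base classifier.

    A metalearner trained on these triples would try to predict the grade,
    that is, when the base classifier errs.
    """
    grades = grade(true_labels, cv_predictions)
    return list(zip(instances, cv_predictions, grades))


# Hypothetical data: three instances, their true classes, and the
# cross-validated predictions of a single base classifier.
instances = [[5.1, 3.5], [6.2, 2.9], [4.7, 3.2]]
truth = ["yes", "no", "yes"]
predicted = ["yes", "yes", "yes"]
graded = graded_training_set(instances, truth, predicted)
```

One such graded set would be built per base classifier, giving one metalearner per base classifier.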
Cost-sensitive learning
There are two metalearners for cost-sensitive learning (Section 5.7). The cost
matrix can be supplied as a parameter or loaded from a file in the directory set
by the onDemandDirectory property, named by the relation name and with the
extension .cost. CostSensitiveClassifier either reweights training instances
according to the total cost assigned to each class (cost-sensitive learning,
page 165) or predicts the class with the least expected misclassification cost
rather than the most likely one (cost-sensitive classification, page 164).
MetaCost generates a single cost-sensitive classifier from the base learner
(Section 7.5, pages 319–320). This implementation uses all bagging iterations
when reclassifying training data (Domingos 1999 reports a marginal improvement
when using only those iterations containing each training instance to
reclassify it). You can specify each bag’s size and the number of bagging
iterations.
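The second mode of CostSensitiveClassifier, predicting the class with the least expected misclassification cost, can be sketched as below. The sketch assumes a cost matrix indexed as `cost_matrix[actual][predicted]`; that layout, the function names, and the numbers are assumptions for illustration rather than a description of Weka's file format or API.

```python
def expected_costs(probs, cost_matrix):
    """Expected cost of predicting each class.

    probs[a] is the estimated probability that the true class is a;
    cost_matrix[a][p] is the (assumed) cost of predicting p when the
    true class is a.
    """
    n = len(probs)
    return [sum(probs[a] * cost_matrix[a][p] for a in range(n))
            for p in range(n)]


def min_cost_class(probs, cost_matrix):
    """Return the class index with the least expected cost."""
    costs = expected_costs(probs, cost_matrix)
    return min(range(len(costs)), key=costs.__getitem__)


probs = [0.75, 0.25]          # class 0 is the most likely class
cost_matrix = [[0, 1],        # true class 0: predicting 1 costs 1
               [8, 0]]        # true class 1: predicting 0 costs 8
costs = expected_costs(probs, cost_matrix)    # [2.0, 0.75]
chosen = min_cost_class(probs, cost_matrix)   # class index 1
```

Here class 1 is predicted even though class 0 is three times as likely, because missing a class 1 instance is eight times as costly.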
Optimizing performance
Three metalearners use the wrapper technique to optimize the base classifier’s
performance. AttributeSelectedClassifier selects attributes, reducing the data’s
dimensionality before passing it to the classifier (Section 7.1, page 290). You
can choose the attribute evaluator and search method using the Select attributes
panel described in Section 10.2. CVParameterSelection optimizes performance
by using cross-validation to select parameters. For each parameter you give a
string containing its lower and upper bounds and the desired number of
increments. For example, to vary parameter -P from 1 to 10 in increments of 1,
use P 1 10 11. The number of cross-validation folds can be specified.
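Parameter selection by cross-validation can be sketched as a loop over candidate values. The example below is a hedged illustration only: it tunes the neighborhood size k of a toy nearest-neighbor learner on invented one-dimensional data, takes the candidate values as an explicit list, and uses a deterministic fold assignment; none of these names or choices come from Weka.

```python
from collections import Counter


def knn_learner(k):
    """Toy k-nearest-neighbor learner on 1-D data (illustrative only)."""
    def train(train_data):
        def model(x):
            nearest = sorted(train_data, key=lambda p: abs(p[0] - x))[:k]
            return Counter(y for _, y in nearest).most_common(1)[0][0]
        return model
    return train


def cv_accuracy(learner, data, folds):
    """Cross-validated accuracy; fold i tests on instances i, i+folds, ..."""
    correct = 0
    for i in range(folds):
        test = data[i::folds]
        train = [p for j, p in enumerate(data) if j % folds != i]
        model = learner(train)
        correct += sum(model(x) == y for x, y in test)
    return correct / len(data)


def select_parameter(make_learner, values, data, folds):
    """Score every candidate value and return the first best one."""
    scores = {v: cv_accuracy(make_learner(v), data, folds) for v in values}
    return max(scores, key=scores.get), scores


# Two clusters with one mislabeled ("noisy") point in each.
data = [(0, "a"), (1, "a"), (2, "b"), (3, "a"), (4, "a"),
        (10, "b"), (11, "b"), (12, "a"), (13, "b"), (14, "b")]
best_k, scores = select_parameter(knn_learner, [1, 3, 5], data, folds=5)
```

With this data, k = 1 is misled by the noisy points and a larger k is chosen; ties between candidates go to the value listed first.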
