Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

(Brent) #1
bution of the playattribute. Unfortunately, this display is impoverished because
the attribute has so few different values that they fall into two equal-sized bins.
More realistic datasets yield more informative histograms.

Training and testing learning schemes

The Classifypanel lets you train and test learning schemes that perform classi-
fication or regression. Section 10.1 explained how to interpret the output
of a decision tree learner and showed the performance figures that are auto-
matically generated by the evaluation module. The interpretation of these is the
same for all models that predict a categorical class. However, when evaluating
models for numeric prediction, Weka produces a different set of performance
measures.
As an example, in Figure 10.10(a) the CPU performance dataset from Table
1.5 (page 16) has been loaded into Weka. You can see the histogram of values
of the first attribute,vendor,at the lower right. In Figure 10.10(b) the model
tree inducer M5¢has been chosen as the classifier by going to the Classifypanel,
clicking the Choosebutton at the top left, opening up the treessection of the
hierarchical menu shown in Figure 10.4(a), finding M5P,and clicking Start.The
hierarchy helps to locate particular classifiers by grouping items with common
functionality.

384 CHAPTER 10 | THE EXPLORER


Figure 10.9The weather data with two attributes removed.
Free download pdf