Data Mining: Practical Machine Learning Tools and Techniques, Second Edition


10.2 EXPLORING THE EXPLORER 387


make sense for numeric prediction). Section 5.8 (Table 5.8) explains the
meaning of the various measures.
Ordinary linear regression (Section 4.6), another scheme for numeric prediction, is found under LinearRegression in the functions section of the menu in Figure 10.4(a). It builds a single linear regression model rather than the two in Figure 10.11; not surprisingly, its performance is slightly worse.
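To make the idea behind ordinary linear regression concrete, here is a minimal sketch of least-squares fitting in plain Python. This is not Weka's LinearRegression implementation, and the data points are invented for illustration; it only shows how the intercept and slope of a single-attribute model are obtained by minimizing squared error.

```python
# Minimal sketch of ordinary least-squares regression (Section 4.6),
# written from scratch rather than via Weka's LinearRegression class.

def fit_simple_regression(xs, ys):
    """Fit y = intercept + slope * x by minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data points, roughly on a line:
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
intercept, slope = fit_simple_regression(xs, ys)
```

M5', by contrast, splits the instance space first and fits a separate linear model of this kind in each region, which is why Figure 10.11 shows two models rather than one.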
To get a feel for their relative performance, let’s visualize the errors these
schemes make, as we did for the Iris dataset in Figure 10.6(b). Right-click the
entry in the history list and select Visualize classifier errors to bring up the two-
dimensional plot of the data in Figure 10.12. The points are color coded by
class—but in this case the color varies continuously because the class is numeric.
In Figure 10.12 the Vendor attribute has been selected for the X-axis and the
instance number has been chosen for the Y-axis because this gives a good spread
of points. Each data point is marked by a cross whose size indicates the absolute
value of the error for that instance. The smaller crosses in Figure 10.12(a) (for M5′), when compared with those in Figure 10.12(b) (for linear regression), show that M5′ is superior.
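The quantity the cross size encodes can be sketched in a few lines: for each instance it is the absolute error, |actual - predicted|, and smaller crosses overall correspond to a smaller mean absolute error. The numbers below are invented, not taken from the CPU dataset.

```python
# Sketch of what the error visualization encodes: each instance's cross size
# is proportional to its absolute error.  Values are made up for illustration.

actual      = [100.0, 250.0, 40.0]
pred_m5     = [ 95.0, 240.0, 45.0]   # hypothetical M5' predictions
pred_linreg = [ 80.0, 300.0, 55.0]   # hypothetical linear regression predictions

abs_err_m5     = [abs(a - p) for a, p in zip(actual, pred_m5)]
abs_err_linreg = [abs(a - p) for a, p in zip(actual, pred_linreg)]

# Smaller crosses overall correspond to a smaller mean absolute error:
mae_m5     = sum(abs_err_m5) / len(abs_err_m5)
mae_linreg = sum(abs_err_linreg) / len(abs_err_linreg)
```

With these invented predictions the M5' crosses would be uniformly smaller, mirroring the comparison between panels (a) and (b) of Figure 10.12.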

      + 0.012  * MYCT
      + 0.0162 * MMIN
      + 0.0086 * MMAX
      + 0.8332 * CACH
      - 1.2665 * CHMIN
      - 1.2741 * CHMAX
      - 107.243

    Number of Rules : 2

    Time taken to build model: 1.37 seconds

    === Cross-validation ===
    === Summary ===

    Correlation coefficient          0.9766
    Mean absolute error             13.6917
    Root mean squared error         35.3003
    Relative absolute error         15.6194 %
    Root relative squared error     22.8092 %
    Total Number of Instances      209

Figure 10.11 (continued)
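The evaluation measures in the summary above are the standard ones for numeric prediction from Section 5.8 (Table 5.8). The sketch below computes each of them by hand on made-up actual/predicted values; it is not Weka code, and the relative measures here divide by the error of always predicting the mean of these same actual values, which is a simplification of how Weka derives its baseline.

```python
import math

# Sketch of the numeric-prediction measures in the cross-validation summary
# (Section 5.8, Table 5.8), computed on invented actual/predicted values.

actual    = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
n = len(actual)
mean_a = sum(actual) / n

# Mean absolute error and root mean squared error:
mae  = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

# Relative measures: error relative to always predicting the mean:
rae  = mae  / (sum(abs(a - mean_a) for a in actual) / n)
rrse = rmse / math.sqrt(sum((a - mean_a) ** 2 for a in actual) / n)

# Correlation coefficient between actual and predicted values:
mean_p = sum(predicted) / n
cov  = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
corr = cov / math.sqrt(sum((a - mean_a) ** 2 for a in actual)
                       * sum((p - mean_p) ** 2 for p in predicted))
```

A correlation coefficient near 1 and relative errors well below 100%, as in the output above, indicate predictions that track the actual values far better than the mean-value baseline.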
