Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

Supervised attribute filters
Discretize, highlighted in Figure 10.17, uses the MDL method of supervised
discretization (Section 7.2). You can specify a range of attributes or force the
discretized attribute to be binary. The class must be nominal. By default Fayyad
and Irani’s (1993) criterion is used, but Kononenko’s method (1995) is an
option.
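
To make the default criterion concrete, the following is a minimal Python sketch of entropy-based discretization with the Fayyad and Irani MDL stopping rule. It is an illustration under stated assumptions, not Weka's implementation: the function names (entropy, best_cut, mdl_accepts, mdl_cut_points) are hypothetical, cut points are taken as midpoints between adjacent distinct values, and Kononenko's alternative criterion is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Class entropy, in bits, of a list of nominal class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Find the binary split with the lowest weighted class entropy.

    `values` must already be sorted. Returns (weighted_entropy, cut_point,
    split_index), or None when all values are identical.
    """
    n = len(values)
    best = None
    for i in range(1, n):
        if values[i] == values[i - 1]:
            continue  # a cut is only possible between distinct values
        left, right = labels[:i], labels[i:]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best is None or weighted < best[0]:
            best = (weighted, (values[i - 1] + values[i]) / 2, i)
    return best


def mdl_accepts(labels, left, right):
    """Fayyad and Irani's MDL test: accept the split only if the information
    gain exceeds (log2(N - 1) + delta) / N."""
    n = len(labels)
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    ent, ent1, ent2 = entropy(labels), entropy(left), entropy(right)
    gain = ent - (len(left) * ent1 + len(right) * ent2) / n
    delta = math.log2(3 ** k - 2) - (k * ent - k1 * ent1 - k2 * ent2)
    return gain > (math.log2(n - 1) + delta) / n


def mdl_cut_points(values, labels):
    """Recursively split a numeric attribute, returning sorted cut points."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    vals = [v for v, _ in pairs]
    labs = [c for _, c in pairs]
    cuts = []

    def recurse(lo, hi):
        found = best_cut(vals[lo:hi], labs[lo:hi])
        if found is None:
            return
        _, cut, i = found
        if not mdl_accepts(labs[lo:hi], labs[lo:lo + i], labs[lo + i:hi]):
            return
        recurse(lo, lo + i)   # left interval first keeps cuts in order
        cuts.append(cut)
        recurse(lo + i, hi)

    recurse(0, len(vals))
    return cuts
```

With the values 1 to 20 and a class that changes from one label to another between 10 and 11, this sketch returns the single cut point 10.5; with strictly alternating class labels it returns no cut points, because no split passes the MDL test.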

402 CHAPTER 10 | THE EXPLORER


Figure 10.17 Using Weka’s metalearner for discretization: (a) configuring
FilteredClassifier, and (b) the menu of filters.

Table 10.3 Supervised attribute filters.

Name                 Function

AttributeSelection   Provides access to the same attribute selection methods as
                     the Select attributes panel
ClassOrder           Randomize, or otherwise alter, the ordering of class values
Discretize           Convert numeric attributes to nominal
NominalToBinary      Convert nominal attributes to binary, using a supervised
                     method if the class is numeric

Table 10.4 Supervised instance filters.

Name                   Function

Resample               Produce a random subsample of a dataset, sampling with
                       replacement
SpreadSubsample        Produce a random subsample with a given spread between
                       class frequencies, sampling with replacement
StratifiedRemoveFolds  Output a specified stratified cross-validation fold for
                       the dataset
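
As an illustration of what outputting a stratified fold involves, here is a minimal Python sketch that deals instances into folds class by class, so that each fold roughly preserves the class proportions of the full dataset. The function names, the zero-based fold index, and the seed parameter are hypothetical choices made for this example; Weka's own procedure and options may differ in detail.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, num_folds, seed=0):
    """Assign instance indices to folds so that class proportions are
    approximately equal across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(num_folds)]
    position = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # randomize order within each class
        for idx in members:
            # deal instances round-robin, continuing across classes
            folds[position % num_folds].append(idx)
            position += 1
    return folds


def select_fold(labels, num_folds, fold, seed=0):
    """Return the sorted indices of one fold (zero-based in this sketch)."""
    return sorted(stratified_folds(labels, num_folds, seed)[fold])


if __name__ == "__main__":
    classes = ["yes"] * 9 + ["no"] * 6
    for f in range(3):
        chosen = select_fold(classes, 3, f)
        print(f, Counter(classes[i] for i in chosen))
```

The remaining folds, taken together, form the matching training set for the selected fold.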