Data Mining: Practical Machine Learning Tools and Techniques, Second Edition


6.2 CLASSIFICATION RULES


It is used to split the training data into two subsets: one containing all instances
for which the rule’s condition is true and the other containing those for which
it is false. If either subset contains instances of more than one class, the algorithm
is invoked recursively on that subset. For the subset for which the condition
is true, the “default class” is the new class as specified by the rule;
for the subset for which the condition is false, the default class remains as it was
before.
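The recursive splitting step just described can be sketched in Python. This is a minimal illustration under stated assumptions, not the book's implementation: find_rule is a hypothetical stand-in that tries only single-threshold tests on numeric attributes, the toy data set is invented, and the initial default is assumed to be the most frequent class.

```python
from collections import Counter

def find_rule(data, default):
    """Hypothetical stand-in rule finder (an assumption): the best single
    test on one numeric attribute that predicts a class other than `default`."""
    best = None
    for attr in data[0][0]:
        for t in sorted({x[attr] for x, _ in data}):
            for op in ("<", ">="):
                covered = [y for x, y in data if (x[attr] < t) == (op == "<")]
                others = [y for y in covered if y != default]
                # Skip tests that cover no non-default instance or fail to split.
                if not others or len(covered) == len(data):
                    continue
                cls, hits = Counter(others).most_common(1)[0]
                score = hits - (len(covered) - hits)  # correct minus incorrect
                if best is None or score > best[0]:
                    best = (score, (attr, op, t, cls))
    return best[1] if best else None

def matches(x, rule):
    attr, op, t, _ = rule
    return x[attr] < t if op == "<" else x[attr] >= t

def induce(data, default):
    """Split `data` by one rule, then recurse on any subset that is still mixed."""
    rule = find_rule(data, default)
    if rule is None:
        return None
    true_part = [(x, y) for x, y in data if matches(x, rule)]
    false_part = [(x, y) for x, y in data if not matches(x, rule)]

    def recurse(part, part_default):
        mixed = len({y for _, y in part}) > 1
        return induce(part, part_default) if mixed else None

    return {
        "rule": rule,
        "except": recurse(true_part, rule[3]),  # default becomes the rule's class
        "else": recurse(false_part, default),   # default remains as it was
    }

# Invented one-attribute data set, for illustration only.
data = [({"x": v}, c) for v, c in
        [(1, "a"), (2, "a"), (3, "a"), (4, "b"), (5, "b"), (6, "a")]]
default = Counter(y for _, y in data).most_common(1)[0][0]
tree = induce(data, default)
```

On this toy data the top-level rule covers the instances with x >= 4. That subset is still mixed, so it is split again with the new default class, while the pure subset for which the condition is false needs no further rule.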
Let’s examine how this algorithm would work for the rules with exceptions
given in Section 3.5 for the Iris data of Table 1.4. We will represent the rules in
the graphical form shown in Figure 6.7, which is in fact equivalent to the textual
rules we gave in Figure 3.5. The default of Iris setosa is the entry node at the top
left. Horizontal, dotted paths show exceptions, so the next box, which contains
a rule that concludes Iris versicolor, is an exception to the default. Below this is
an alternative, a second exception—alternatives are shown by vertical, solid
lines—leading to the conclusion Iris virginica. Following the upper path along
horizontally leads to an exception to the Iris versicolor rule that overrides it
whenever the condition in the top right box holds, with the conclusion Iris
virginica. Below this is an alternative, leading (as it happens) to the same
conclusion. Returning to the box at bottom center, this has its own exception, the lower
right box, which gives the conclusion Iris versicolor. The numbers at the lower
right of each box give the “coverage” of the rule, expressed as the number of


Default --> Iris setosa (50/150)
    Exception: petal length ≥ 2.45, petal width < 1.75, petal length < 5.35 --> Iris versicolor (49/52)
        Exception: petal length ≥ 4.95, petal width < 1.55 --> Iris virginica (2/2)
        Alternative: sepal length < 4.95, sepal width ≥ 2.45 --> Iris virginica (1/1)
    Alternative: petal length ≥ 3.35 --> Iris virginica (47/48)
        Exception: petal length < 4.85, sepal length < 5.95 --> Iris versicolor (1/1)

Exceptions are represented as dotted paths, alternatives as solid ones.

Figure 6.7 Rules with exceptions for the iris data.
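To make the reading of Figure 6.7 concrete, the rule structure can be written as a nested data structure and traversed. The following Python sketch is an illustration only: the names node, holds, and classify are hypothetical, and the pairing of conditions with boxes is taken from the figure as described in the text. When a rule fires, its conclusion overrides the current one and its exceptions are checked; when it fails, the alternative is tried.

```python
def node(conds, cls, exc=None, alt=None):
    """One box of the figure: all `conds` must hold for the rule to fire."""
    return {"conds": conds, "cls": cls, "exc": exc, "alt": alt}

# Rule structure as read from Figure 6.7 (the default class is Iris setosa).
FIGURE_6_7 = node(
    [("petal length", ">=", 2.45), ("petal width", "<", 1.75),
     ("petal length", "<", 5.35)],
    "Iris versicolor",
    exc=node(
        [("petal length", ">=", 4.95), ("petal width", "<", 1.55)],
        "Iris virginica",
        alt=node([("sepal length", "<", 4.95), ("sepal width", ">=", 2.45)],
                 "Iris virginica"),
    ),
    alt=node(
        [("petal length", ">=", 3.35)],
        "Iris virginica",
        exc=node([("petal length", "<", 4.85), ("sepal length", "<", 5.95)],
                 "Iris versicolor"),
    ),
)

def holds(x, cond):
    attr, op, t = cond
    return x[attr] < t if op == "<" else x[attr] >= t

def classify(x, tree=FIGURE_6_7, default="Iris setosa"):
    label = default
    while tree is not None:
        if all(holds(x, c) for c in tree["conds"]):
            label = tree["cls"]  # rule fires: overrides the current conclusion
            tree = tree["exc"]   # then follow the exception (dotted) path
        else:
            tree = tree["alt"]   # rule fails: follow the alternative (solid) path
    return label

def flower(sl, sw, pl, pw):
    """Helper (hypothetical) to build an instance from four measurements."""
    return {"sepal length": sl, "sepal width": sw,
            "petal length": pl, "petal width": pw}

print(classify(flower(5.1, 3.5, 1.4, 0.2)))  # no rule fires: Iris setosa
print(classify(flower(7.0, 3.2, 4.7, 1.4)))  # first rule fires: Iris versicolor
print(classify(flower(6.0, 2.2, 5.0, 1.5)))  # exception overrides: Iris virginica
```

The third instance shows the override in action: it satisfies the Iris versicolor rule, but the exception in the top right box also holds, so the final conclusion is Iris virginica.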
