Data Mining: Practical Machine Learning Tools and Techniques, Second Edition




Using global optimization


In general, rules generated using incremental reduced-error pruning in this manner seem to perform quite well, particularly on large datasets. However, it has been found that a worthwhile performance advantage can be obtained by performing a global optimization step on the set of rules induced. The motivation is to increase the accuracy of the rule set by revising or replacing individual rules. Experiments show that both the size and the performance of rule sets are significantly improved by postinduction optimization. On the other hand, the process itself is rather complex.
To give an idea of how elaborate (and heuristic) industrial-strength rule learners become, Figure 6.4 shows an algorithm called RIPPER, an acronym for repeated incremental pruning to produce error reduction. Classes are examined in order of increasing size, and an initial set of rules for the class is generated using incremental reduced-error pruning. An extra stopping condition is introduced that depends on the description length of the examples and rule set. The description length DL is a complex formula that takes into account the number of bits needed to send a set of examples with respect to a set of rules, the number of bits required to send a rule with k conditions, and the number of bits needed to send the integer k, times an arbitrary factor of 50% to compensate for possible redundancy in the attributes. Having produced a rule set for the class, each rule is reconsidered and two variants produced, again using reduced-error pruning, but at this stage instances covered by other rules for the class are removed from the pruning set, and the success rate on the remaining instances is used as the pruning criterion. If one of the two variants yields a better

Initialize E to the instance set
Split E into Grow and Prune in the ratio 2:1
For each class C for which Grow and Prune both contain an instance
  Use the basic covering algorithm to create the best perfect rule for class C
  Calculate the worth w(R) for the rule on Prune, and of the rule with the
    final condition omitted w(R-)
  While w(R-) > w(R), remove the final condition from the rule and repeat the
    previous step
From the rules generated, select the one with the largest w(R)
Print the rule
Remove the instances covered by the rule from E
Continue

Figure 6.3 Algorithm for forming rules by incremental reduced-error pruning.
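The pseudocode in Figure 6.3 leaves the worth measure w(R) and the covering step unspecified. As a rough illustration only, the following minimal Python sketch fills those gaps with simple assumptions: w(R) is taken to be the rule's success rate on the pruning set, rules are conjunctions of attribute = value tests grown greedily, and rules are collected into a list rather than printed. All function names are hypothetical, not part of any actual implementation.

```python
# Minimal sketch of incremental reduced-error pruning (Figure 6.3).
# Instances are (attribute-dict, class-label) pairs. Assumption: the
# worth w(R) of a rule is its success rate on the pruning set.

def covers(rule, inst):
    """True if every (attribute, value) condition in the rule holds."""
    return all(inst[a] == v for a, v in rule)

def worth(rule, cls, prune):
    """w(R): fraction of prune-set instances covered by the rule
    that actually belong to class cls (assumed success-rate measure)."""
    covered = [c for inst, c in prune if covers(rule, inst)]
    return (sum(1 for c in covered if c == cls) / len(covered)) if covered else 0.0

def grow_rule(cls, grow):
    """Basic covering: greedily add the attribute=value test with the
    best accuracy until the rule is perfect on the growing set or no
    test improves it."""
    rule, attrs = [], list(grow[0][0])
    while True:
        pos = [x for x in grow if covers(rule, x[0])]
        if pos and all(c == cls for _, c in pos):
            return rule                       # perfect on the growing set
        best, best_acc = None, -1.0
        for a in attrs:
            if any(a == ra for ra, _ in rule):
                continue                      # one test per attribute
            for v in {inst[a] for inst, _ in grow}:
                cand = rule + [(a, v)]
                cov = [c for inst, c in grow if covers(cand, inst)]
                if not cov:
                    continue
                acc = sum(1 for c in cov if c == cls) / len(cov)
                if acc > best_acc:
                    best, best_acc = cand, acc
        if best is None:
            return rule                       # no test improves the rule
        rule = best

def reduced_error_pruning(instances):
    E, rules = list(instances), []
    while E:
        split = 2 * len(E) // 3               # split Grow : Prune = 2 : 1
        grow, prune = E[:split], E[split:]
        candidates = []
        # classes represented in both Grow and Prune
        for cls in {c for _, c in grow} & {c for _, c in prune}:
            rule = grow_rule(cls, grow)
            # while w(R-) > w(R), drop the final condition
            while rule and worth(rule[:-1], cls, prune) > worth(rule, cls, prune):
                rule = rule[:-1]
            candidates.append((worth(rule, cls, prune), rule, cls))
        if not candidates:
            break
        w, rule, cls = max(candidates, key=lambda t: t[0])
        rules.append((rule, cls))             # "print" step: collect the rule
        remaining = [x for x in E if not covers(rule, x[0])]
        if len(remaining) == len(E):          # no progress: stop
            break
        E = remaining
    return rules
```

On a toy weather-style dataset this yields rules such as "windy = false → yes"; a real implementation such as RIPPER adds the description-length stopping condition and the global optimization pass described in the text.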
