Data Mining: Practical Machine Learning Tools and Techniques, Second Edition


6.2 CLASSIFICATION RULES 207


description length, it replaces the rule. Next we reactivate the original building
phase to mop up any newly uncovered instances of the class. A final check is
made to ensure that each rule contributes to the reduction of description length,
before proceeding to generate rules for the next class.
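The two description-length checks described above can be sketched as follows. This is only an illustration of the control flow, assuming a hypothetical `description_length(rules, data)` function; the actual MDL measure is defined in the text.

```python
# Sketch of RIPPER-style revision checks, assuming a hypothetical
# description_length(rules, data) function (the real measure is MDL-based).
def revise(rules, index, replacement, data, description_length):
    """Swap in the replacement rule only if it shortens the description length."""
    revised = list(rules)
    revised[index] = replacement
    if description_length(revised, data) < description_length(rules, data):
        return revised
    return rules

def final_check(rules, data, description_length):
    """Drop any rule whose removal does not increase the description length."""
    kept = list(rules)
    for rule in list(kept):
        without = [r for r in kept if r is not rule]
        if description_length(without, data) <= description_length(kept, data):
            kept = without
    return kept
```

With a toy measure that simply counts rules, `revise` keeps the original set when the replacement brings no improvement, and `final_check` prunes rules that do not pay for themselves.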

Obtaining rules from partial decision trees


There is an alternative approach to rule induction that avoids global optimization but nevertheless produces accurate, compact rule sets. The method combines the divide-and-conquer strategy for decision tree learning with the
separate-and-conquer one for rule learning. It adopts the separate-and-conquer
strategy in that it builds a rule, removes the instances it covers, and continues
creating rules recursively for the remaining instances until none are left.
However, it differs from the standard approach in the way that each rule is
created. In essence, to make a single rule, a pruned decision tree is built for the
current set of instances, the leaf with the largest coverage is made into a rule,
and the tree is discarded.
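The loop just described can be sketched in a few lines. As a stand-in for building a full pruned decision tree, this toy version grows only a one-level test per round and keeps its purest, largest leaf; the real method grows and prunes a proper tree each time. The data layout (attribute dictionaries paired with class labels) is hypothetical.

```python
# Minimal, runnable sketch of separate-and-conquer rule learning where
# each rule comes from the best leaf of a (here, drastically simplified)
# tree. A one-level test stands in for the pruned decision tree.
from collections import Counter

def best_rule(instances):
    """Pick the (attribute, value) test whose covered subset is purest,
    breaking ties by coverage; return ((attribute, value), predicted_class)."""
    best = None
    for attr in instances[0][0]:
        for value in {x[attr] for x, _ in instances}:
            covered = [cls for x, cls in instances if x[attr] == value]
            cls, hits = Counter(covered).most_common(1)[0]
            key = (hits / len(covered), len(covered))  # (purity, coverage)
            if best is None or key > best[0]:
                best = (key, (attr, value), cls)
    return best[1], best[2]

def learn_rule_set(instances):
    rules = []
    while instances:
        test, cls = best_rule(instances)  # "build a tree, keep the best leaf"
        rules.append((test, cls))
        # separate-and-conquer: remove the instances the new rule covers
        instances = [(x, c) for x, c in instances if x[test[0]] != test[1]]
    return rules
```

Each iteration discards everything but one rule, exactly as the text describes; only the simplification of the tree-building step differs.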
The prospect of repeatedly building decision trees only to discard most of
them is not as bizarre as it first seems. Using a pruned tree to obtain a rule
instead of building it incrementally by adding conjunctions one at a time avoids
a tendency to overprune that is a characteristic problem of the basic separate-
and-conquer rule learner. Using the separate-and-conquer methodology in con-
junction with decision trees adds flexibility and speed. It is indeed wasteful to
build a full decision tree just to obtain a single rule, but the process can be accel-
erated significantly without sacrificing the preceding advantages.
The key idea is to build a partial decision tree instead of a fully explored one.
A partial decision tree is an ordinary decision tree that contains branches to

(b)

DL: description length (see text)
G = p[log(p/t) − log(P/T)]; information gain for this rule
W = (p + 1)/(t + 2); worth of this rule
A = (p + n′)/T; accuracy of this rule
p = number of positive examples covered by this rule (true positives)
n = number of negative examples covered by this rule (false positives)
t = p + n; total number of examples covered by this rule
n′ = N − n; number of negative examples not covered by this rule (true negatives)
P = number of positive examples of this class
N = number of negative examples of this class
T = P + N; total number of examples

Figure 6.4 (continued)
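The metrics in Figure 6.4 follow directly from the four counts p, n, P, and N. The helper below computes them; the logarithm base only rescales G (base 2 is used here), and the example counts in the usage note are made up for illustration.

```python
# The rule metrics of Figure 6.4(b), computed from the counts defined there.
from math import log2

def rule_metrics(p, n, P, N):
    """p: positives covered (true positives); n: negatives covered
    (false positives); P, N: all positive/negative examples."""
    t = p + n            # total examples covered by the rule
    n_prime = N - n      # negatives not covered (true negatives)
    T = P + N            # total number of examples
    G = p * (log2(p / t) - log2(P / T))  # information gain
    W = (p + 1) / (t + 2)                # worth of the rule
    A = (p + n_prime) / T                # accuracy of the rule
    return G, W, A
```

For instance, a rule covering 10 of 20 positives and 2 of 80 negatives gets worth W = 11/14 and accuracy A = 0.88.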
