
split early on using a particular attribute might turn out later to be ill consid-
ered in light of how the tree develops below that node. To get around these prob-
lems, a beam search could be used in which irrevocable commitments are not
lems, a beam searchcould be used in which irrevocable commitments are not
made but instead a set of several active alternatives—whose number is the beam
width—are pursued in parallel. This will complicate the learning algorithm
quite considerably but has the potential to avoid the myopia associated with a
greedy search. Of course, if the beam width is not large enough, myopia may
still occur. There are more complex search strategies that help to overcome this
problem.
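To make the contrast concrete, here is a small Java sketch comparing a greedy search, which commits to the single best-looking refinement at every step, with a beam search that keeps the best few alternatives alive. The Description interface and its refinements() and score() methods are placeholders invented for illustration only; they do not correspond to any particular toolkit.

import java.util.*;

public class BeamSearchSketch {

    // Placeholder for a candidate concept description.
    interface Description {
        List<Description> refinements(); // candidate specializations
        double score();                  // how well it fits the training data
    }

    // Greedy search: an irrevocable commitment to the best-looking
    // refinement at each step; an early choice may prove ill considered later.
    static Description greedy(Description start, int steps) {
        Description current = start;
        for (int i = 0; i < steps; i++) {
            current = current.refinements().stream()
                    .max(Comparator.comparingDouble(Description::score))
                    .orElse(current);
        }
        return current;
    }

    // Beam search: keep the beamWidth best alternatives in parallel, so a
    // choice that looks good now but turns out badly later need not doom
    // the search (provided the beam is wide enough).
    static Description beam(Description start, int steps, int beamWidth) {
        List<Description> frontier = new ArrayList<>(List.of(start));
        for (int i = 0; i < steps; i++) {
            List<Description> candidates = new ArrayList<>();
            for (Description d : frontier) {
                candidates.addAll(d.refinements());
            }
            candidates.sort(Comparator.comparingDouble(Description::score).reversed());
            frontier = candidates.subList(0, Math.min(beamWidth, candidates.size()));
        }
        return frontier.stream()
                .max(Comparator.comparingDouble(Description::score))
                .orElse(start);
    }
}

With beamWidth set to 1 the two methods behave identically, which is why too narrow a beam can still leave the search myopic.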
A more general and higher-level kind of search bias concerns whether the
search is done by starting with a general description and refining it, or by
starting with a specific example and generalizing it. The former is called a
general-to-specific search bias; the latter a specific-to-general one. Many learning
algorithms adopt the former policy, starting with an empty decision tree, or a
very general rule, and specializing it to fit the examples. However, it is perfectly
possible to work in the other direction. Instance-based methods start with a
particular example and see how it can be generalized to cover nearby examples
in the same class.
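As a rough illustration, the following Java sketch contrasts the two directions. Rule, Instance, and their methods are hypothetical placeholders: specializations() adds one condition at a time (general-to-specific), whereas generalizedToCover() widens a description just enough to take in a new example (specific-to-general).

import java.util.*;

public class SearchDirectionSketch {

    interface Instance {
        boolean isPositive();
    }

    interface Rule {
        boolean covers(Instance x);
        List<Rule> specializations();        // add one condition at a time
        Rule generalizedToCover(Instance x); // widen just enough to cover x
    }

    // General-to-specific: start from the maximally general (empty) rule and
    // specialize it until it no longer covers any negative example.
    static Rule generalToSpecific(Rule emptyRule, List<Instance> data) {
        Rule current = emptyRule;
        while (coversANegative(current, data)) {
            List<Rule> specs = current.specializations();
            if (specs.isEmpty()) break;
            current = specs.get(0); // placeholder choice; real learners score the candidates
        }
        return current;
    }

    // Specific-to-general: start from a rule describing one positive example
    // and generalize it to take in nearby examples of the same class, as long
    // as doing so covers no negatives.
    static Rule specificToGeneral(Rule seedRule, List<Instance> data) {
        Rule current = seedRule;
        for (Instance x : data) {
            if (x.isPositive() && !current.covers(x)) {
                Rule widened = current.generalizedToCover(x);
                if (!coversANegative(widened, data)) current = widened;
            }
        }
        return current;
    }

    static boolean coversANegative(Rule r, List<Instance> data) {
        for (Instance x : data) {
            if (!x.isPositive() && r.covers(x)) return true;
        }
        return false;
    }
}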

Overfitting-avoidance bias
Overfitting-avoidance bias is often just another kind of search bias. But because
it addresses a rather special problem, we treat it separately. Recall the disjunc-
tion problem described previously. The problem is that if disjunction is allowed,
useless concept descriptions that merely summarize the data become possible,
whereas if it is prohibited, some concepts are unlearnable. To get around this
problem, it is common to search the concept space starting with the simplest
concept descriptions and proceeding to more complex ones: simplest-first
ordering. This biases the search toward simple concept descriptions.
Using a simplest-first search and stopping when a sufficiently complex
concept description is found is a good way of avoiding overfitting. It is some-
times called forward pruning or prepruning because complex descriptions are
pruned away before they are reached. The alternative, backward pruning or post-
pruning, is also viable. Here, we first find a description that fits the data well and
then prune it back to a simpler description that also fits the data. This is not as
redundant as it sounds: often the only way to arrive at a simple theory is to find
a complex one and then simplify it. Forward and backward pruning are both a
kind of overfitting-avoidance bias.
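The following Java sketch contrasts the two pruning styles, in the same placeholder style as before. The Description interface and its methods, including the estimatedTrueFit() used to decide whether a simpler description is acceptable, are assumptions made for illustration only.

import java.util.*;

public class PruningSketch {

    interface Description {
        List<Description> refinements();     // slightly more complex descriptions
        List<Description> simplifications(); // slightly simpler descriptions
        double trainingFit();                // fit to the training data
        double estimatedTrueFit();           // e.g., fit on held-out data
    }

    // Prepruning (forward pruning): explore descriptions simplest-first and
    // stop as soon as one fits well enough, so more complex descriptions are
    // never reached. The breadth-first queue visits each description before
    // its refinements, approximating a simplest-first ordering.
    static Description preprune(Description simplest, double goodEnough) {
        Deque<Description> queue = new ArrayDeque<>(List.of(simplest));
        Description best = simplest;
        while (!queue.isEmpty()) {
            Description d = queue.poll();
            if (d.trainingFit() > best.trainingFit()) {
                best = d;
            }
            if (best.trainingFit() >= goodEnough) {
                return best; // stop early: nothing more complex is examined
            }
            queue.addAll(d.refinements());
        }
        return best;
    }

    // Postpruning (backward pruning): first obtain a complex description that
    // fits the training data well, then prune it back while the simpler
    // version is estimated to do no worse.
    static Description postprune(Description complexFit) {
        Description current = complexFit;
        boolean pruned = true;
        while (pruned) {
            pruned = false;
            for (Description simpler : current.simplifications()) {
                if (simpler.estimatedTrueFit() >= current.estimatedTrueFit()) {
                    current = simpler; // the simpler description does no worse
                    pruned = true;
                    break;             // restart from the newly pruned description
                }
            }
        }
        return current;
    }
}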
In summary, although generalization as search is a nice way to think about
the learning problem, bias is the only way to make it feasible in practice. Dif-
ferent learning algorithms correspond to different concept description spaces
searched with different biases. This is what makes it interesting: different
description languages and biases serve some problems well and others badly.
