We have seen the basic ideas of several machine learning methods and studied
in detail how to assess their performance on practical data mining problems.
Now we are well prepared to look at real, industrial-strength machine learning
algorithms. Our aim is to explain these algorithms both at a conceptual level
and with a fair amount of technical detail so that you can understand them fully
and appreciate the key implementation issues that arise.
In truth, there is a world of difference between the simplistic methods
described in Chapter 4 and the actual algorithms that are widely used in practice.
The principles are the same. So are the inputs and outputs, that is, the methods of
knowledge representation. But the algorithms are far more complex, principally
because they have to deal robustly and sensibly with real-world problems such
as numeric attributes, missing values, and—most challenging of all—noisy data.
To understand how the various methods cope with noise, we will have to draw
on some of the statistical knowledge that we learned in Chapter 5.
Chapter 4 opened with an explanation of how to infer rudimentary rules and
went on to examine statistical modeling and decision trees. Then we returned
Chapter 6
Implementations: Real Machine Learning Schemes