Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

We have seen the basic ideas of several machine learning methods and studied in detail how to assess their performance on practical data mining problems. Now we are well prepared to look at real, industrial-strength machine learning algorithms. Our aim is to explain these algorithms both at a conceptual level and with a fair amount of technical detail, so that you can understand them fully and appreciate the key implementation issues that arise. In truth, there is a world of difference between the simplistic methods described in Chapter 4 and the actual algorithms that are widely used in practice. The principles are the same. So are the inputs and outputs (methods of knowledge representation). But the algorithms are far more complex, principally because they have to deal robustly and sensibly with real-world problems such as numeric attributes, missing values, and, most challenging of all, noisy data. To understand how the various methods cope with noise, we will have to draw on some of the statistical knowledge that we learned in Chapter 5.

Chapter 4 opened with an explanation of how to infer rudimentary rules and went on to examine statistical modeling and decision trees. Then we returned

Chapter 6

Implementations: Real Machine Learning Schemes

