ing. An excellent book on machine learning from a statistical perspective is by
Hastie et al. (2001). It is a theoretically oriented work, beautifully
produced with apt and telling illustrations.
Pattern recognition is a topic that is closely related to machine learning, and
many of the same techniques apply. Duda et al. (2001) offer the second edition
of a classic and successful book on pattern recognition (Duda and Hart 1973).
Ripley (1996) and Bishop (1995) describe the use of neural networks for pattern
recognition. Data mining with neural networks is the subject of a book by Bigus
(1996) of IBM, which features the IBM Neural Network Utility Product that he
developed.
There is a great deal of current interest in support vector machines, which
we return to in Chapter 6. Cristianini and Shawe-Taylor (2000) give a nice intro-
duction, and a follow-up work generalizes this to cover additional algorithms,
kernels, and solutions with applications to pattern discovery problems in fields
such as bioinformatics, text analysis, and image analysis (Shawe-Taylor and
Cristianini 2004). Schölkopf and Smola (2002), two young researchers who did
their PhD research in this rapidly developing area, provide a comprehensive
introduction to support vector machines and related kernel methods.