Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

purchases? Should the manager move the most expensive, most profitable
diapers near the beer, increasing sales to harried fathers of a high-margin item
and add further luxury baby products nearby?
Of course, anyone who uses advanced technologies should consider the
wisdom of what they are doing. If data is characterized as recorded facts, then
information is the set of patterns, or expectations, that underlie the data. You
could go on to define knowledge as the accumulation of your set of expectations
and wisdom as the value attached to knowledge. Although we will not pursue it
further here, this issue is worth pondering.
As we saw at the very beginning of this chapter, the techniques described in
this book may be called upon to help make some of the most profound and
intimate decisions that life presents. Data mining is a technology that we need
to take seriously.

1.7 Further reading


To avoid breaking up the flow of the main text, all references are collected in a
section at the end of each chapter. This first Further reading section describes
papers, books, and other resources relevant to the material covered in Chapter 1.
The human in vitro fertilization research mentioned in the opening to this
chapter was undertaken by the Oxford University Computing Laboratory,
and the research on cow culling was performed in the Computer Science
Department at the University of Waikato, New Zealand.
The example of the weather problem is from Quinlan (1986) and has been
widely used to explain machine learning schemes. The corpus of example
problems mentioned in the introduction to Section 1.2 is available from Blake et al.
(1998). The contact lens example is from Cendrowska (1987), who introduced
the PRISM rule-learning algorithm that we will encounter in Chapter 4. The iris
dataset was described in a classic early paper on statistical inference (Fisher
1936). The labor negotiations data is from the Collective bargaining review, a
publication of Labour Canada issued by the Industrial Relations Information
Service (BLI 1988), and the soybean problem was first described by Michalski
and Chilausky (1980).
Some of the applications in Section 1.3 are covered in an excellent paper that
gives plenty of other applications of machine learning and rule induction
(Langley and Simon 1995); another source of fielded applications is a special
issue of the Machine Learning Journal (Kohavi and Provost 1998). The loan
company application is described in more detail by Michie (1989), the oil slick
detector is from Kubat et al. (1998), the electric load forecasting work is by
Jabbour et al. (1988), and the application to preventative maintenance of
electromechanical devices is from Saitta and Neri (1998). Fuller descriptions

