Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

The real drawbacks of such techniques, however, are that they do not cope
well with noisy data, and they tend to be so slow as to be unusable on anything
but small artificial datasets. They are not covered in this book; see Bergadano
and Gunetti (1996) for a comprehensive treatment.
In summary, the input to a data mining scheme is generally expressed as a
table of independent instances of the concept to be learned. Because of this, it
has been suggested, disparagingly, that we should really talk of file mining rather
than database mining. Relational data is more complex than a flat file. A finite
set of finite relations can always be recast into a single table, although often at
enormous cost in space. Moreover, denormalization can generate spurious
regularities in the data, and it is essential to check the data for such artifacts
before applying a learning method. Finally, potentially infinite concepts can be
dealt with by learning rules that are recursive, although that is beyond the scope
of this book.
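
The cost of recasting relations into a single table can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the book: the names PEOPLE, is_ancestor, and denormalize are hypothetical, and the person relation is limited to the people whose details appear in the family-tree example of this chapter. It pairs every person with every other person and labels each pair with a recursively defined ancestor-of test.

```python
from itertools import permutations

# Person relation: name -> (gender, parent1, parent2); None marks an unknown parent.
# Illustrative data following the chapter's family-tree example.
PEOPLE = {
    "Peter":  ("male",   None,    None),
    "Grace":  ("female", None,    None),
    "Steven": ("male",   "Peter", "Peggy"),
    "Pam":    ("female", "Peter", "Peggy"),
    "Ian":    ("male",   "Grace", "Ray"),
    "Anna":   ("female", "Pam",   "Ian"),
    "Nikki":  ("female", "Pam",   "Ian"),
}

def is_ancestor(a, b):
    """Recursive rule: a is an ancestor of b if a is a parent of b,
    or a is an ancestor of one of b's parents."""
    if b not in PEOPLE:
        return False  # a parent with no row of its own ends the recursion
    parents = [p for p in PEOPLE[b][1:] if p is not None]
    return a in parents or any(is_ancestor(a, p) for p in parents)

def denormalize():
    """Recast the person relation as one flat table, one row per ordered pair."""
    rows = []
    for first, second in permutations(PEOPLE, 2):
        label = "yes" if is_ancestor(first, second) else "no"
        rows.append((first, *PEOPLE[first], second, *PEOPLE[second], label))
    return rows

flat = denormalize()
# Index 0 is the first person's name, index 4 the second person's name.
positives = [(r[0], r[4]) for r in flat if r[-1] == "yes"]
print(len(PEOPLE), "people ->", len(flat), "rows,", len(positives), "positive")
```

With these seven people the flat table already has 42 rows, of which 11 are positive examples. The row count grows with the square of the number of people, which is the space cost referred to above, and each person's attributes are repeated in every row in which that person appears.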

2.3 What’s in an attribute?


Each individual, independent instance that provides the input to machine
learning is characterized by its values on a fixed, predefined set of features or
attributes. The instances are the rows of the tables that we have shown for the
weather, contact lens, iris, and CPU performance problems, and the attributes
are the columns. (The labor negotiations data was an exception: we presented
this with instances in columns and attributes in rows for space reasons.)
The use of a fixed set of features imposes another restriction on the kinds of
problems generally considered in practical data mining. What if different



Table 2.5 Another relation represented as a table.

First person                     Second person
Name   Gender  Parent1  Parent2  Name    Gender  Parent1  Parent2  Ancestor of?

Peter  male    ?        ?        Steven  male    Peter    Peggy    yes
Peter  male    ?        ?        Pam     female  Peter    Peggy    yes
Peter  male    ?        ?        Anna    female  Pam      Ian      yes
Peter  male    ?        ?        Nikki   female  Pam      Ian      yes
Pam    female  Peter    Peggy    Nikki   female  Pam      Ian      yes
Grace  female  ?        ?        Ian     male    Grace    Ray      yes
Grace  female  ?        ?        Nikki   female  Pam      Ian      yes

other examples here                                                yes
all the rest                                                       no