Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

(Brent) #1
much practical use in real-life problems because they rarely involve “closed”
worlds in which you can be certain that all cases are covered.
Neither table in Figure 2.1 is of any use without the family tree itself. This
tree can also be expressed in the form of a table, part of which is shown in Table
2.3. Now the problem is expressed in terms of two relationships. But these tables
do not contain independent sets of instances because values in the Name,
Parent1, and Parent2 columns of the sister-ofrelation refer to rows of the family
tree relation. We can make them into a single set of instances by collapsing the
two tables into the single one of Table 2.4.
We have at last succeeded in transforming the original relational problem
into the form of instances, each of which is an individual, independent example

46 CHAPTER 2| INPUT: CONCEPTS, INSTANCES, AND ATTRIBUTES


first
person

second
person

Peter
M

Peggy
F

= Grace
F

Ray
M

=

Pam
F

Ian
M

Steven =
M

Graham
M

Pippa
F

Brian
M

Anna
F

Nikki
F

Peter
Peter
...
Steven
Steven
Steven
Steven
...
lan
...
Anna
...
Nikki

Peggy
Steven
......
Peter
Graham
Pam
Grace
......
Pippa
......
Nikki
.....
Anna

sister
of?
no
no

no
no
yes
no

yes

yes

yes

first
person

second
person

Steven
Graham
lan
Brian
Anna
Nikki

Pam
Pam
Pippa
Pippa
Nikki
Anna

sister
of?
yes
yes
yes
yes
yes
yes
All the rest no

Figure 2.1A family tree and two ways of expressing the sister-of relation.
Free download pdf