4.3 DIVIDE-AND-CONQUER: CONSTRUCTING DECISION TREES 103
Table 4.6 gives the weather data with this extra attribute. Branching on ID
code produces the tree stump in Figure 4.5. The information required to specify
the class given the value of this attribute is

    info([0,1]) + info([0,1]) + info([1,0]) + ... + info([1,0]) + info([0,1]),

which is zero because each of the 14 terms is zero. This is not surprising: the ID
code attribute identifies the instance, which determines the class without any
ambiguity, just as Table 4.6 shows. Consequently, the information gain of this
attribute is just the information at the root, info([9,5]) = 0.940 bits. This is
greater than the information gain of any other attribute, and so ID code will
inevitably be chosen as the splitting attribute. But branching on the
identification code is no good for predicting the class of unknown instances and
tells nothing about the structure of the decision, which after all are the twin
goals of machine learning.
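The argument can be checked numerically. The following is a minimal sketch, not a reference implementation: it hard-codes the rows of Table 4.6 and computes the information gain of each attribute, including ID code. The names ROWS, ATTRIBUTES, info, class_counts, and info_gain are illustrative choices made here, not identifiers from the text.

```python
from collections import Counter, defaultdict
from math import log2

# Rows of Table 4.6: (ID code, outlook, temperature, humidity, windy, play)
ROWS = [
    ("a", "sunny", "hot", "high", "false", "no"),
    ("b", "sunny", "hot", "high", "true", "no"),
    ("c", "overcast", "hot", "high", "false", "yes"),
    ("d", "rainy", "mild", "high", "false", "yes"),
    ("e", "rainy", "cool", "normal", "false", "yes"),
    ("f", "rainy", "cool", "normal", "true", "no"),
    ("g", "overcast", "cool", "normal", "true", "yes"),
    ("h", "sunny", "mild", "high", "false", "no"),
    ("i", "sunny", "cool", "normal", "false", "yes"),
    ("j", "rainy", "mild", "normal", "false", "yes"),
    ("k", "sunny", "mild", "normal", "true", "yes"),
    ("l", "overcast", "mild", "high", "true", "yes"),
    ("m", "overcast", "hot", "normal", "false", "yes"),
    ("n", "rainy", "mild", "high", "true", "no"),
]
ATTRIBUTES = ["ID code", "Outlook", "Temperature", "Humidity", "Windy"]


def info(counts):
    """Entropy, in bits, of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def class_counts(rows):
    """Count how many rows fall into each class (the last column, play)."""
    return list(Counter(row[-1] for row in rows).values())


def info_gain(rows, col):
    """Information at the root minus the weighted information after splitting on col."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[col]].append(row)
    remainder = sum(
        len(group) / len(rows) * info(class_counts(group))
        for group in groups.values()
    )
    return info(class_counts(rows)) - remainder


gains = {name: info_gain(ROWS, col) for col, name in enumerate(ATTRIBUTES)}
for name, gain in gains.items():
    print(f"{name:12s} {gain:.3f} bits")
```

Each of the 14 ID code branches holds a single instance, so every term of the weighted sum is zero and the gain for ID code equals the full information at the root, info([9,5]). Under this measure no other attribute can do better, which is why the split looks attractive even though it cannot generalize.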
Table 4.6 The weather data with identification codes.
ID code   Outlook    Temperature   Humidity   Windy   Play
a         sunny      hot           high       false   no
b         sunny      hot           high       true    no
c         overcast   hot           high       false   yes
d         rainy      mild          high       false   yes
e         rainy      cool          normal     false   yes
f         rainy      cool          normal     true    no
g         overcast   cool          normal     true    yes
h         sunny      mild          high       false   no
i         sunny      cool          normal     false   yes
j         rainy      mild          normal     false   yes
k         sunny      mild          normal     true    yes
l         overcast   mild          high       true    yes
m         overcast   hot           normal     false   yes
n         rainy      mild          high       true    no
Figure 4.5 Tree stump for the ID code attribute. [Root node: ID code; branches a, b, c, ..., m, n; leaves no, no, yes, ..., yes, no.]