evant to the outcome. Such domains, however, are the exception rather than the
rule. In most domains some attributes are irrelevant, and some relevant ones
are less important than others. The next improvement in instance-based learning is to learn the relevance of each attribute incrementally by dynamically updating feature weights.
In some schemes, the weights are class specific in that an attribute may be
more important to one class than to another. To cater for this, a description is
produced for each class that distinguishes its members from members of all
other classes. This leads to the problem that an unknown test instance may be
assigned to several different classes, or to no classes at all—a problem that is all
too familiar from our description of rule induction. Heuristic solutions are
applied to resolve these situations.
The distance metric incorporates the feature weights $w_1, w_2, \ldots, w_n$ on each dimension:

$$\sqrt{w_1^2(x_1 - y_1)^2 + w_2^2(x_2 - y_2)^2 + \cdots + w_n^2(x_n - y_n)^2}.$$

In the case of class-specific feature weights, there will be a separate set of weights for each class.
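As a concrete illustration, here is a minimal sketch of this weighted distance for numeric attributes (the function name and example values are hypothetical, not from the book):

```python
import math

def weighted_distance(x, y, weights):
    """Weighted Euclidean distance: sqrt of the sum of w_i^2 * (x_i - y_i)^2."""
    return math.sqrt(sum((w * (xi - yi)) ** 2
                         for w, xi, yi in zip(weights, x, y)))

x = [0.2, 0.9, 0.4]
y = [0.3, 0.8, 0.1]
w = [1.0, 0.5, 0.0]   # a zero weight makes the third attribute irrelevant
print(weighted_distance(x, y, w))   # ~0.112
```

For class-specific weighting, one such weight vector would be kept per class, and the appropriate vector chosen when computing distances to that class's exemplars.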
All attribute weights are updated after each training instance is classified, and the most similar exemplar (or the most similar exemplar of each class) is used as the basis for updating. Call the training instance $x$ and the most similar exemplar $y$. For each attribute $i$, the difference $|x_i - y_i|$ is a measure of the contribution of that attribute to the decision. If this difference is small then the attribute contributes positively, whereas if it is large it may contribute negatively. The basic idea is to update the $i$th weight on the basis of the size of this difference and whether the classification was indeed correct. If the classification is correct the associated weight is increased, and if it is incorrect it is decreased, the amount of increase or decrease being governed by the size of the difference: large if the difference is small, and vice versa. The weight change is generally followed by a renormalization step. A simpler strategy, which may be equally effective, is to leave the weights alone if the decision is correct and, if it is incorrect, to increase the weights of those attributes that differ most, accentuating the difference. Details of these weight adaptation algorithms are described by Aha (1992).
A good test of whether an attribute weighting method works is to add irrelevant attributes to all examples in a dataset. Ideally, the introduction of irrelevant attributes should not affect either the quality of predictions or the number of exemplars stored.
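One way to set up this test, assuming a NumPy feature matrix (the helper below is hypothetical, not from the book):

```python
import numpy as np

def add_irrelevant_attributes(X, k, seed=0):
    """Append k columns of uniform random noise to the feature matrix X."""
    rng = np.random.default_rng(seed)
    noise = rng.uniform(size=(X.shape[0], k))
    return np.hstack([X, noise])
```

Comparing accuracy and the number of stored exemplars before and after the augmentation shows how well the weighting scheme drives the noise attributes' weights toward zero.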

Generalizing exemplars


Generalized exemplars are rectangular regions of instance space, called hyperrectangles because they are high-dimensional. When classifying new instances it

