

means that we don’t have to include an additional constant element in the sum.
If the sum is greater than zero, we will predict the first class; otherwise, we will
predict the second class. We want to find values for the weights so that the
training data is correctly classified by the hyperplane.
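As a concrete illustration (not from the book), here is a minimal Python sketch of this prediction, assuming the instance carries the constant 1 as its first attribute so that the bias weight needs no special treatment:

    def predict(weights, instance):
        # instance[0] is the constant 1 (the "bias" input), so weights[0]
        # plays the role of the bias and no extra constant term is needed.
        weighted_sum = sum(w * a for w, a in zip(weights, instance))
        return weighted_sum > 0   # True: first class; False: second class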
Figure 4.10(a) gives the perceptron learning rule for finding a separating
hyperplane. The algorithm iterates until a perfect solution has been found, but
it will only work properly if a separating hyperplane exists, that is, if the data is
linearly separable. Each iteration goes through all the training instances. If a
misclassified instance is encountered, the parameters of the hyperplane are
changed so that the misclassified instance moves closer to the hyperplane or
maybe even across the hyperplane onto the correct side. If the instance belongs
to the first class, this is done by adding its attribute values to the weight vector;
otherwise, they are subtracted from it.

Set all weights to zero
Until all instances in the training data are classified correctly
    For each instance I in the training data
        If I is classified incorrectly by the perceptron
            If I belongs to the first class add it to the weight vector
            else subtract it from the weight vector

(a)


(b) [neural-network diagram: a "bias" input fixed at 1 and attribute inputs a1, a2, a3, ..., feeding the output through weights w0, w1, w2, ..., wk]

Figure 4.10 The perceptron: (a) learning rule and (b) representation as a neural network.
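To make the learning rule concrete, here is a minimal Python sketch of the algorithm in Figure 4.10(a). The function name train_perceptron and the instance layout (attribute values with the constant 1 prepended as the bias input, paired with a Boolean class label) are illustrative assumptions; like the algorithm itself, the loop terminates only if the data is linearly separable:

    def train_perceptron(instances):
        # instances: list of (attributes, in_first_class) pairs, where
        # attributes[0] is the constant 1 acting as the "bias" input.
        weights = [0.0] * len(instances[0][0])   # set all weights to zero
        converged = False
        while not converged:      # until all instances classified correctly
            converged = True
            for attributes, in_first_class in instances:
                weighted_sum = sum(w * a for w, a in zip(weights, attributes))
                if (weighted_sum > 0) != in_first_class:   # misclassified
                    converged = False
                    # Add the instance's attribute values to the weight vector
                    # if it belongs to the first class; otherwise subtract them.
                    sign = 1 if in_first_class else -1
                    weights = [w + sign * a
                               for w, a in zip(weights, attributes)]
        return weights

    # Example with two linearly separable instances (bias input prepended):
    data = [([1, 2.0, 1.0], True), ([1, -1.0, -2.0], False)]
    print(train_perceptron(data))   # converges to [1.0, 2.0, 1.0]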
