Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

straint, and all training instances become support vectors. Conversely, if e is
large enough that the tube can enclose all the data, the error becomes zero, there
is no tradeoff to make, and the algorithm outputs the flattest tube that encloses
the data irrespective of the value of C.

The kernel perceptron


In Section 4.6 we introduced the perceptron algorithm for learning a linear clas-
sifier. It turns out that the kernel trick can also be used to upgrade this algo-
rithm to learn nonlinear decision boundaries. To see this, we first revisit the
linear case. The perceptron algorithm repeatedly iterates through the training
data instance by instance and updates the weight vector every time one of these
instances is misclassified based on the weights learned so far. The weight vector
is updated simply by adding or subtracting the instance’s attribute values to or
from it. This means that the final weight vector is just the sum of the instances
that have been misclassified. The perceptron makes its predictions based on
whether

    \sum_i w_i a_i

is greater or less than zero, where w_i is the weight for the ith attribute and
a_i is the corresponding attribute value of the instance that we wish to classify.
Instead, we could use

    \sum_i \sum_j y(j)\, a'(j)_i\, a_i .

Here, a'(j) is the jth misclassified training instance, a'(j)_i is its ith attribute
value, and y(j) is its class value (either +1 or -1). To implement this we no longer
keep track of an explicit weight vector: we simply store the instances misclassified
so far and use the preceding expression to make a prediction.
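This instance-based formulation can be sketched directly in code. The following is a minimal sketch, not the book's own implementation; the function names and the simple fixed-epoch training loop are illustrative assumptions:

```python
# Dual ("instance-based") perceptron as described above: instead of an
# explicit weight vector we store every misclassified instance with its
# class y in {+1, -1}, and predict from sum_j y(j) * (a'(j) . a).

def dual_predict(mistakes, a):
    """mistakes: list of (instance, label) pairs; a: instance to classify."""
    s = 0.0
    for a_mis, y in mistakes:
        # inner sum over attributes is the dot product a'(j) . a
        s += y * sum(u * v for u, v in zip(a_mis, a))
    return 1 if s > 0 else -1

def dual_perceptron(data, labels, epochs=10):
    """Iterate through the data, storing each instance that is misclassified.
    Appending (a, y) is equivalent to adding y*a to the implicit weight vector."""
    mistakes = []
    for _ in range(epochs):
        for a, y in zip(data, labels):
            if dual_predict(mistakes, a) != y:
                mistakes.append((a, y))
    return mistakes
```

For a linearly separable problem the stored mistakes quickly sum to a separating weight vector, after which no further instances are added.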
It looks like we’ve gained nothing—in fact, the algorithm is much slower
because it iterates through all misclassified training instances every time a pre-
diction is made. However, closer inspection of this formula reveals that it can
be expressed in terms of dot products between instances. First, swap the
summation signs to yield

    \sum_j y(j) \sum_i a'(j)_i\, a_i .

The second sum is just a dot product between two instances and can be written
as

    \sum_j y(j)\, a'(j) \cdot a .
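Because the prediction now involves the training instances only through dot products, the kernel trick applies: each dot product a'(j) · a can be replaced by a kernel function, so the perceptron implicitly operates in the kernel's feature space and can learn a nonlinear boundary. A minimal sketch, assuming a polynomial kernel (the kernel choice, its degree, and all function names are illustrative, not from the book):

```python
# Kernel perceptron sketch: the dual perceptron with the dot product
# replaced by a polynomial kernel (a'(j) . a + 1)^d, an illustrative
# choice corresponding to an implicit map into polynomial feature space.

def poly_kernel(u, v, degree=2):
    return (sum(ui * vi for ui, vi in zip(u, v)) + 1) ** degree

def kernel_predict(mistakes, a, kernel=poly_kernel):
    # sum_j y(j) * k(a'(j), a) replaces sum_j y(j) * (a'(j) . a)
    s = sum(y * kernel(a_mis, a) for a_mis, y in mistakes)
    return 1 if s > 0 else -1

def kernel_perceptron(data, labels, kernel=poly_kernel, epochs=10):
    mistakes = []
    for _ in range(epochs):
        for a, y in zip(data, labels):
            if kernel_predict(mistakes, a, kernel) != y:
                mistakes.append((a, y))
    return mistakes
```

With the degree-2 kernel this version can separate classes that are not linearly separable in the original space, such as one-dimensional points labeled by whether they lie inside an interval, while the training loop itself is unchanged from the linear case.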
222 CHAPTER 6 | IMPLEMENTATIONS: REAL MACHINE LEARNING SCHEMES
