Understanding Machine Learning: From Theory to Algorithms

17 Multiclass, Ranking, and Complex Prediction Problems

Multiclass categorization is the problem of classifying instances into one of several possible target classes. That is, we are aiming at learning a predictorh:X →Y, whereYis a finite set of categories. Applications include, for example, catego- rizing documents according to topic (Xis the set of documents andYis the set of possible topics) or determining which object appears in a given image (Xis the set of images andYis the set of possible objects). The centrality of the multiclass learning problem has spurred the development of various approaches for tackling the task. Perhaps the most straightforward approach is a reduction from multiclass classification to binary classification. In Section 17.1 we discuss the most common two reductions as well as the main drawback of the reduction approach. We then turn to describe a family of linear predictors for multiclass problems. Relying on the RLM and SGD frameworks from previous chapters, we describe several practical algorithms for multiclass prediction. In Section 17.3 we show how to use the multiclass machinery for complex prediction problems in whichYcan be extremely large but has some structure on it. This task is often calledstructured output learning. In particular, we demon- strate this approach for the task of recognizing handwritten words, in whichY is the set of all possible strings of some bounded length (hence, the size ofYis exponential in the maximal length of a word). Finally, in Section 17.4 and Section 17.5 we discuss ranking problems in which the learner should order a set of instances according to their “relevance.” A typ- ical application is ordering results of a search engine according to their relevance to the query. We describe several performance measures that are adequate for assessing the performance of ranking predictors and describe how to learn linear predictors for ranking problems efficiently.

17.1 One-versus-All and All-Pairs

The simplest approach to tackle multiclass prediction problems is by reduction to binary classification. Recall that in multiclass prediction we would like to learn a functionh:X →Y. Without loss of generality let us denoteY={ 1 ,...,k}. In the One-versus-All method (a.k.a. One-versus-Rest) we trainkbinary clas-

Understanding Machine Learning,©c2014 by Shai Shalev-Shwartz and Shai Ben-David Published 2014 by Cambridge University Press. Personal use only. Not for distribution. Do not post. Please link tohttp://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning

Understanding Machine Learning: From Theory to Algorithms

17 Multiclass, Ranking, and Complex Prediction Problems

17.1 One-versus-All and All-Pairs

Get our desktop app

Company

Features

Documentation

Resources