
Once we have defined b as in Equation (17.13), we can easily derive a convex
surrogate loss as follows. Assuming that y ∈ V, we have that

\[
\begin{aligned}
\Delta(h_{w}(\bar{x}), y) &= \Delta\big(b(h_{w}(\bar{x})), y\big) \\
&\le \Delta\big(b(h_{w}(\bar{x})), y\big) + \sum_{i=1}^{r} \big(b_i(h_{w}(\bar{x})) - y_i\big)\,\langle w, x_i\rangle \\
&\le \max_{v \in V}\Big[\Delta(v, y) + \sum_{i=1}^{r} (v_i - y_i)\,\langle w, x_i\rangle\Big]. \qquad (17.14)
\end{aligned}
\]

The right-hand side is a convex function with respect to w.
We can now solve the learning problem using SGD as described in Section 17.2.5.
The main computational bottleneck is calculating a subgradient of the loss
function, which is equivalent to finding v that achieves the maximum in
Equation (17.14) (see Claim 14.6).
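In particular (this restatement is ours, based on Claim 14.6, which allows taking
a subgradient of a pointwise maximum from any term attaining it), if v⋆ attains
the maximum in Equation (17.14), then

\[
\sum_{i=1}^{r} \big(v^{\star}_i - y_i\big)\, x_i
\]

is a subgradient of the right-hand side of Equation (17.14) at w, which can then
be used in the SGD update.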
In the following we describe how to find this maximizer efficiently for any
performance measure that can be written as a function of the numbers a, b, c, d
given in Equation (17.11), and for which the set V contains all elements in
{±1}^r for which the values of a, b satisfy some constraints. For example, for
“recall at k” the set V is all vectors for which a + b ≤ k.
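A concrete, purely illustrative rendering of this family in code: the sketch
below computes the counts a, b, c, d from a prediction vector v and the true
labels y, and implements a recall-at-k style Δ together with its constraint
a + b ≤ k. The function names (error_counts, recall_at_k_loss,
recall_at_k_constraint) and the exact formula for Δ are assumptions made here
for illustration, since Equation (17.11) is not reproduced in this excerpt; the
same interface (a Δ of the counts plus a predicate on a, b) is reused in the
search sketch at the end of the section.

    # Illustrative only: error counts and a "recall at k"-style measure.
    # The identification a = true positives, b = false positives,
    # c = false negatives, d = true negatives is an assumption consistent
    # with the definition of Y-bar_{a,b} below.

    def error_counts(v, y):
        """Return (a, b, c, d) for prediction v and labels y, both in {+1, -1}^r."""
        a = sum(1 for vi, yi in zip(v, y) if vi == 1 and yi == 1)    # true positives
        b = sum(1 for vi, yi in zip(v, y) if vi == 1 and yi == -1)   # false positives
        c = sum(1 for vi, yi in zip(v, y) if vi == -1 and yi == 1)   # false negatives
        d = sum(1 for vi, yi in zip(v, y) if vi == -1 and yi == -1)  # true negatives
        return a, b, c, d

    def recall_at_k_loss(a, b, c, d):
        """Hypothetical Delta: one minus recall, i.e. 1 - a / (a + c)."""
        return 0.0 if a + c == 0 else 1.0 - a / (a + c)

    def recall_at_k_constraint(a, b, k=5):
        """Membership test for V: at most k examples may be predicted positive."""
        return a + b <= k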
The idea is as follows. For any a, b ∈ [r], let

\[
\bar{Y}_{a,b} = \big\{ v : \ |\{ i : v_i = 1 \wedge y_i = 1 \}| = a \ \wedge\ |\{ i : v_i = 1 \wedge y_i = -1 \}| = b \big\}.
\]

Any vector v ∈ V falls into Ȳ_{a,b} for some a, b ∈ [r]. Furthermore, if Ȳ_{a,b} ∩ V
is not empty for some a, b ∈ [r], then Ȳ_{a,b} ∩ V = Ȳ_{a,b}. Therefore, we can search
within each Ȳ_{a,b} that has a nonempty intersection with V separately, and then
take the optimal value. The key observation is that once we are searching only
within Ȳ_{a,b}, the value of ∆ is fixed, so we only need to maximize the expression

\[
\max_{v \in \bar{Y}_{a,b}} \ \sum_{i=1}^{r} v_i \,\langle w, x_i\rangle .
\]

Suppose the examples are sorted so that ⟨w, x_1⟩ ≥ ⋯ ≥ ⟨w, x_r⟩. Then, it is
easy to verify that we would like to set v_i to be positive for the smallest indices
i. Doing this, with the constraint on a, b, amounts to setting v_i = 1 for the a
top-ranked positive examples and for the b top-ranked negative examples. This
yields the following procedure.
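The procedure itself is not included in this excerpt. As a substitute, here is a
minimal Python sketch of the search just described, under the assumptions that
Δ is supplied as a function of the counts (a, b, c, d) and that membership in V
is a predicate on (a, b); the names find_maximizer, delta_fn, and in_V are ours,
not the book's.

    import numpy as np

    def find_maximizer(w, X, y, delta_fn, in_V):
        """Return v in V maximizing Delta(v, y) + sum_i (v_i - y_i) * <w, x_i>.

        w        -- weight vector, shape (d,)
        X        -- instance matrix, shape (r, d), with rows x_1, ..., x_r
        y        -- true labels, a vector in {+1, -1}^r
        delta_fn -- Delta written as a function of the counts (a, b, c, d)
        in_V     -- membership test for V, a predicate on (a, b)
        """
        y = np.asarray(y)
        scores = X @ w                                  # <w, x_i> for each i
        pos = np.where(y == 1)[0]                       # positive examples
        neg = np.where(y == -1)[0]                      # negative examples
        # Sort each group by decreasing score, so "top-ranked" means first.
        pos = pos[np.argsort(-scores[pos])]
        neg = neg[np.argsort(-scores[neg])]
        # Prefix sums of the sorted scores: prefix[a] = sum of the a largest.
        pos_prefix = np.concatenate(([0.0], np.cumsum(scores[pos])))
        neg_prefix = np.concatenate(([0.0], np.cumsum(scores[neg])))
        total = scores.sum()
        const = -float(y @ scores)                      # -sum_i y_i <w, x_i>, constant in v

        best_val, best_ab = -np.inf, None
        for a in range(len(pos) + 1):                   # a = number of true positives
            c = len(pos) - a                            # false negatives
            for b in range(len(neg) + 1):               # b = number of false positives
                if not in_V(a, b):
                    continue                            # Y-bar_{a,b} does not intersect V
                d = len(neg) - b                        # true negatives
                # Best v in Y-bar_{a,b}: v_i = 1 on the a top-ranked positives and
                # the b top-ranked negatives, v_i = -1 elsewhere, hence
                # sum_i v_i <w, x_i> = 2 * (pos_prefix[a] + neg_prefix[b]) - total.
                val = (delta_fn(a, b, c, d)
                       + 2.0 * (pos_prefix[a] + neg_prefix[b]) - total + const)
                if val > best_val:
                    best_val, best_ab = val, (a, b)

        a, b = best_ab
        v = -np.ones(len(y))
        v[pos[:a]] = 1.0
        v[neg[:b]] = 1.0
        return v, best_val

With delta_fn and in_V instantiated as in the earlier sketch (for example, the
recall-at-k loss with the constraint a + b ≤ k), the returned v can be plugged
into the subgradient ∑_i (v_i − y_i) x_i to perform the SGD update.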