Understanding Machine Learning: From Theory to Algorithms

(Jeff_L) #1

238 Multiclass, Ranking, and Complex Prediction Problems


while the feature function Ψi,j, 2 can be written in terms of

φi,j, 2 (x,yt,yt− 1 ) = (^1) [yt=i] (^1) [yt− 1 =j].
Therefore, the prediction can be written as
hw(x) = argmax
y∈Y
∑r
t=1
〈w,φ(x,yt,yt− 1 )〉. (17.4)
In the following we derive a dynamic programming procedure that solves every
problem of the form given in Equation (17.4). The procedure will maintain a
matrixM∈Rq,rsuch that
Ms,τ= max
(y 1 ,...,yτ):yτ=s
∑τ
t=1
〈w,φ(x,yt,yt− 1 )〉.
Clearly, the maximum of〈w,Ψ(x,y)〉equals maxsMs,r. Furthermore, we can
calculateMin a recursive manner:
Ms,τ= maxs′ (Ms′,τ− 1 +〈w,φ(x,s,s′)〉). (17.5)
This yields the following procedure:
Dynamic Programming for Calculatinghw(x)as Given
in Equation (17.4)
input:a matrixx∈Rn,rand a vectorw
initialize:
foreachs∈[q]
Ms, 1 =〈w,φ(x,s,−1)〉
forτ= 2,...,r
foreachs∈[q]
setMs,τas in Equation (17.5)
setIs,τto be thes′that maximizes Equation (17.5)
setyt= argmaxsMs,r
forτ=r, r− 1 ,..., 2
setyτ− 1 =Iyτ,τ
output: y= (y 1 ,...,yr)


17.4 Ranking


Ranking is the problem of ordering a set of instances according to their “rele-
vance.” A typical application is ordering results of a search engine according to
their relevance to the query. Another example is a system that monitors elec-
tronic transactions and should alert for possible fraudulent transactions. Such a
system should order transactions according to how suspicious they are.
Formally, letX∗=

⋃∞

n=1X

nbe the set of all sequences of instances from
Free download pdf