Bandit Algorithms

(Jeff_L) #1
32.1 Click models 364

k > m. This model is richer than the document-based model, which is recovered
by choosingχ(k) =I{k≤m}. The number of parameters in the position-based
models ism+`.

Cascade model
The position-based model is not suitable for applications where clicking on an
item takes the user to a different page. In thecascade modelit is assumed that
the learner scans the shortlisted items in order and only clicks on the first item
they find attractive. Defineχ:A×[`]→[0,1] by


χ(a,k) =








1 ifk= 1
0 ifk > m
∏k− 1
k′=1(1−α(a(k′))) otherwise,

which is the probability that the user has not clicked on the firstk−1 items.
Then the cascade model assumes that


v(a,k) =α(a(k))χ(a,k). (32.1)

The first term in the factorization is the attractiveness function, which measures
the probability that the user is attracted to theith item. The second term can be
interpreted as the probability that the user examines that item. This interpretation
is also valid in the position-based model. It is important to emphasize thatv(a,k)
is the probability of clicking on thekth position when taking actiona∈ A.
This does not mean thatCt 1 ,...,Ct`are independent. The assumptions only
restricts the marginal distribution of eachCti, which is sufficient for our purposes.
Nevertheless, in the cascade model it would be standard to assume thatCtAt(k)= 0
if there exists ank′< ksuch thatCtAt(k′)= 1 and otherwise


P(CtAt(k)= 1|At,CtAt(1)= 0,...,CtAt(k−1)= 0) =I{k≤m}α(At(k)).

Like the document-based model, the cascade model has`parameters.

Generic model
We now introduce a model that generalizes the last three. Previous models
essentially assumed that the probability of a click factorizes into an attractiveness
probability and an examination probability. We deviate from this norm by making
assumptions directly on the functionv. Givenα: [`]→[0,1], an actionais
calledα-optimal if the shortlisted items are themmost attractive sorted by
attractiveness:α(a(k)) = maxk′≥kα(a(k′)) for allk∈[m].


Assumption32.1.There exists an attractiveness functionα: []→[0,1] such that the following four conditions are satisfied. Leta∈Aandi,j,k∈[] be such
thatα(i)≥α(j) and letσbe the permutation that exchangesiandj.


(a)v(a,k) = 0 for allk > m.
(b)


∑m
k=1v(a

∗,k) = maxa∈A∑m
k=1v(a,k) for allα-optimal actionsa

∗.

Free download pdf