27.3 Continuous exponential weights

play the biggest role in the feasibility of sampling from a measure are (a) what is
the form of the measure or its density and (b) how is the convex set represented.
As it happens, the measure defined in the last display is log-concave, which
means that the logarithm of the density, with respect to the Lebesgue measure
on $\mathcal{A}$, is a concave function.


Theorem 27.2. Let $p(a) \propto \mathbb{I}_{\mathcal{A}}(a)\exp(-f(a))$ be a density with respect to the
Lebesgue measure on $\mathcal{A}$ such that $f : \mathcal{A} \to \mathbb{R}$ is a convex function. Then there
exists a polynomial-time algorithm for sampling from $p$ provided one can compute
the following efficiently:

1 (first-order information): $\nabla f(a)$ where $a \in \mathcal{A}$.
2 (Euclidean projections): $\operatorname{argmin}_{x \in \mathcal{A}} \|x - y\|_2$ where $y \in \mathbb{R}^d$.
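Theorem 27.2 does not single out a particular sampler, but one concrete family that uses exactly these two oracles is projected Langevin Monte Carlo, which alternates a gradient-plus-noise step with a Euclidean projection back onto $\mathcal{A}$. The following sketch is only illustrative: the helper names and the toy instance are hypothetical, and step-size, burn-in and mixing-time considerations are ignored.

```python
import numpy as np

def projected_langevin_sample(grad_f, project, d, n_steps=10_000, step=1e-3, rng=None):
    """Run one Markov chain targeting p(a) ∝ exp(-f(a)) on a convex set A.

    grad_f  : callable returning the gradient ∇f(a) (first-order oracle).
    project : callable returning argmin_{x in A} ||x - y||_2 (projection oracle).
    Returns the final iterate; in practice one would discard a burn-in
    period and possibly average over several chains.
    """
    rng = np.random.default_rng(rng)
    a = project(np.zeros(d))                  # start from any feasible point
    for _ in range(n_steps):
        noise = rng.standard_normal(d)
        a = a - step * grad_f(a) + np.sqrt(2 * step) * noise   # Langevin step
        a = project(a)                                          # project back onto A
    return a

# Toy instance: f(a) = ||a||^2 / 2 on the Euclidean unit ball.
grad_f = lambda a: a
project = lambda y: y if np.linalg.norm(y) <= 1 else y / np.linalg.norm(y)
sample = projected_langevin_sample(grad_f, project, d=3)
```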
The probability distribution defined by Eq. (27.4) satisfies the first condition.
Efficiently computing a projection onto a convex set is a more delicate issue. A
general criterion that makes this efficient is access to a separation oracle, which
is a computational procedure $\varphi$ that accepts a point $x \in \mathbb{R}^d$ as input and responds
$\varphi(x) = \text{true}$ if $x \in \mathcal{A}$ and otherwise $\varphi(x) = u$ where $\langle y, u\rangle > \langle x, u\rangle$ for all $y \in \mathcal{A}$
(see Fig. 27.1).


Figure 27.1 Separation oracle returns the normal of a hyperplane that separates $x$ from
$\mathcal{A}$ whenever $x \notin \mathcal{A}$. When $x \in \mathcal{A}$, the separation oracle returns true.
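For concreteness, here is a separation oracle for a Euclidean ball, following the convention above that the returned direction $u$ satisfies $\langle y, u\rangle > \langle x, u\rangle$ for all $y \in \mathcal{A}$. The function name and the choice of set are illustrative assumptions, not part of the text.

```python
import numpy as np

def separation_oracle_ball(x, center, radius):
    """Separation oracle for the Euclidean ball A = {a : ||a - center|| <= radius}.

    Returns True when x lies in A; otherwise returns a vector u with
    <y, u> > <x, u> for every y in A (the convention used in the text).
    """
    gap = x - center
    if np.linalg.norm(gap) <= radius:
        return True
    return -gap / np.linalg.norm(gap)   # unit normal pointing from x towards A

# A point outside the unit ball: the oracle returns a separating direction.
print(separation_oracle_ball(np.array([2.0, 0.0]), np.zeros(2), 1.0))
```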

Theorem 27.3. Assume that $\mathcal{A}$ is compact, convex and has volume $\mathrm{vol}(\mathcal{A}) = \int_{\mathcal{A}} da > 0$. Then an appropriately tuned instantiation of the continuous
exponential weights algorithm with Kiefer–Wolfowitz exploration has regret bounded
by
$$R_n \le 2\sqrt{3dn\left(1 + d\log(2n)\right)}\,.$$
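To get a feel for the scale of this guarantee, the snippet below simply evaluates the bound exactly as displayed above for a few values of $d$ and $n$; it is a plain numeric sanity check, not part of the analysis.

```python
import math

def regret_bound(d, n):
    """Upper bound from Theorem 27.3: R_n <= 2 * sqrt(3 d n (1 + d log(2n)))."""
    return 2 * math.sqrt(3 * d * n * (1 + d * math.log(2 * n)))

# The bound grows roughly like d * sqrt(n log n):
for d, n in [(2, 10_000), (10, 10_000), (10, 1_000_000)]:
    print(d, n, round(regret_bound(d, n)))
```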
The proof of Theorem 27.3 relies on the following upper bound on the
logarithmic Laplace transform of the uniform measure on a convex set $K$, which
we leave as an exercise (Exercise 27.11).


Proposition 27.4. Let $K \subset \mathbb{R}^d$ be a compact convex set with $\mathrm{vol}(K) > 0$ and
$u \in \mathbb{R}^d$ and $x^* = \operatorname{argmin}_{x \in K} \langle x, u\rangle$. Then
$$\log\left(\frac{\mathrm{vol}(K)}{\int_K \exp\left(-\langle x - x^*, u\rangle\right) dx}\right) \le 1 + \max\left(0,\; d\log\left(\sup_{x,y \in K} \langle x - y, u\rangle\right)\right)\,.$$
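Although the proof is left as Exercise 27.11, the inequality is easy to sanity-check numerically. The snippet below does a Monte Carlo check for the hypercube $K = [0,1]^d$ with a nonnegative $u$, in which case $x^* = 0$ and $\sup_{x,y \in K}\langle x - y, u\rangle = \sum_i u_i$; the function name and the specific choice of $K$ and $u$ are illustrative assumptions.

```python
import numpy as np

def lhs_rhs(u, n_samples=200_000, seed=0):
    """Monte Carlo check of Proposition 27.4 for K = [0, 1]^d (so vol(K) = 1).

    For u with nonnegative entries, x* = argmin_{x in K} <x, u> is the origin,
    and sup_{x, y in K} <x - y, u> equals sum(u).
    """
    rng = np.random.default_rng(seed)
    d = len(u)
    x = rng.random((n_samples, d))                 # uniform samples from K
    integral = np.mean(np.exp(-x @ u))             # ≈ ∫_K exp(-<x - x*, u>) dx
    lhs = np.log(1.0 / integral)                   # log(vol(K) / integral)
    rhs = 1 + max(0.0, d * np.log(np.sum(u)))
    return lhs, rhs

print(lhs_rhs(np.array([3.0, 5.0])))   # the first value should not exceed the second
```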
