Bandit Algorithms

27.6 Exercises

27.5 (Changing action sets) Provide the necessary corrections to
Algorithm 15 and its analysis to prove the result claimed in Note 4.


Hint: You will need to choose a new exploration distribution in every round;
otherwise everything is more or less the same.
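To make the hint concrete: Algorithm 15 mixes an exploration distribution into the exponential-weights distribution, and with a changing action set $A_t$ this distribution must be recomputed each round. Below is a minimal Python sketch of the classical Frank–Wolfe iteration for an approximately G-optimal design; the function name, tolerances and the finite-action-set representation are our own choices, not the book's.

```python
import numpy as np

def g_optimal_design(actions, iters=1000, tol=1e-3, reg=1e-9):
    """Frank-Wolfe iterations for an (approximately) G-optimal design.

    actions: (k, d) array whose rows are the round's action set A_t.
    Returns weights pi with sum(pi) = 1 such that
    max_a ||a||^2_{V(pi)^{-1}} is close to d (Kiefer-Wolfowitz).
    """
    k, d = actions.shape
    pi = np.full(k, 1.0 / k)              # start from the uniform design
    for _ in range(iters):
        # V(pi) = sum_i pi_i a_i a_i^T, regularised for invertibility
        V = actions.T @ (pi[:, None] * actions) + reg * np.eye(d)
        Vinv = np.linalg.inv(V)
        # g(a) = ||a||^2_{V^{-1}} for every action
        g = np.einsum('ij,jk,ik->i', actions, Vinv, actions)
        j = int(np.argmax(g))
        if g[j] <= d * (1 + tol):         # Kiefer-Wolfowitz optimality check
            break
        # exact line-search step size for the D-optimal design objective
        eta = (g[j] / d - 1.0) / (g[j] - 1.0)
        pi = (1 - eta) * pi
        pi[j] += eta
    return pi

# Each round t, with a fresh action set A_t, the exploration
# distribution would be recomputed before mixing, e.g.
#   P_t = (1 - gamma) * Q_t + gamma * g_optimal_design(A_t)
```

Frank–Wolfe is one standard way to compute the design promised by the Kiefer–Wolfowitz theorem (Chapter 21); any routine guaranteeing $\max_a \|a\|^2_{V(\pi)^{-1}} = O(d)$ on each round's action set would serve the same purpose.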
27.6 (Covering numbers for convex sets) For $K \subset \mathbb{R}^d$ let
$\|x\|_K = \sup_{y \in K} |\langle x, y \rangle|$. Let $A \subset \mathbb{R}^d$
and $L = \{y : \|y\|_A \le 1\}$. Let $N(A, \varepsilon)$ be the size of the
smallest subset $C \subseteq A$ such that $\min_{x' \in C} \|x - x'\|_L \le \varepsilon$
for all $x \in A$. Show the following:

(a) When $A = \{x \in \mathbb{R}^d : \|x\|_{V^{-1}} \le 1\}$ for positive
definite $V$, we have $N(A, \varepsilon) \le (3/\varepsilon)^d$.
(b) When $A$ is convex, bounded and $\operatorname{span}(A) = \mathbb{R}^d$,
we have $N(A, \varepsilon) \le (3d/\varepsilon)^d$.
(c) For any bounded $A \subset \mathbb{R}^d$ we have $N(A, \varepsilon) \le (6d/\varepsilon)^d$.

Hint: For the first part, find a linear map from $A$ to the Euclidean ball and
use the fact that the Euclidean ball can be covered with a set of size
$(3/\varepsilon)^d$. For the second part, use the fact that for any symmetric,
convex and compact set $K$ there exists an ellipsoid $E = \{x : \|x\|_V \le 1\}$
such that $E \subseteq K \subseteq dE$ (a form of John's theorem).
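The covering bound for the Euclidean ball quoted in the hint follows from a standard volume comparison; a brief version of the argument (standard material, not taken from the text) is sketched below.

```latex
% Claim: the Euclidean unit ball B has an \varepsilon-cover of size
% at most (3/\varepsilon)^d for \varepsilon \le 1.
Let $C \subseteq B$ be a maximal set whose points are pairwise more than
$\varepsilon$ apart. By maximality, every $x \in B$ is within $\varepsilon$
of some point of $C$, so $C$ is an $\varepsilon$-cover. The balls
$\{x + (\varepsilon/2)B : x \in C\}$ are pairwise disjoint and contained
in $(1 + \varepsilon/2)B$. Comparing volumes,
\[
  |C|\,(\varepsilon/2)^d \operatorname{vol}(B)
    \le (1 + \varepsilon/2)^d \operatorname{vol}(B),
  \qquad\text{so}\qquad
  |C| \le \Bigl(1 + \frac{2}{\varepsilon}\Bigr)^d
       \le \Bigl(\frac{3}{\varepsilon}\Bigr)^d,
\]
where the last inequality uses $\varepsilon \le 1$.
```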
27.7 (Low rank action sets) In the definition of the algorithm and the proof
of Theorem 27.1 we assumed that $A$ spans $\mathbb{R}^d$ and that it has
positive Lebesgue measure. Show that these assumptions may be relaxed by
carefully adapting the algorithm and analysis.
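One standard device for such a relaxation (a sketch under our own naming, not the book's construction) is to re-express the actions in an orthonormal basis of $\operatorname{span}(A)$, run the algorithm in the lower-dimensional coordinates, and map the chosen action back:

```python
import numpy as np

def reduce_to_span(actions, tol=1e-10):
    """Map actions in R^d into R^m, where m = dim(span(A)).

    actions: (k, d) array whose rows span an m-dimensional subspace.
    Returns (reduced_actions, basis) with reduced_actions of shape (k, m)
    and basis of shape (d, m). basis @ reduced_actions[i] recovers the
    original action, and inner products are preserved:
    <a, y> = <reduced a, basis.T @ y>.
    """
    # Rows of Vt corresponding to nonzero singular values form an
    # orthonormal basis of the row space span(A).
    _, s, Vt = np.linalg.svd(actions, full_matrices=False)
    m = int(np.sum(s > tol))
    basis = Vt[:m].T                  # (d, m), orthonormal columns
    return actions @ basis, basis
```

Since inner products with loss vectors are preserved, the reduction does not change the regret, and the spanning assumption holds by construction in the reduced space.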

27.8 (Necessity of exploration) We saw in Chapter 11 that the exponential
weights algorithm achieves near-optimal regret without mixing in additional
exploration. Show that exploration is crucial here. More precisely, construct
a finite action set $A$ and a reward sequence $y_t \in L$ such that Algorithm 15
with $\gamma = 0$ suffers regret that is very large relative to the optimal
choice of $\gamma$.
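For intuition (an illustration only, not a solution to the exercise): the loss estimates used with Algorithm 15 have the least-squares form $\hat{y}_t = Q_t^{-1} A_t \langle A_t, y_t \rangle$ with $Q_t = \mathbb{E}_{a \sim P_t}[a a^\top]$, and their magnitude is controlled by $\|a\|^2_{Q_t^{-1}}$. The toy Python snippet below, with made-up actions, shows this quantity exploding as the sampling distribution concentrates, which is exactly what unmixed exponential weights allows.

```python
import numpy as np

# Two nearly parallel actions in R^2: once the sampling distribution
# concentrates on a0, the matrix Q becomes nearly singular and the
# loss-estimate variance for a1 blows up.
a0 = np.array([1.0, 0.0])
a1 = np.array([1.0, 0.01])
for p1 in [1e-1, 1e-3, 1e-6]:       # probability mass left on a1
    P = np.array([1.0 - p1, p1])
    Q = P[0] * np.outer(a0, a0) + P[1] * np.outer(a1, a1)
    norm_sq = a1 @ np.linalg.solve(Q, a1)   # ||a1||^2_{Q^{-1}}
    print(f"p1 = {p1:.0e}:  ||a1||^2_(Q^-1) = {norm_sq:.3e}")
```

Here $\|a_1\|^2_{Q^{-1}}$ grows like $1/p_1$, so the estimates for a rarely played action become wildly variable once exponential weights has nearly committed.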


27.9 (Continuous exponential weights) Complete the missing steps in the
proof of Theorem 27.3.

27.10 (Low rank action sets (ii)) In the definition of the algorithm and
the proof of Theorem 27.3 we assumed that $A$ spans $\mathbb{R}^d$ and that it
has positive Lebesgue measure. Show that these assumptions may be relaxed by
carefully adapting the algorithm and analysis.

27.11 (Volume bounds) Prove Proposition 27.4.