Bandit Algorithms

(Jeff_L) #1
33.6 Exercises 392

(a) Showα∗is continuous atν.
(b) Prove thatE[τν(ε)]<∞for allε >0.
(c) Prove thatE[τα(ε)]<∞for allε >0.
(d) Prove thatE[τT(ε)]<∞for allε >0.
(e) Prove that limδ→ 0 E[τ]/log(1/δ)≤c∗(ν).


Hint For (a) you may use the fact that for continuousf:X×Y→RwithX
andYtopological spaces withYcompact it holds thatg(x) =argmaxyf(x,y)
is continuous with respect to the Haussdorf metric. A solution is available or you
can read the original article [Garivier and Kaufmann, 2016].
33.8(Complexity measure comparison) Prove the following:

(a)LetL=^12 +


∑k
i=2

1
iand show thatH^2 (μ)≤H^1 (μ)≤LH^2 (μ). Combine this
with the fact thatL≤1 + log(k) to prove that Eq. (33.8) holds.
(b)Findμandμ′such thatH 2 (μ) =H 1 (μ) andH 1 (μ′) =LH 2 (μ′). Conclude
the inequalities in Eq. (33.8) are tight.


33.9(Analysis of sequential halving) The purpose of this exercise is to
prove Theorem 33.10. Assume without loss of generality thatμ=μ(ν) satisfies
μ 1 ≥μ 2 ≥...≥μk. Given a setA⊂[k] let

TopM(A,m) =



i∈[k] :


j≤i

I{j∈A}≤m




be the topmarms inA. To make life easier you may also assume thatkis a
power of two so that|A`|=k 21 −`andT`=n 2 `−^1 /log 2 (k).

(a) Prove that|AL+1|= 1.
(b) Letibe a suboptimal arm inAand suppose that 1∈A. Show that


P

(


μˆ` 1 ≤μˆ`i

∣∣


i∈A`, 1 ∈A`

)


≤exp

(



T`∆^2 i
4

)


.


(c)LetA′`=A`\TopM(A`,d|A`|/ 4 e) be the bottom three quarters of the arms
in round`. Show that if the optimal arm is eliminated after the`th phase,
then

N`=


i∈A′`

I


{


μˆ`i≥μˆ` 1

}



1


3 |A



`|.

(d) Leti= minA′and show that


E[N`|A`]≤|A′`|max
i∈A′`

exp

(


−∆


(^2) in 2 `− 1
4 log 2 (k)


)


≤|A′`|exp

(



n∆^2 i`
16 i`log 2 (k)

)


.


(e) Combine the previous two parts with Markov’s inequality to show that

P(1∈A/ `+1| 1 ∈A`)≤3 exp

(



T∆^2 i`
16 log 2 (k)i`

)


.

Free download pdf