Bandit Algorithms

5.6 Exercises

5.14 (Bernstein's inequality) Let $X_1, \ldots, X_n$ be a sequence of independent
random variables with $X_t - \mathbb{E}[X_t] \leq b$ almost surely and
$S_n = \sum_{t=1}^n (X_t - \mathbb{E}[X_t])$ and $V_n = \sum_{t=1}^n \mathbb{V}[X_t]$.

(a) Show that $g(x) = \frac{1}{2} + \frac{x}{3!} + \frac{x^2}{4!} + \cdots = (\exp(x) - 1 - x)/x^2$ is increasing.
(b) Let $X$ be a random variable with $\mathbb{E}[X] = 0$ and $X \leq b$ almost surely. Show
that $\mathbb{E}[\exp(X)] \leq 1 + g(b)\mathbb{V}[X]$.
(c) Prove that $(1+\alpha)\log(1+\alpha) - \alpha \geq \frac{3\alpha^2}{6 + 2\alpha}$ for all $\alpha \geq 0$.
(d) Let $\varepsilon > 0$ and $\alpha = b\varepsilon/V_n$ and prove that

$$
\mathbb{P}\left(\sum_{t=1}^n (X_t - \mathbb{E}[X_t]) \geq \varepsilon\right)
\leq \exp\left(-\frac{V_n}{b^2}\big((1+\alpha)\log(1+\alpha) - \alpha\big)\right)
\leq \exp\left(-\frac{\varepsilon^2}{2V_n\left(1 + \frac{b\varepsilon}{3V_n}\right)}\right). \tag{5.10}
$$


(e) Use the previous result to show that

$$
\mathbb{P}\left(S_n \geq \sqrt{2\sum_{t=1}^n \mathbb{V}[X_t] \log\left(\frac{1}{\delta}\right)} + \frac{2b}{3}\log\left(\frac{1}{\delta}\right)\right) \leq \delta.
$$

(f) What can be said if $X_1, \ldots, X_n$ are Gaussian? Discuss empirically or
theoretically whether or not a dependence on $b$ is avoidable.
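
The elementary claims in parts (a) and (c) can be spot-checked numerically before attempting a proof. A minimal Python sketch (the grids and tolerances are arbitrary choices, not part of the exercise):

```python
import math

def g(x):
    # g(x) = (exp(x) - 1 - x) / x^2, with g(0) = 1/2 by continuity
    return 0.5 if x == 0.0 else (math.exp(x) - 1.0 - x) / (x * x)

# Part (a): g should be increasing; the grid avoids tiny |x|, where
# (exp(x) - 1 - x) suffers catastrophic cancellation in floating point.
xs = [0.1 * k for k in range(-50, 51) if k != 0]
vals = [g(x) for x in xs]
assert all(a < b for a, b in zip(vals, vals[1:]))

# Part (c): (1 + a) log(1 + a) - a >= 3 a^2 / (6 + 2 a) for a >= 0.
for k in range(2001):
    a = 0.01 * k  # grid on [0, 20]
    lhs = (1 + a) * math.log(1 + a) - a
    rhs = 3 * a * a / (6 + 2 * a)
    assert lhs >= rhs - 1e-12

print("both numeric checks passed")
```

Such a check is no substitute for a proof, but it catches transcription errors in the constants before any calculus is done.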

The bound in Eq. (5.10) is called Bernstein's inequality. There are several
generalizations, the most notable of which is the martingale version that
slightly relaxes the independence assumption. Martingale techniques appear
in Chapter 20. Another useful variant (under slightly different conditions)
replaces the actual variance with the empirical variance. This is useful when
the variance is unknown. For more see the papers by Audibert et al. [2007],
Mnih et al. [2008] and Maurer and Pontil [2009].
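
Both Eq. (5.10) and the high-probability form in part (e) can be checked by simulation. A sketch assuming i.i.d. $X_t \sim \mathrm{Uniform}[0,1]$, so $X_t - \mathbb{E}[X_t] \leq b = 1/2$ and $\mathbb{V}[X_t] = 1/12$; the sample size, trial count and $\varepsilon$ are arbitrary choices:

```python
import math
import random

random.seed(0)

n, trials, b, delta = 200, 5000, 0.5, 0.1
Vn = n / 12.0  # sum of the variances of n Uniform[0, 1] variables

def tail_probability(eps):
    # Empirical estimate of P(S_n >= eps) over independent trials
    hits = sum(
        sum(random.random() - 0.5 for _ in range(n)) >= eps
        for _ in range(trials)
    )
    return hits / trials

# Weaker form of Eq. (5.10):
# P(S_n >= eps) <= exp(-eps^2 / (2 Vn (1 + b eps / (3 Vn))))
eps = 10.0
bound = math.exp(-eps**2 / (2 * Vn * (1 + b * eps / (3 * Vn))))
assert tail_probability(eps) <= bound

# Part (e): the stated threshold is exceeded with probability at most delta.
threshold = math.sqrt(2 * Vn * math.log(1 / delta)) + (2 * b / 3) * math.log(1 / delta)
assert tail_probability(threshold) <= delta

print("empirical tails are below the Bernstein bounds")
```

For this distribution the empirical tail is well below the bound, which is expected: Bernstein's inequality must cover the worst case among all distributions satisfying the boundedness and variance constraints.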

5.15 Let $X_1, X_2, \ldots, X_n$ be a sequence of random variables adapted to a filtration
$\mathbb{F} = (\mathcal{F}_t)_t$. Abbreviate $\mathbb{E}_t[\cdot] = \mathbb{E}[\cdot \mid \mathcal{F}_t]$ and $\mu_t = \mathbb{E}_{t-1}[X_t]$. Suppose that $\eta > 0$
satisfies $\eta X_t \leq 1$ almost surely. Prove that

$$
\mathbb{P}\left(\sum_{t=1}^n (X_t - \mu_t) \geq \eta \sum_{t=1}^n \mathbb{E}_{t-1}[X_t^2] + \frac{1}{\eta}\log\left(\frac{1}{\delta}\right)\right) \leq \delta.
$$

Hint Use the Cramér-Chernoff method and the facts that $\exp(x) \leq 1 + x + x^2$
for all $x \leq 1$ and $\exp(x) \geq 1 + x$ for all $x$.
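
As a sanity check, the bound can be simulated in the simplest adapted case, an i.i.d. sequence (which is adapted to its natural filtration). A sketch assuming $X_t \sim \mathrm{Uniform}[-1, 1]$ and $\eta = 1/2$ (arbitrary illustrative choices), so $\mu_t = 0$, $\mathbb{E}_{t-1}[X_t^2] = 1/3$ and $\eta X_t \leq 1/2 \leq 1$ almost surely:

```python
import math
import random

random.seed(1)

n, trials, eta, delta = 100, 5000, 0.5, 0.05
# Deterministic right-hand side for this i.i.d. case:
# eta * sum of conditional second moments + log(1/delta) / eta
threshold = eta * n * (1 / 3) + math.log(1 / delta) / eta

failures = 0
for _ in range(trials):
    s = sum(random.uniform(-1, 1) for _ in range(n))  # sum of (X_t - mu_t)
    if s >= threshold:
        failures += 1

print(f"failure rate {failures / trials:.4f} (target <= {delta})")
assert failures / trials <= delta
```

Note the asymmetry of the result: only $\eta X_t \leq 1$ is required, so the variables may be unbounded below, unlike in Exercise 5.14.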
5.16 Let $X_1, \ldots, X_n$ be independent random variables with $\mathbb{P}(X_t \leq x) \leq x$ for