Bandit Algorithms

(Jeff_L) #1
3.4 Notes 50

live a very long life! One application of Doob’s optional stopping theorem is a
useful and apriori surprising generalization of Markov’s inequality to nonnegative
supermartingales.

Theorem3.9 (Maximal inequality). Let(Xt)∞t=0be a supermartingale with
Xt≥ 0 almost surely for al lt. Then for anyε > 0 ,


P

(


sup
t∈N

Xt≥ε

)



E[X 0 ]


ε
Proof LetAnbe the event thatsupt≤nXt≥εandτ = (n+ 1)∧min{t≤
n:Xt≥ε}, where the minimum of an empty set is assumed to be infinite so
thatτ =n+ 1 ifXt< εfor all 0≤t≤n. Clearlyτ is a stopping time and
P(τ≤n+ 1) = 1. Then by Theorem 3.8 and elementary calculation,

E[X 0 ]≥E[Xτ]≥E[XτI{τ≤n}]≥E[εI{τ≤n}] =εP(τ≤n) =εP(An),

where the second inequality uses the definition of the stopping time and the
nonnegativity of the supermartingale. Rearranging shows thatP(An)≤E[X 0 ]/ε
for all n ∈ N. Since A 1 ⊆ A 2 ⊆ ...it follows that P(supt∈NXt≥ε) =
P(∪n∈NAn)≤E[X 0 ]/ε.


Markov’s inequality (which we will cover in the next chapter) combined
with the definition of a supermartingale shows thatP(Xn≥ε)≤E[X 0 ]/ε.
In fact, in the above we have effectively applied Markov’s inequality to the
random variableXτ. The maximal inequality is a strict improvement by
replacingXnwith supt∈NXtat no cost whatsoever.

A similar theorem holds for submartingales. You will provide a proof in
Exercise 3.8.

Theorem3.10.Let(Xt)nt=0be a submartingale withXt≥ 0 almost surely for
al lt. Then for anyε > 0 ,


P

(


max
t∈{ 0 , 1 ,...,n}

Xt≥ε

)



E[Xn]
ε

3.4 Notes


1 Some authors include in the definition of a stopping timeτthatP(τ <∞)= 1
and call random times without this propertyMarkov times. We donotadopt
this convention and allow stopping times to be infinite with nonzero probability.
Stopping times are also calledoptional times.
2 There are several notations for probability kernels depending on the application.
The following are commonly seen and equivalent:K(x,A) =K(A|x) =Kx(A).
For example, in statistics a parametric family is often given by{Pθ:θ∈Θ}
Free download pdf