3.4 Notes 50
live a very long life! One application of Doob’s optional stopping theorem is a
useful and apriori surprising generalization of Markov’s inequality to nonnegative
supermartingales.
Theorem3.9 (Maximal inequality). Let(Xt)∞t=0be a supermartingale with
Xt≥ 0 almost surely for al lt. Then for anyε > 0 ,
P
(
sup
t∈N
Xt≥ε
)
≤
E[X 0 ]
ε
Proof LetAnbe the event thatsupt≤nXt≥εandτ = (n+ 1)∧min{t≤
n:Xt≥ε}, where the minimum of an empty set is assumed to be infinite so
thatτ =n+ 1 ifXt< εfor all 0≤t≤n. Clearlyτ is a stopping time and
P(τ≤n+ 1) = 1. Then by Theorem 3.8 and elementary calculation,
E[X 0 ]≥E[Xτ]≥E[XτI{τ≤n}]≥E[εI{τ≤n}] =εP(τ≤n) =εP(An),
where the second inequality uses the definition of the stopping time and the
nonnegativity of the supermartingale. Rearranging shows thatP(An)≤E[X 0 ]/ε
for all n ∈ N. Since A 1 ⊆ A 2 ⊆ ...it follows that P(supt∈NXt≥ε) =
P(∪n∈NAn)≤E[X 0 ]/ε.
Markov’s inequality (which we will cover in the next chapter) combined
with the definition of a supermartingale shows thatP(Xn≥ε)≤E[X 0 ]/ε.
In fact, in the above we have effectively applied Markov’s inequality to the
random variableXτ. The maximal inequality is a strict improvement by
replacingXnwith supt∈NXtat no cost whatsoever.
A similar theorem holds for submartingales. You will provide a proof in
Exercise 3.8.
Theorem3.10.Let(Xt)nt=0be a submartingale withXt≥ 0 almost surely for
al lt. Then for anyε > 0 ,
P
(
max
t∈{ 0 , 1 ,...,n}
Xt≥ε
)
≤
E[Xn]
ε
3.4 Notes
1 Some authors include in the definition of a stopping timeτthatP(τ <∞)= 1
and call random times without this propertyMarkov times. We donotadopt
this convention and allow stopping times to be infinite with nonzero probability.
Stopping times are also calledoptional times.
2 There are several notations for probability kernels depending on the application.
The following are commonly seen and equivalent:K(x,A) =K(A|x) =Kx(A).
For example, in statistics a parametric family is often given by{Pθ:θ∈Θ}