Bandit Algorithms

(Jeff_L) #1
10.5 Exercises 136

Hint Chooseε 1 ,ε 2 to decrease slowly withnand use the first part of the
theorem.

10.3(Concentration for bounded random variables) LetF= (Ft)tbe a
filtration, (Xt)tbe [0,1]-valued,F-adapted sequence, such thatE[Xt|Ft− 1 ]=μt
for someμ 1 ,...,μn ∈ [0,1] non-random numbers. Define μ =^1 n


∑n
t=1μt,
μˆ=n^1

∑n
t=1Xt. Prove that the conclusion of Lemma 10.3 still holds.
Hint Read note 2 at the end of this chapter. Letg(·,μ) be the cumulant
generating function of theμ-parameter Bernoulli distribution. ForX∼ B(μ),
λ∈R,g(λ,μ) =logE[exp(λX)]. Show thatg(λ,·) is concave. Next, use this and
the tower rule to show thatE[exp(λn(ˆμ−μ))]≤g(λ,μ)n.

The bound of the previous exercise is most useful when allμtare either
all close to 0 or they are all close to 1. When half of the{μt}are close to
zero and the other half close to one, then the bound degrades to Hoeffding’s
bound.

10.4(Exponential families) Lethbe a measure on (R,B(R)) andT:R→R
be Borel measurable. The functionTis called thesufficient statistic. Define
the probability measurePθon (R,B(R)) via its density with respect toh,


pθ(x) =

dPθ
dh(x) = exp(θT(x)−A(θ)),

whereA(θ) is thelog partition functiongiven by


A(θ) = log


R

exp(θT(x))dh(x).

Let Θ = {θ∈R:A(θ) exists}. The set M = {Pθ:θ∈Θ} is called an
exponential family. For more details see the note after the exercise. An
exponential family isregularif Θ is non-empty and open. It isnon-singularif
A′′(θ)>0 for allθ∈Θ.

(a) Prove thatPθis indeed a probability measure.
(b) LetEθdenote expectations with respect toPθ. Show thatA′(θ) =Eθ[T].
(c) Letθ∈Θ andX∼Pθ. Show that for allλwithλ+θ∈θ,
Eθ[exp(λT(X))] = exp(A(λ+θ)−A(θ)).


(d) Givenθ,θ′∈Θ, show that


d(θ,θ′) =Eθ

[


log

(


pθ(X)
pθ′(X)

)]


=A(θ′)−A(θ)−(θ′−θ)A′(θ).

(e)Letθ,θ′∈Θ be such thatA′(θ′)≥A′(θ) andX 1 ,...,Xnbe independent
and identically distributed andˆt=^1 n

∑n
t=1T(Xt). Show that
P


t≥A′(θ′)

)


≤exp (−nd(θ′,θ)).
Free download pdf