Bandit Algorithms

5.5 Bibliographical remarks 78

An easy calculation shows thatMY(λ) = (1− 2 λ)−n/^2 forλ∈[0, 1 /2) and MY(λ) is undefined forλ≥ 1 /2. By the Cramer-Chernoff method we have

P(Y≥n+ε)≤ inf λ∈[0, 1 /2)

Mλ(Y) exp(−λ(n+ε))

≤ inf λ∈[0, 1 /2)

(

1

1 − 2 λ

)n 2 exp(−λ(n+ε))

Choosingλ=^12 −2(nn+ε)leads toP(Y≥n+ε)≤

(

1 +nε

)n 2 exp

(

−ε 2

)

, which turns out to be about the best you can do [Laurent and Massart, 2000].

5.5 Bibliographical remarks

We return to concentration of measure many times, but note here that it is an interesting (and still active) topic of research. What we have seen is only the tip of the iceberg. Readers who want to learn more about this exciting field might enjoy the book by Boucheron et al. [2013]. For matrix versions of many standard results there is a recent book by Tropp [2015]. The survey of McDiarmid [1998] has many of the classic results. There is a useful type of concentration bound that are ‘self-normalized’ by the variance. A nice book on this is by de la Pe ̃na et al. [2008]. Another tool that is occasionally useful for deriving concentration bounds in more unusual setups is calledempirical process theory. There are several references for this, including those by van de Geer [2000] or Dudley [2014].

5.6 Exercises

There are too many candidate exercises to list. We heartily recommendallthe exercises in Chapter 2 of the book by Boucheron et al. [2013]. 5.1(Variance of average) LetX 1 ,X 2 ,...,Xnbe a sequence of independent and identically distributed random variables with meanμand varianceσ^2 <∞. Let ˆμ=n^1

∑n t=1Xtand show thatV[ ˆμ] =E[(ˆμ−μ)^2 ] =σ^2 /n. 5.2(Markov’s inequality) Prove Markov’s inequality (Lemma 5.1).

5.3 Compare the Gaussian tail probability bound on the right-hand side of(5.4) and the one on(5.2). What values ofεmake one smaller than the other? Discuss your findings.

5.4 LetXbe a random variable onRwith density with respect to the Lebesgue measure ofp(x) =|x|exp(−x^2 /2)/2. Show that:

(a)P(|X|≥ε) = exp(−ε^2 /2). (b) ThatXis not

√

(2−ε)-subgaussian for anyε >0.

Bandit Algorithms

(

1

(

(

)

5.5 Bibliographical remarks

5.6 Exercises

√

Get our desktop app

Company

Features

Documentation

Resources