426 Measure Concentration
Therefore,
P[X ̄≥]≤e−λ
∏
i
e
λ^2 (b−a)^2
8 m^2 =e−λ+λ
(^2) (b−a) 2
8 m.
Settingλ= 4m/(b−a)^2 we obtain
P[X ̄≥]≤e−
(^2 b−ma^2 )^2
.
Applying the same arguments on the variable−X ̄we obtain thatP[X ̄≤−]≤
e−
(^2 bm−a^2 )^2
. The theorem follows by applying the union bound on the two cases.
lemmaB.7 (Hoeffding’s lemma) LetXbe a random variable that takes values
in the interval[a,b]and such thatE[X] = 0. Then, for everyλ > 0 ,
E[eλX]≤e
λ^2 (b 8 −a)^2
.
Proof Sincef(x) =eλxis a convex function, we have that for everyα∈(0,1),
andx∈[a,b],
f(x)≤αf(a) + (1−α)f(b).
Settingα=bb−−xa∈[0,1] yields
eλx≤b−x
b−a
eλa+x−a
b−a
eλb.
Taking the expectation, we obtain that
E[eλX]≤
b−E[X]
b−a
eλa+
E[x]−a
b−a
eλb =
b
b−a
eλa−
a
b−a
eλb,
where we used the fact thatE[X] = 0. Denoteh=λ(b−a),p= b−−aa, and
L(h) =−hp+ log(1−p+peh). Then, the expression on the right-hand side of
the above can be rewritten aseL(h). Therefore, to conclude our proof it suffices
to show thatL(h)≤ h
2
8. This follows from Taylor’s theorem using the facts:
L(0) =L′(0) = 0 andL′′(h)≤ 1 /4 for allh.
B.5 Bennet’s and Bernstein’s Inequalities
Bennet’s and Bernsein’s inequalities are similar to Chernoff’s bounds, but they
hold for any sequence of independent random variables. We state the inequalities
without proof, which can be found, for example, in Cesa-Bianchi & Lugosi (2006).
lemmaB.8 (Bennet’s inequality) LetZ 1 ,...,Zmbe independent random vari-
ables with zero mean, and assume thatZi≤ 1 with probability 1. Let
σ^2 ≥
1
m
∑m
i=1
E[Zi^2 ].