Understanding Machine Learning: From Theory to Algorithms

B.7 Concentration of $\chi^2$ Variables

Finally, for all $\epsilon \in (0,3)$,
$$P\big[(1-\epsilon)k \le Z \le (1+\epsilon)k\big] \;\ge\; 1 - 2e^{-\epsilon^2 k/6}.$$
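As a quick numerical sanity check (not part of the book's text), the exact $\chi^2_k$ interval probability can be compared against the stated lower bound. A minimal sketch, assuming SciPy is available; the values $\epsilon = 0.5$ and $k = 100$ are arbitrary choices:

```python
# Compare the exact chi^2 interval probability with the lemma's
# lower bound 1 - 2*exp(-eps^2 * k / 6).
# eps and k are arbitrary test values, not prescribed by the lemma.
import math

from scipy.stats import chi2

eps, k = 0.5, 100
exact = chi2.cdf((1 + eps) * k, df=k) - chi2.cdf((1 - eps) * k, df=k)
bound = 1 - 2 * math.exp(-eps**2 * k / 6)
print(f"exact = {exact:.6f}, bound = {bound:.6f}")
```

For these values the exact probability comfortably exceeds the bound; note that the guarantee is only informative when $1 - 2e^{-\epsilon^2 k/6} > 0$, i.e., when $k$ is large relative to $1/\epsilon^2$.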
Proof  Let us write $Z = \sum_{i=1}^{k} X_i^2$, where $X_i \sim N(0,1)$. To prove both bounds we use Chernoff's bounding method. For the first inequality, we first bound $E[e^{-\lambda X_1^2}]$, where $\lambda > 0$ will be specified later. Since $e^{-a} \le 1 - a + \frac{a^2}{2}$ for all $a \ge 0$, we have that
$$E\big[e^{-\lambda X_1^2}\big] \;\le\; 1 - \lambda E[X_1^2] + \frac{\lambda^2}{2} E[X_1^4].$$
Using the well-known equalities $E[X_1^2] = 1$ and $E[X_1^4] = 3$, and the fact that $1 - a \le e^{-a}$, we obtain that
$$E\big[e^{-\lambda X_1^2}\big] \;\le\; 1 - \lambda + \tfrac{3}{2}\lambda^2 \;\le\; e^{-\lambda + \frac{3}{2}\lambda^2}.$$
Now, applying Chernoff's bounding method we get that
$$P\big[-Z \ge -(1-\epsilon)k\big] = P\big[e^{-\lambda Z} \ge e^{-(1-\epsilon)k\lambda}\big] \le e^{(1-\epsilon)k\lambda}\, E\big[e^{-\lambda Z}\big] = e^{(1-\epsilon)k\lambda} \Big(E\big[e^{-\lambda X_1^2}\big]\Big)^{k} \le e^{(1-\epsilon)k\lambda}\, e^{-\lambda k + \frac{3}{2}\lambda^2 k} = e^{-\epsilon k\lambda + \frac{3}{2}k\lambda^2}.$$
Choosing $\lambda = \epsilon/3$, we obtain the first inequality stated in the lemma.
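The closing algebra, that substituting $\lambda = \epsilon/3$ into the exponent $-\epsilon k\lambda + \tfrac{3}{2}k\lambda^2$ yields exactly $-\epsilon^2 k/6$, can be verified symbolically. A minimal sketch, assuming SymPy is available:

```python
# Check that lambda = eps/3 turns the Chernoff exponent
# -eps*k*lam + (3/2)*k*lam^2 into -eps^2 * k / 6.
import sympy as sp

eps, k = sp.symbols("epsilon k", positive=True)
lam = eps / 3
exponent = -eps * k * lam + sp.Rational(3, 2) * k * lam**2
print(sp.simplify(exponent))
```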
For the second inequality, we use a known closed-form expression for the moment generating function of a $\chi^2_k$-distributed random variable:
$$\forall \lambda < \tfrac{1}{2}, \qquad E\big[e^{\lambda Z}\big] = (1 - 2\lambda)^{-k/2}. \tag{B.7}$$

On the basis of this equation and using Chernoff's bounding method we have
$$P\big[Z \ge (1+\epsilon)k\big] = P\big[e^{\lambda Z} \ge e^{(1+\epsilon)k\lambda}\big] \le e^{-(1+\epsilon)k\lambda}\, E\big[e^{\lambda Z}\big] = e^{-(1+\epsilon)k\lambda} (1 - 2\lambda)^{-k/2} \le e^{-(1+\epsilon)k\lambda}\, e^{k\lambda} = e^{-\epsilon k\lambda},$$

where the last inequality follows because $1 - a \le e^{-a}$. Setting $\lambda = \epsilon/6$ (which is in $(0, 1/2)$ by our assumption on $\epsilon$), we obtain the second inequality stated in the lemma.
Finally, the last inequality follows from the first two inequalities and the union bound.
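Spelled out, the union bound step combines the two one-sided tails just established:
$$P\big[Z < (1-\epsilon)k \ \text{or}\ Z > (1+\epsilon)k\big] \;\le\; P\big[Z \le (1-\epsilon)k\big] + P\big[Z \ge (1+\epsilon)k\big] \;\le\; 2e^{-\epsilon^2 k/6},$$
and taking the complement gives $P[(1-\epsilon)k \le Z \le (1+\epsilon)k] \ge 1 - 2e^{-\epsilon^2 k/6}$.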
