Understanding Machine Learning: From Theory to Algorithms

424 Measure Concentration

monotonicity of the exponent function and Markov’s inequality, we have that for everyt > 0

P[Z >(1 +δ)p] =P[etZ> et(1+δ)p]≤

E[etZ] e(1+δ)tp

. (B.5)

Next,

E[etZ] =E[et

∑ iZi] =E[

∏

i

etZi]

=

∏

i

E[etZi] by independence

=

∏

i

(

piet+ (1−pi)e^0

)

=

∏

i

(

1 +pi(et−1)

)

≤

∏

i

epi(e

t−1) using 1 +x≤ex

=e

∑ ipi(et−1)

=e(e

t−1)p .

Combining the above with Equation (B.5) and choosingt= log(1 +δ) we obtain

lemmaB.3 LetZ 1 ,...,Zmbe independent Bernoulli variables where for every i,P[Zi= 1] =piandP[Zi= 0] = 1−pi. Letp=

∑m i=1piand letZ=

∑m i=1Zi. Then, for anyδ > 0 ,

P[Z >(1 +δ)p] ≤ e−h(δ)p,

where

h(δ) = (1 +δ) log(1 +δ)−δ.

Using the inequalityh(a)≥a^2 /(2 + 2a/3) we obtain

lemmaB.4 Using the notation of Lemma B.3 we also have

P[Z >(1 +δ)p] ≤ e−p 2+2δ^2 δ/ 3 .

For the other direction, we apply similar calculations:

P[Z <(1−δ)p] =P[−Z >−(1−δ)p] =P[e−tZ> e−t(1−δ)p]≤

E[e−tZ] e−(1−δ)tp

,(B.6)

Understanding Machine Learning: From Theory to Algorithms

. (B.5)

∏

=

∏

=

∏

(

)

=

∏

(

)

≤

∏

,(B.6)

Get our desktop app

Company

Features

Documentation

Resources