Appendix B Measure Concentration
Let Z_1, ..., Z_m be an i.i.d. sequence of random variables and let μ be their mean. The strong law of large numbers states that when m tends to infinity, the empirical average, (1/m) ∑_{i=1}^{m} Z_i, converges to the expected value μ, with probability 1. Measure concentration inequalities quantify the deviation of the empirical average from the expectation when m is finite.
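As a quick numerical illustration (not part of the text), the following sketch draws i.i.d. Bernoulli samples and shows the empirical average approaching the mean μ as m grows; the choice of distribution and of μ = 0.3 is purely illustrative.

```python
import random

random.seed(0)

# Illustrative sketch: i.i.d. Bernoulli(mu) samples; the empirical
# average (1/m) * sum(Z_i) should approach mu as m grows.
mu = 0.3

def empirical_average(m):
    samples = [1 if random.random() < mu else 0 for _ in range(m)]
    return sum(samples) / m

for m in (10, 1000, 100000):
    print(m, abs(empirical_average(m) - mu))
```

The printed deviations shrink (up to random fluctuation) as m increases, which is exactly the regime that the concentration inequalities below make quantitative.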
B.1 Markov’s Inequality
We start with an inequality which is called Markov's inequality. Let Z be a nonnegative random variable. The expectation of Z can be written as follows:

E[Z] = ∫_{x=0}^{∞} P[Z ≥ x] dx.  (B.1)

Since P[Z ≥ x] is monotonically nonincreasing we obtain

∀a ≥ 0,  E[Z] ≥ ∫_{x=0}^{a} P[Z ≥ x] dx ≥ ∫_{x=0}^{a} P[Z ≥ a] dx = a P[Z ≥ a].  (B.2)

Rearranging the inequality yields Markov's inequality:

∀a ≥ 0,  P[Z ≥ a] ≤ E[Z] / a.  (B.3)
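A hypothetical numerical check of (B.3), not from the text: for a nonnegative random variable (here an exponential with rate 1, so E[Z] = 1, chosen only for illustration), the empirical tail probability P[Z ≥ a] should never exceed E[Z]/a.

```python
import random

random.seed(1)

# Numerical check of Markov's inequality (B.3) for Z ~ Exponential(1),
# a nonnegative random variable with E[Z] = 1.
m = 200000
samples = [random.expovariate(1.0) for _ in range(m)]
mean = sum(samples) / m  # empirical estimate of E[Z]

for a in (1.0, 2.0, 4.0):
    tail = sum(1 for z in samples if z >= a) / m  # estimate of P[Z >= a]
    bound = mean / a                              # Markov bound E[Z] / a
    print(a, tail, bound)
```

For this distribution the true tail is e^{-a}, so the Markov bound E[Z]/a is loose but always valid; Markov's inequality only uses nonnegativity, not the shape of the distribution.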
For random variables that take values in [0,1], we can derive from Markov's inequality the following.

Lemma B.1  Let Z be a random variable that takes values in [0,1]. Assume that E[Z] = μ. Then, for any a ∈ (0,1),

P[Z > 1 − a] ≥ (μ − (1 − a)) / a.

This also implies that for every a ∈ (0,1),

P[Z > a] ≥ (μ − a) / (1 − a) ≥ μ − a.
Proof  Let Y = 1 − Z. Then Y is a nonnegative random variable with E[Y] = 1 − E[Z] = 1 − μ. Applying Markov's inequality on Y we obtain

P[Z ≤ 1 − a] = P[1 − Z ≥ a] = P[Y ≥ a] ≤ E[Y] / a = (1 − μ) / a.

Therefore,

P[Z > 1 − a] ≥ 1 − (1 − μ)/a = (a − 1 + μ)/a = (μ − (1 − a)) / a.
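As a sketch (not from the text), Lemma B.1 can be checked numerically for a [0,1]-valued random variable; here Z is taken uniform on [0,1], so μ = 1/2, an assumption made only for illustration.

```python
import random

random.seed(2)

# Numerical check of Lemma B.1 for Z ~ Uniform[0,1], so mu = 0.5:
#   P[Z > 1-a] >= (mu - (1-a)) / a   and   P[Z > a] >= mu - a.
m = 200000
samples = [random.random() for _ in range(m)]
mu = sum(samples) / m  # empirical estimate of E[Z]

for a in (0.25, 0.5, 0.75):
    lhs = sum(1 for z in samples if z > 1 - a) / m  # estimate of P[Z > 1-a]
    rhs = (mu - (1 - a)) / a                        # lemma's lower bound
    print(a, lhs, rhs)
```

For the uniform distribution P[Z > 1 − a] = a exactly, so the bound (μ − (1 − a))/a = (a − 1/2)/a is visibly weaker, as expected from a Markov-type argument.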
Understanding Machine Learning, © 2014 by Shai Shalev-Shwartz and Shai Ben-David.
Published 2014 by Cambridge University Press.
Personal use only. Not for distribution. Do not post.
Please link to http://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning