Understanding Machine Learning: From Theory to Algorithms

Appendix B Measure Concentration


Let $Z_1,\ldots,Z_m$ be an i.i.d. sequence of random variables and let $\mu$ be their mean. The strong law of large numbers states that when $m$ tends to infinity, the empirical average, $\frac{1}{m}\sum_{i=1}^{m} Z_i$, converges to the expected value $\mu$, with probability 1. Measure concentration inequalities quantify the deviation of the empirical average from the expectation when $m$ is finite.
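As a quick illustration (a minimal simulation sketch of my own, not from the text; the Bernoulli distribution and the mean $\mu = 0.3$ are arbitrary choices), the following snippet shows the empirical average approaching $\mu$ as $m$ grows:

```python
# A minimal simulation sketch (not from the text): i.i.d. Bernoulli draws,
# with mu = 0.3 chosen arbitrarily for illustration.
import numpy as np

rng = np.random.default_rng(0)
mu = 0.3  # true mean of each Z_i

for m in [10, 100, 1_000, 10_000, 100_000]:
    Z = rng.binomial(1, mu, size=m)   # Z_1, ..., Z_m
    avg = Z.mean()                    # (1/m) * sum_i Z_i
    print(f"m={m:>6}  |avg - mu| = {abs(avg - mu):.4f}")
```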


B.1 Markov’s Inequality


We start with an inequality called Markov's inequality. Let $Z$ be a nonnegative random variable. The expectation of $Z$ can be written as follows:

$$\mathbb{E}[Z] = \int_{x=0}^{\infty} \mathbb{P}[Z \geq x] \, dx. \tag{B.1}$$
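The identity (B.1) is easy to check numerically; here is a sketch of mine (the Exponential(1) distribution is an arbitrary choice, picked because its tail $\mathbb{P}[Z \geq x] = e^{-x}$ and mean $\mathbb{E}[Z] = 1$ are known in closed form):

```python
# A numerical sanity check of (B.1), my own sketch: for Z ~ Exponential(1),
# P[Z >= x] = exp(-x) and E[Z] = 1, so the tail integral should be ~1.
import numpy as np

xs = np.linspace(0.0, 50.0, 200_001)          # truncation of the infinite range
tail = np.exp(-xs)                            # P[Z >= x]
integral = np.sum(tail[:-1] * np.diff(xs))    # left Riemann sum
print(integral)                               # ~1.0 = E[Z]
```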

Since $\mathbb{P}[Z \geq x]$ is monotonically nonincreasing we obtain

$$\forall a \geq 0, \quad \mathbb{E}[Z] \geq \int_{x=0}^{a} \mathbb{P}[Z \geq x] \, dx \geq \int_{x=0}^{a} \mathbb{P}[Z \geq a] \, dx = a \, \mathbb{P}[Z \geq a]. \tag{B.2}$$

Rearranging the inequality yields Markov's inequality:

$$\forall a > 0, \quad \mathbb{P}[Z \geq a] \leq \frac{\mathbb{E}[Z]}{a}. \tag{B.3}$$
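An empirical check of (B.3), again a sketch of my own (the Exponential distribution with mean 2 is an arbitrary nonnegative choice), comparing both sides of the bound on simulated data:

```python
# An empirical check of Markov's inequality (B.3), my own sketch; the
# Exponential distribution with mean 2 is an arbitrary nonnegative choice.
import numpy as np

rng = np.random.default_rng(1)
Z = rng.exponential(scale=2.0, size=1_000_000)   # nonnegative, E[Z] = 2

for a in [1.0, 2.0, 4.0, 8.0]:
    lhs = np.mean(Z >= a)    # empirical P[Z >= a]
    rhs = Z.mean() / a       # Markov bound E[Z]/a
    print(f"a={a}:  P[Z>=a] ~ {lhs:.4f}  <=  E[Z]/a ~ {rhs:.4f}")
```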

For random variables that take values in $[0,1]$, we can derive from Markov's inequality the following.

Lemma B.1 Let $Z$ be a random variable that takes values in $[0,1]$. Assume that $\mathbb{E}[Z] = \mu$. Then, for any $a \in (0,1)$,

$$\mathbb{P}[Z > 1-a] \geq \frac{\mu - (1-a)}{a}.$$

This also implies that for every $a \in (0,1)$,

$$\mathbb{P}[Z > a] \geq \frac{\mu - a}{1-a} \geq \mu - a.$$

Proof Let $Y = 1-Z$. Then $Y$ is a nonnegative random variable with $\mathbb{E}[Y] = 1 - \mathbb{E}[Z] = 1-\mu$. Applying Markov's inequality to $Y$ we obtain

$$\mathbb{P}[Z \leq 1-a] = \mathbb{P}[1-Z \geq a] = \mathbb{P}[Y \geq a] \leq \frac{\mathbb{E}[Y]}{a} = \frac{1-\mu}{a}.$$

Therefore,

$$\mathbb{P}[Z > 1-a] \geq 1 - \frac{1-\mu}{a} = \frac{\mu - (1-a)}{a},$$

which proves the first claim. The second claim follows by applying the first claim with $1-a$ in place of $a$.
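A numerical illustration of Lemma B.1, again a sketch of mine (the Beta(5,2) distribution is an arbitrary $[0,1]$-valued choice), checking that $\mathbb{P}[Z > 1-a]$ dominates the lemma's lower bound:

```python
# A numerical illustration of Lemma B.1, my own sketch; Beta(5,2) is an
# arbitrary [0,1]-valued distribution with mean mu = 5/7.
import numpy as np

rng = np.random.default_rng(2)
Z = rng.beta(5.0, 2.0, size=1_000_000)   # values in [0,1]
mu = Z.mean()

for a in [0.2, 0.4, 0.6, 0.8]:
    lhs = np.mean(Z > 1.0 - a)           # P[Z > 1-a]
    bound = (mu - (1.0 - a)) / a         # the lemma's lower bound
    print(f"a={a}:  P[Z > 1-a] ~ {lhs:.4f}  >=  {bound:.4f}")
```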
