Appendix A Technical Lemmas
lemmaA.1 Leta > 0. Then:x≥ 2 alog(a) ⇒ x≥alog(x). It follows that a
necessary condition for the inequalityx < alog(x)to hold is thatx < 2 alog(a).
Proof First note that fora∈(0,
√
e] the inequalityx≥alog(x) holds uncon-
ditionally and therefore the claim is trivial. From now on, assume thata >
√
e.
Consider the functionf(x) =x−alog(x). The derivative isf′(x) = 1−a/x.
Thus, forx > athe derivative is positive and the function increases. In addition,
f(2alog(a)) = 2alog(a)−alog(2alog(a))
= 2alog(a)−alog(a)−alog(2 log(a))
=alog(a)−alog(2 log(a)).
Sincea−2 log(a)>0 for alla >0, the proof follows.
lemmaA.2 Leta≥ 1 andb > 0. Then:x≥ 4 alog(2a)+2b⇒ x≥alog(x)+b.
Proof It suffices to prove thatx≥ 4 alog(2a) + 2bimplies that bothx ≥
2 alog(x) andx≥ 2 b. Since we assumea ≥1 we clearly have thatx≥ 2 b.
In addition, sinceb >0 we have thatx≥ 4 alog(2a) which using Lemma A.1
implies thatx≥ 2 alog(x). This concludes our proof.
lemmaA.3 LetXbe a random variable andx′∈Rbe a scalar and assume
that there existsa > 0 such that for allt≥ 0 we haveP[|X−x′|> t]≤ 2 e−t
(^2) /a 2
.
Then,E[|X−x′|] ≤ 4 a.
Proof For alli= 0, 1 , 2 ,...denoteti=ai. Sincetiis monotonically increasing
we have thatE[|X−x′|] is at most
∑∞
i=1tiP[|X−x
′|> ti− 1 ]. Combining this
with the assumption in the lemma we get thatE[|X−x′|] ≤ 2 a
∑∞
i=1ie
−(i−1)^2.
The proof now follows from the inequalities
∑∞
i=1
ie−(i−1)
2
≤
∑^5
i=1
ie−(i−1)
2
+
∫∞
5
xe−(x−1)
2
dx < 1 .8 + 10−^7 < 2.
lemmaA.4 LetXbe a random variable andx′∈Rbe a scalar and assume
that there existsa > 0 andb≥esuch that for allt≥ 0 we haveP[|X−x′|>
t]≤ 2 be−t
(^2) /a 2
. Then,E[|X−x′|] ≤ a(2 +
√
log(b)).
Understanding Machine Learning,©c2014 by Shai Shalev-Shwartz and Shai Ben-David
Published 2014 by Cambridge University Press.
Personal use only. Not for distribution. Do not post.
Please link tohttp://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning