76 The VC-Dimension
To ensure that the preceding is at mostwe need thatm≥2 dlog(m)
(δ)^2+
2 dlog(2e/d)
(δ)^2.
Standard algebraic manipulations (see Lemma A.2 in Appendix A) show that a
sufficient condition for the preceding to hold is thatm≥ 42 d
(δ)^2log(
2 d
(δ)^2)
+
4 dlog(2e/d)
(δ)^2.
Remark 6.4 The upper bound onmUCH we derived in the proof Theorem 6.7
is not the tightest possible. A tighter analysis that yields the bounds given in
Theorem 6.8 can be found in Chapter 28.Proof of Theorem 6.11 *
We will start by showing thatE
S∼Dm[
sup
h∈H|LD(h)−LS(h)|]
≤
4 +
√
log(τH(2m))
√
2 m. (6.4)
Since the random variable suph∈H|LD(h)−LS(h)|is nonnegative, the proof of
the theorem follows directly from the preceding using Markov’s inequality (see
Section B.1).
To bound the left-hand side of Equation (6.4) we first note that for every
h∈ H, we can rewriteLD(h) =ES′∼Dm[LS′(h)], whereS′=z 1 ′,...,zm′ is an
additional i.i.d. sample. Therefore,S∼DEm[
sup
h∈H|LD(h)−LS(h)|]
=S∼DEm[
sup
h∈H∣∣
∣S′∼DEmLS′(h)−LS(h)∣∣
∣
]
.
A generalization of the triangle inequality yields
∣∣
∣S′∼DEm[LS′(h)−LS(h)]∣∣
∣≤S′∼DEm|LS′(h)−LS(h)|,and the fact that supermum of expectation is smaller than expectation of supre-
mum yieldssup
h∈H
S′∼DEm|LS′(h)−LS(h)|≤S′∼DEmsup
h∈H|LS′(h)−LS(h)|.Formally, the previous two inequalities follow from Jensen’s inequality. Combin-
ing all we obtainS∼DEm[
sup
h∈H|LD(h)−LS(h)|]
≤S,S′E∼Dm[
sup
h∈H|LS′(h)−LS(h)|]
=S,S′E∼Dm[
sup
h∈H1
m∣∣
∣∣
∣
∑mi=1(`(h,zi′)−`(h,zi))