Understanding Machine Learning: From Theory to Algorithms

30.1 Compression Bounds 411

theorem30.2 Letkbe an integer and letB:Zk→ Hbe a mapping from
sequences ofkexamples to the hypothesis class. Letm≥ 2 kbe a training set
size and letA:Zm→ Hbe a learning rule that receives a training sequenceS
of sizemand returns a hypothesis such thatA(S) =B(zi 1 ,...,zik)for some
(i 1 ,...,ik)∈[m]k. LetV={zj:j /∈(i 1 ,...,ik)}be the set of examples which
were not selected for definingA(S). Then, with probability of at least 1 −δover
the choice ofSwe have

LD(A(S)) ≤ LV(A(S)) +

√

LV(A(S))

4 klog(m/δ) m

+

8 klog(m/δ) m

.

Proof For anyI∈[m]klethI=B(zi 1 ,...,zik). Letn=m−k. Combining
Lemma 30.1 with the union bound we have

P

[

∃I∈[m]ks.t. LD(hI)−LV(hI)≥

√

2 LV(hI) log(1/δ) n +

4 log(1/δ) n

]

≤

∑

I∈[m]k

P

[

LD(hI)−LV(hI)≥

√

2 LV(hI) log(1/δ) n

+4 log(1/δ) n

]

≤mkδ.

Denoteδ′ =mkδ. Using the assumptionk ≤m/2, which implies thatn=
m−k≥m/2, the above implies that with probability of at least 1−δ′we have
that

LD(A(S)) ≤LV(A(S)) +

√

LV(A(S))^4 klog(m/δ

′)

m

+^8 klog(m/δ

′)

m

,

which concludes our proof.

As a direct corollary we obtain:

corollary30.3 Assuming the conditions of Theorem 30.2, and further as-
suming thatLV(A(S)) = 0, then, with probability of at least 1 −δover the choice
ofSwe have

LD(A(S)) ≤

8 klog(m/δ) m

.

These results motivate the following definition:

definition30.4 (Compression Scheme)LetHbe a hypothesis class of
functions fromXtoYand letkbe an integer. We say thatHhas a compression
scheme of sizekif the following holds:
For allmthere existsA:Zm→[m]kandB:Zk→Hsuch that for allh∈H,
if we feed any training set of the form (x 1 ,h(x 1 )),...,(xm,h(xm)) intoAand
then feed (xi 1 ,h(xi 1 )),...,(xik,h(xik)) intoB, where (i 1 ,...,ik) is the output
ofA, then the output ofB, denotedh′, satisfiesLS(h′) = 0.

It is possible to generalize the definition for unrealizable sequences as follows.

Understanding Machine Learning: From Theory to Algorithms

LD(A(S)) ≤ LV(A(S)) +

√

LV(A(S))

+

.

[

√

]

≤

∑

P

[

√

]

√

′)

′)

,

.

Get our desktop app

Company

Features

Documentation

Resources