Understanding Machine Learning: From Theory to Algorithms

(Jeff_L) #1

74 The VC-Dimension


definition6.9 (Growth Function) LetHbe a hypothesis class. Then the
growth functionofH, denotedτH:N→N, is defined as

τH(m) = max
C⊂X:|C|=m

∣∣

HC

∣∣

.

In words,τH(m) is the number of different functions from a setCof sizemto
{ 0 , 1 }that can be obtained by restrictingHtoC.

Obviously, if VCdim(H) =dthen for anym≤dwe haveτH(m) = 2m. In
such cases,Hinduces all possible functions fromCto{ 0 , 1 }. The following beau-
tiful lemma, proposed independently by Sauer, Shelah, and Perles, shows that
whenmbecomes larger than the VC-dimension, the growth function increases
polynomially rather than exponentially withm.

lemma6.10 (Sauer-Shelah-Perles) LetHbe a hypothesis class withVCdim(H)≤
d <∞. Then, for allm,τH(m)≤

∑d
i=0

(m
i

)

. In particular, ifm > d+ 1then
τH(m)≤(em/d)d.


Proof of Sauer’s Lemma *


To prove the lemma it suffices to prove the following stronger claim: For any
C={c 1 ,...,cm}we have

∀H, |HC| ≤ |{B⊆C:HshattersB}|. (6.3)

The reason why Equation (6.3) is sufficient to prove the lemma is that if VCdim(H)≤
dthen no set whose size is larger thandis shattered byHand therefore

|{B⊆C:HshattersB}| ≤

∑d

i=0

(

m
i

)

.

Whenm > d+ 1 the right-hand side of the preceding is at most (em/d)d(see
Lemma A.5 in Appendix A).
We are left with proving Equation (6.3) and we do it using an inductive argu-
ment. Form= 1, no matter whatHis, either both sides of Equation (6.3) equal
1 or both sides equal 2 (the empty set is always considered to be shattered by
H). Assume Equation (6.3) holds for sets of sizek < mand let us prove it for
sets of sizem. FixHandC={c 1 ,...,cm}. DenoteC′={c 2 ,...,cm}and in
addition, define the following two sets:

Y 0 ={(y 2 ,...,ym) : (0,y 2 ,...,ym)∈HC∨(1,y 2 ,...,ym)∈HC},

and

Y 1 ={(y 2 ,...,ym) : (0,y 2 ,...,ym)∈HC∧(1,y 2 ,...,ym)∈HC}.

It is easy to verify that|HC|=|Y 0 |+|Y 1 |. Additionally, sinceY 0 =HC′, using
the induction assumption (applied onHandC′) we have that

|Y 0 |=|HC′|≤|{B⊆C′:HshattersB}|=|{B⊆C:c 1 6∈B∧HshattersB}|.
Free download pdf