Understanding Machine Learning: From Theory to Algorithms


The VC-Dimension


Corollary 6.4 tells us that if H shatters some set C of size 2m then we cannot
learn H using m examples. Intuitively, if a set C is shattered by H, and we
receive a sample containing half the instances of C, the labels of these instances
give us no information about the labels of the rest of the instances in C – every
possible labeling of the rest of the instances can be explained by some hypothesis
in H. Philosophically,

If someone can explain every phenomenon, his explanations are worthless.

This leads us directly to the definition of the VC dimension.

Definition 6.5 (VC-dimension) The VC-dimension of a hypothesis class H,
denoted VCdim(H), is the maximal size of a set C ⊂ X that can be shattered
by H. If H can shatter sets of arbitrarily large size we say that H has infinite
VC-dimension.
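When a collection of hypotheses is finite (or is restricted to a finite set C, where only its finitely many distinct restrictions matter), shattering can be checked by enumeration. A minimal sketch in Python; the names `shatters` and `hypotheses` are illustrative, not the book's:

```python
def shatters(hypotheses, C):
    """Return True iff every labeling of C in {0, 1}^|C| is realized
    by some hypothesis in the (finite) collection `hypotheses`."""
    realized = {tuple(h(x) for x in C) for h in hypotheses}
    return len(realized) == 2 ** len(C)

# Two constant hypotheses shatter any single point but no pair:
H = [lambda x: 0, lambda x: 1]
print(shatters(H, (0,)))     # True: both labels of the point are realized
print(shatters(H, (0, 1)))   # False: only (0, 0) and (1, 1) are realized
```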

A direct consequence of Corollary 6.4 is therefore:

Theorem 6.6 Let H be a class of infinite VC-dimension. Then, H is not PAC
learnable.

Proof Since H has an infinite VC-dimension, for any training set size m, there
exists a shattered set of size 2m, and the claim follows by Corollary 6.4.

We shall see later in this chapter that the converse is also true: a finite
VC-dimension guarantees learnability. Hence, the VC-dimension characterizes PAC
learnability. But before delving into more theory, we first show several examples.

6.3 Examples


In this section we calculate the VC-dimension of several hypothesis classes. To
show that VCdim(H) = d we need to show that


  1. There exists a set C of size d that is shattered by H.

  2. Every set C of size d + 1 is not shattered by H.
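When both the hypothesis collection and the ground set are finite, this two-step recipe can be run by brute force. A sketch, assuming hypotheses are functions from the ground set to {0, 1}; the names `restrictions` and `vcdim` are mine:

```python
from itertools import combinations

def restrictions(hypotheses, C):
    """All distinct labelings of C realized by some hypothesis."""
    return {tuple(h(x) for x in C) for h in hypotheses}

def vcdim(hypotheses, X):
    """Largest k such that some size-k subset of the finite ground set X
    is shattered by the finite collection `hypotheses` (0 if none is)."""
    d = 0
    for k in range(1, len(X) + 1):
        if any(len(restrictions(hypotheses, C)) == 2 ** k
               for C in combinations(X, k)):
            d = k  # step 1: a shattered set of size k exists
        else:
            # step 2: no size-k set is shattered; since any subset of a
            # shattered set is itself shattered, no larger set can be either.
            break
    return d
```

The early `break` is justified by the observation that every subset of a shattered set is shattered, so once no set of size k works, no larger set can.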


6.3.1 Threshold Functions


Let H be the class of threshold functions over R. Recall Example 6.2, where
we have shown that for an arbitrary set C = {c1}, H shatters C; therefore
VCdim(H) ≥ 1. We have also shown that for an arbitrary set C = {c1, c2} where
c1 ≤ c2, H does not shatter C. We therefore conclude that VCdim(H) = 1.
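This argument can be replayed numerically. A small sketch using the convention of Example 6.2, h_a(x) = 1 iff x < a; the helper names are mine:

```python
def h(a):
    # Threshold hypothesis of Example 6.2: h_a(x) = 1 iff x < a.
    return lambda x: 1 if x < a else 0

def shatters(C, thresholds):
    realized = {tuple(h(a)(x) for x in C) for a in thresholds}
    return len(realized) == 2 ** len(C)

# One point: thresholds on either side of it realize both labels.
print(shatters((0.0,), [-1.0, 1.0]))            # True

# Two points c1 <= c2: a threshold below, between, and above them exhausts
# every possible restriction of H to C, and the labeling (0, 1) never
# occurs: h_a(c2) = 1 forces c2 < a, hence c1 < a and h_a(c1) = 1.
print(shatters((0.0, 1.0), [-1.0, 0.5, 2.0]))   # False
```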