Understanding Machine Learning: From Theory to Algorithms

(Jeff_L) #1
6.3 Examples 71

6.3.2 Intervals


LetHbe the class of intervals overR, namely,H={ha,b:a,b∈R,a < b},

whereha,b:R→ { 0 , 1 }is a function such thatha,b(x) = (^1) [x∈(a,b)]. Take the set
C={ 1 , 2 }. Then,HshattersC(make sure you understand why) and therefore
VCdim(H)≥ 2. Now take an arbitrary setC={c 1 ,c 2 ,c 3 }and assume without
loss of generality thatc 1 ≤c 2 ≤c 3. Then, the labeling (1, 0 ,1) cannot be obtained
by an interval and thereforeHdoes not shatterC. We therefore conclude that
VCdim(H) = 2.


6.3.3 Axis Aligned Rectangles


LetHbe the class of axis aligned rectangles, formally:

H={h(a 1 ,a 2 ,b 1 ,b 2 ):a 1 ≤a 2 andb 1 ≤b 2 }

where

h(a 1 ,a 2 ,b 1 ,b 2 )(x 1 ,x 2 ) =

{

1 ifa 1 ≤x 1 ≤a 2 andb 1 ≤x 2 ≤b 2
0 otherwise

(6.2)

We shall show in the following that VCdim(H) = 4. To prove this we need
to find a set of 4 points that are shattered byH, and show that no set of 5
points can be shattered byH. Finding a set of 4 points that are shattered is
easy (see Figure 6.1). Now, consider any setC⊂R^2 of 5 points. InC, take a
leftmost point (whose first coordinate is the smallest inC), a rightmost point
(first coordinate is the largest), a lowest point (second coordinate is the smallest),
and a highest point (second coordinate is the largest). Without loss of generality,
denoteC ={c 1 ,...,c 5 }and letc 5 be the point that was not selected. Now,
define the labeling (1, 1 , 1 , 1 ,0). It is impossible to obtain this labeling by an
axis aligned rectangle. Indeed, such a rectangle must containc 1 ,...,c 4 ; but in
this case the rectangle containsc 5 as well, because its coordinates are within
the intervals defined by the selected points. So,Cis not shattered byH, and
therefore VCdim(H) = 4.

c 1

c 2

c 3

c 4 c 5

Figure 6.1Left: 4 points that are shattered by axis aligned rectangles. Right: Any axis
aligned rectangle cannot labelc 5 by 0 and the rest of the points by 1.
Free download pdf