Understanding Machine Learning: From Theory to Algorithms

(Jeff_L) #1

122 Linear Predictors


Remark 9.1 The Perceptron is simple to implement and is guaranteed to con-
verge. However, the convergence rate depends on the parameterB, which in
some situations might be exponentially large ind. In such cases, it would be
better to implement the ERM problem by solving a linear program, as described
in the previous section. Nevertheless, for many natural data sets, the size ofB
is not too large, and the Perceptron converges quite fast.

9.1.3 The VC Dimension of Halfspaces


To compute the VC dimension of halfspaces, we start with the homogenous case.

theorem9.2 The VC dimension of the class of homogenous halfspaces inRd
isd.

Proof First, consider the set of vectorse 1 ,...,ed, where for everyithe vector
eiis the all zeros vector except 1 in thei’th coordinate. This set is shattered
by the class of homogenous halfspaces. Indeed, for every labelingy 1 ,...,yd, set
w= (y 1 ,...,yd), and then〈w,ei〉=yifor alli.
Next, letx 1 ,...,xd+1be a set ofd+ 1 vectors inRd. Then, there must exist
real numbersa 1 ,...,ad+1, not all of them are zero, such that

∑d+1
i=1aixi=^0.
LetI={i:ai> 0 }andJ={j:aj< 0 }. EitherIorJis nonempty. Let us
first assume that both of them are nonempty. Then,

i∈I

aixi=


j∈J

|aj|xj.

Now, suppose thatx 1 ,...,xd+1are shattered by the class of homogenous classes.
Then, there must exist a vectorwsuch that〈w,xi〉>0 for alli∈I while
〈w,xj〉<0 for everyj∈J. It follows that

0 <


i∈I

ai〈xi,w〉=



i∈I

aixi,w


=



j∈J

|aj|xj,w


=


j∈J

|aj|〈xj,w〉< 0 ,

which leads to a contradiction. Finally, ifJ(respectively,I) is empty then the
right-most (respectively, left-most) inequality should be replaced by an equality,
which still leads to a contradiction.

theorem9.3 The VC dimension of the class of nonhomogenous halfspaces in
Rdisd+ 1.

Proof First, as in the proof of Theorem 9.2, it is easy to verify that the set
of vectors 0 ,e 1 ,...,edis shattered by the class of nonhomogenous halfspaces.
Second, suppose that the vectorsx 1 ,...,xd+2are shattered by the class of non-
homogenous halfspaces. But, using the reduction we have shown in the beginning
of this chapter, it follows that there ared+ 2 vectors inRd+1that are shattered
by the class of homogenous halfspaces. But this contradicts Theorem 9.2.
Free download pdf