Understanding Machine Learning: From Theory to Algorithms

(Jeff_L) #1

412 Compression Bounds


definition30.5 (Compression Scheme for Unrealizable Sequences)
LetHbe a hypothesis class of functions fromXtoYand letkbe an integer.
We say thatHhas a compression scheme of sizekif the following holds:
For allmthere existsA:Zm→[m]kandB:Zk→Hsuch that for allh∈H,
if we feed any training set of the form (x 1 ,y 1 ),...,(xm,ym) intoAand then
feed (xi 1 ,yi 1 ),...,(xik,yik) intoB, where (i 1 ,...,ik) is the output ofA, then
the output ofB, denotedh′, satisfiesLS(h′)≤LS(h).

The following lemma shows that the existence of a compression scheme for
the realizable case also implies the existence of a compression scheme for the
unrealizable case.

lemma30.6 LetHbe a hypothesis class for binary classification, and assume
it has a compression scheme of sizekin the realizable case. Then, it has a
compression scheme of sizekfor the unrealizable case as well.

Proof Consider the following scheme: First, find an ERM hypothesis and denote
it byh. Then, discard all the examples on whichherrs. Now, apply the realizable
compression scheme on the examples that have not been removed. The output of
the realizable compression scheme, denotedh′, must be correct on the examples
that have not been removed. Sinceherrs on the removed examples it follows
that the error ofh′cannot be larger than the error ofh; henceh′is also an ERM
hypothesis.

30.2 Examples


In the examples that follows, we present compression schemes for several hy-
pothesis classes for binary classification. In light of Lemma 30.6 we focus on the
realizable case. Therefore, to show that a certain hypothesis class has a com-
pression scheme, it is necessary to show that there existA,B,andkfor which
LS(h′) = 0.

30.2.1 Axis Aligned Rectangles


Note that this is an uncountable infinite class. We show that there is a simple
compression scheme. Consider the algorithmAthat works as follows: For each
dimension, choose the two positive examples with extremal values at this dimen-
sion. DefineBto be the function that returns the minimal enclosing rectangle.
Then, fork= 2d, we have that in the realizable case,LS(B(A(S))) = 0.

30.2.2 Halfspaces


LetX=Rdand consider the class of homogenous halfspaces,{x7→sign(〈w,x〉) :
w∈Rd}.
Free download pdf