Understanding Machine Learning: From Theory to Algorithms

(Jeff_L) #1
2.4 Exercises 41

and letmbe an integer that satisfies


m≥log(|H|/δ)


Then, for any labeling function,f, and for any distribution,D, for which the
realizability assumption holds (that is, for someh∈ H,L(D,f)(h) = 0), with
probability of at least 1 −δover the choice of an i.i.d. sampleSof sizem, we
have that for every ERM hypothesis,hS, it holds that


L(D,f)(hS)≤.

The preceeding corollary tells us that for a sufficiently largem, the ERMHrule
over a finite hypothesis class will beprobably(with confidence 1−δ)approximately
(up to an error of) correct. In the next chapter we formally define the model
of Probably Approximately Correct (PAC) learning.


2.4 Exercises


1.Overfitting of polynomial matching:We have shown that the predictor
defined in Equation (2.3) leads to overfitting. While this predictor seems to
be very unnatural, the goal of this exercise is to show that it can be described
as a thresholded polynomial. That is, show that given a training setS =
{(xi,f(xi))}mi=1⊆(Rd× { 0 , 1 })m, there exists a polynomialpSsuch that
hS(x) = 1 if and only ifpS(x)≥0, wherehSis as defined in Equation (2.3).
It follows that learning the class of all thresholded polynomials using the ERM
rule may lead to overfitting.



  1. LetHbe a class of binary classifiers over a domainX. LetDbe an unknown
    distribution overX, and letfbe the target hypothesis inH. Fix someh∈H.
    Show that the expected value ofLS(h) over the choice ofS|xequalsL(D,f)(h),
    namely,
    E
    S|x∼Dm


[LS(h)] =L(D,f)(h).

3.Axis aligned rectangles: An axis aligned rectangle classifier in the plane
is a classifier that assigns the value 1 to a point if and only if it is inside a
certain rectangle. Formally, given real numbersa 1 ≤b 1 ,a 2 ≤b 2 , define the
classifierh(a 1 ,b 1 ,a 2 ,b 2 )by


h(a 1 ,b 1 ,a 2 ,b 2 )(x 1 ,x 2 ) =

{

1 ifa 1 ≤x 1 ≤b 1 anda 2 ≤x 2 ≤b 2
0 otherwise

. (2.10)

The class of all axis aligned rectangles in the plane is defined as
H^2 rec={h(a 1 ,b 1 ,a 2 ,b 2 ):a 1 ≤b 1 ,anda 2 ≤b 2 }.

Note that this is an infinite size hypothesis class. Throughout this exercise we
rely on the realizability assumption.
Free download pdf