Understanding Machine Learning: From Theory to Algorithms


Nonuniform Learnability


Prove a bound on $L_{\mathcal{D}}(h_S) - L_{\mathcal{D}}(h_B^*)$ in terms of $B$, the confidence parameter $\delta$, and the size of the training set $m$.

  • Note: Such bounds are known as oracle inequalities in the literature: We wish to estimate how good we are compared to a reference classifier (or "oracle") $h_B^*$.
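The setup that defines $B$ and $h_B^*$ is not reproduced in this excerpt, so the following is only a hedged sketch of the shape such a bound typically takes. Assuming a nested decomposition $\mathcal{H}_1 \subseteq \mathcal{H}_2 \subseteq \dots$ with each $\mathcal{H}_n$ finite, $h_B^*$ a minimizer of $L_{\mathcal{D}}$ over $\mathcal{H}_B$, and $h_S$ the output of SRM with weights $w_n = 2^{-n}$, Hoeffding's inequality combined with a union bound over the classes gives, with probability at least $1 - \delta$,

$$ L_{\mathcal{D}}(h_S) - L_{\mathcal{D}}(h_B^*) \;\le\; 2\sqrt{\frac{\ln|\mathcal{H}_B| + B\ln 2 + \ln(2/\delta)}{2m}}. $$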



  1. In this question we wish to show a No-Free-Lunch result for nonuniform learnability: namely, that, over any infinite domain, the class of all functions is not learnable even under the relaxed nonuniform variation of learning.
     Recall that an algorithm $A$ nonuniformly learns a hypothesis class $\mathcal{H}$ if there exists a function $m_{\mathcal{H}}^{\mathrm{NUL}} : (0,1)^2 \times \mathcal{H} \to \mathbb{N}$ such that, for every $\epsilon, \delta \in (0,1)$ and for every $h \in \mathcal{H}$, if $m \ge m_{\mathcal{H}}^{\mathrm{NUL}}(\epsilon, \delta, h)$ then for every distribution $\mathcal{D}$, with probability of at least $1 - \delta$ over the choice of $S \sim \mathcal{D}^m$, it holds that

     $$ L_{\mathcal{D}}(A(S)) \le L_{\mathcal{D}}(h) + \epsilon. $$

     If such an algorithm exists then we say that $\mathcal{H}$ is nonuniformly learnable.


     1. Let $A$ be a nonuniform learner for a class $\mathcal{H}$. For each $n \in \mathbb{N}$ define $\mathcal{H}_n^A = \{h \in \mathcal{H} : m^{\mathrm{NUL}}(0.1, 0.1, h) \le n\}$. Prove that each such class $\mathcal{H}_n^A$ has a finite VC-dimension.

     2. Prove that if a class $\mathcal{H}$ is nonuniformly learnable then there are classes $\mathcal{H}_n$ so that $\mathcal{H} = \bigcup_{n \in \mathbb{N}} \mathcal{H}_n$ and, for every $n \in \mathbb{N}$, $\mathrm{VCdim}(\mathcal{H}_n)$ is finite.


     3. Let $\mathcal{H}$ be a class that shatters an infinite set. Then, for every sequence of classes $(\mathcal{H}_n : n \in \mathbb{N})$ such that $\mathcal{H} = \bigcup_{n \in \mathbb{N}} \mathcal{H}_n$, there exists some $n$ for which $\mathrm{VCdim}(\mathcal{H}_n) = \infty$.
        Hint: Given a class $\mathcal{H}$ that shatters some infinite set $K$, and a sequence of classes $(\mathcal{H}_n : n \in \mathbb{N})$, each having a finite VC-dimension, start by defining subsets $K_n \subseteq K$ such that, for all $n$, $|K_n| > \mathrm{VCdim}(\mathcal{H}_n)$ and, for any $n \ne m$, $K_n \cap K_m = \emptyset$. Now, pick for each such $K_n$ a function $f_n : K_n \to \{0,1\}$ so that no $h \in \mathcal{H}_n$ agrees with $f_n$ on the domain $K_n$. Finally, define $f : \mathcal{X} \to \{0,1\}$ by combining these $f_n$'s (one such combination is sketched after this list) and prove that $f \in \mathcal{H} \setminus \bigcup_{n \in \mathbb{N}} \mathcal{H}_n$.


     4. Construct a class $\mathcal{H}_1$ of functions from the unit interval $[0,1]$ to $\{0,1\}$ that is nonuniformly learnable but not PAC learnable.

     5. Construct a class $\mathcal{H}_2$ of functions from the unit interval $[0,1]$ to $\{0,1\}$ that is not nonuniformly learnable.
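A hedged way to carry out the last step of the hint in part 3 (this particular combination is an assumption; any rule that restricts to $f_n$ on each $K_n$ works) is

$$ f(x) \;=\; \begin{cases} f_n(x) & \text{if } x \in K_n \text{ for some } n \in \mathbb{N}, \\ 0 & \text{otherwise}. \end{cases} $$

This is well defined because the sets $K_n$ are pairwise disjoint, and by construction no $h \in \mathcal{H}_n$ agrees with $f$ on $K_n$, for any $n$.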

  2. In this question we wish to show that the algorithm Memorize is a consistent learner for every class of (binary-valued) functions over any countable domain (a minimal sketch of Memorize is given after this exercise). Let $\mathcal{X}$ be a countable domain and let $\mathcal{D}$ be a probability distribution over $\mathcal{X}$.

     1. Let $\{x_i : i \in \mathbb{N}\}$ be an enumeration of the elements of $\mathcal{X}$ so that for all $i \le j$, $\mathcal{D}(\{x_i\}) \ge \mathcal{D}(\{x_j\})$. Prove that

        $$ \lim_{n \to \infty} \sum_{i \ge n} \mathcal{D}(\{x_i\}) = 0. $$


     2. Given any $\epsilon > 0$ prove that there exists $D_\epsilon > 0$ such that

        $$ \mathcal{D}(\{x \in \mathcal{X} : \mathcal{D}(\{x\}) < D_\epsilon\}) < \epsilon. $$
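Since the exercise concerns the Memorize algorithm, here is a minimal runnable sketch of that learner (the function name, data layout, and the choice of default label 0 for unseen points are illustrative assumptions; the text only names the algorithm "Memorize"):

```python
# Minimal sketch of the Memorize learner.
# Assumptions: the sample is a list of (x, y) pairs with y in {0, 1};
# points not seen in training receive the default label 0.

def memorize(sample):
    """Return a predictor that replays the training labels.

    A point that appeared in the training sample gets its (last seen)
    training label; any other point gets the default label 0.
    """
    table = {x: y for x, y in sample}
    return lambda x: table.get(x, 0)


if __name__ == "__main__":
    # Tiny usage example over a countable domain (the natural numbers).
    S = [(1, 1), (2, 0), (5, 1)]
    h = memorize(S)
    print(h(1), h(2), h(7))  # -> 1 0 0
```

Parts 1 and 2 of the exercise supply the probabilistic ingredient that makes this memorization strategy consistent over a countable domain: as the sample grows, the training set eventually covers every point whose probability mass is not negligible.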