Understanding Machine Learning: From Theory to Algorithms


Nonuniform Learnability


Prove a bound on $L_{\mathcal{D}}(h_S) - L_{\mathcal{D}}(h_B^*)$ in terms of $B$, the confidence parameter $\delta$, and the size of the training set $m$.

  • Note: Such bounds are known as oracle inequalities in the literature: We wish to estimate how good we are compared to a reference classifier (or "oracle") $h_B^*$.
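The setup that defines $B$ and $h_B^*$ is not reproduced in this excerpt, so the following is only a hedged sketch of the shape such a bound typically takes. Assuming a nested decomposition $\mathcal{H}_1 \subseteq \mathcal{H}_2 \subseteq \dots$ with each $\mathcal{H}_n$ finite, $h_B^*$ a minimizer of $L_{\mathcal{D}}$ over $\mathcal{H}_B$, and $h_S$ the output of SRM with weights $w_n = 2^{-n}$, Hoeffding's inequality combined with a union bound over the classes gives, with probability at least $1 - \delta$,

$$ L_{\mathcal{D}}(h_S) - L_{\mathcal{D}}(h_B^*) \;\le\; 2\sqrt{\frac{\ln|\mathcal{H}_B| + B\ln 2 + \ln(2/\delta)}{2m}}. $$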



  1. In this question we wish to show a No-Free-Lunch result for nonuniform learnability: namely, that, over any infinite domain, the class of all functions is not learnable even under the relaxed nonuniform variation of learning.
     Recall that an algorithm $A$ nonuniformly learns a hypothesis class $\mathcal{H}$ if there exists a function $m_{\mathcal{H}}^{\mathrm{NUL}} : (0,1)^2 \times \mathcal{H} \to \mathbb{N}$ such that, for every $\epsilon, \delta \in (0,1)$ and for every $h \in \mathcal{H}$, if $m \ge m_{\mathcal{H}}^{\mathrm{NUL}}(\epsilon, \delta, h)$ then for every distribution $\mathcal{D}$, with probability of at least $1 - \delta$ over the choice of $S \sim \mathcal{D}^m$, it holds that

     $$ L_{\mathcal{D}}(A(S)) \le L_{\mathcal{D}}(h) + \epsilon. $$

     If such an algorithm exists then we say that $\mathcal{H}$ is nonuniformly learnable.


     1. Let $A$ be a nonuniform learner for a class $\mathcal{H}$. For each $n \in \mathbb{N}$ define $\mathcal{H}_n^A = \{h \in \mathcal{H} : m^{\mathrm{NUL}}(0.1, 0.1, h) \le n\}$. Prove that each such class $\mathcal{H}_n^A$ has a finite VC-dimension.

     2. Prove that if a class $\mathcal{H}$ is nonuniformly learnable then there are classes $\mathcal{H}_n$ so that $\mathcal{H} = \bigcup_{n \in \mathbb{N}} \mathcal{H}_n$ and, for every $n \in \mathbb{N}$, $\mathrm{VCdim}(\mathcal{H}_n)$ is finite.


     3. Let $\mathcal{H}$ be a class that shatters an infinite set. Then, for every sequence of classes $(\mathcal{H}_n : n \in \mathbb{N})$ such that $\mathcal{H} = \bigcup_{n \in \mathbb{N}} \mathcal{H}_n$, there exists some $n$ for which $\mathrm{VCdim}(\mathcal{H}_n) = \infty$.
        Hint: Given a class $\mathcal{H}$ that shatters some infinite set $K$, and a sequence of classes $(\mathcal{H}_n : n \in \mathbb{N})$, each having a finite VC-dimension, start by defining subsets $K_n \subseteq K$ such that, for all $n$, $|K_n| > \mathrm{VCdim}(\mathcal{H}_n)$ and, for any $n \ne m$, $K_n \cap K_m = \emptyset$. Now, pick for each such $K_n$ a function $f_n : K_n \to \{0,1\}$ so that no $h \in \mathcal{H}_n$ agrees with $f_n$ on the domain $K_n$. Finally, define $f : \mathcal{X} \to \{0,1\}$ by combining these $f_n$'s (one such combination is sketched after this list) and prove that $f \in \mathcal{H} \setminus \bigcup_{n \in \mathbb{N}} \mathcal{H}_n$.


     4. Construct a class $\mathcal{H}_1$ of functions from the unit interval $[0,1]$ to $\{0,1\}$ that is nonuniformly learnable but not PAC learnable.

     5. Construct a class $\mathcal{H}_2$ of functions from the unit interval $[0,1]$ to $\{0,1\}$ that is not nonuniformly learnable.
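A hedged way to carry out the last step of the hint in part 3 (this particular combination is an assumption; any rule that restricts to $f_n$ on each $K_n$ works) is

$$ f(x) \;=\; \begin{cases} f_n(x) & \text{if } x \in K_n \text{ for some } n \in \mathbb{N}, \\ 0 & \text{otherwise}. \end{cases} $$

This is well defined because the sets $K_n$ are pairwise disjoint, and by construction no $h \in \mathcal{H}_n$ agrees with $f$ on $K_n$, for any $n$.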

  2. In this question we wish to show that the algorithm Memorize is a consistent learner for every class of (binary-valued) functions over any countable domain (a minimal sketch of Memorize is given after this exercise). Let $\mathcal{X}$ be a countable domain and let $\mathcal{D}$ be a probability distribution over $\mathcal{X}$.

     1. Let $\{x_i : i \in \mathbb{N}\}$ be an enumeration of the elements of $\mathcal{X}$ so that for all $i \le j$, $\mathcal{D}(\{x_i\}) \ge \mathcal{D}(\{x_j\})$. Prove that

        $$ \lim_{n \to \infty} \sum_{i \ge n} \mathcal{D}(\{x_i\}) = 0. $$


     2. Given any $\epsilon > 0$ prove that there exists $D_\epsilon > 0$ such that

        $$ \mathcal{D}(\{x \in \mathcal{X} : \mathcal{D}(\{x\}) < D_\epsilon\}) < \epsilon. $$
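Since the exercise concerns the Memorize algorithm, here is a minimal runnable sketch of that learner (the function name, data layout, and the choice of default label 0 for unseen points are illustrative assumptions; the text only names the algorithm "Memorize"):

```python
# Minimal sketch of the Memorize learner.
# Assumptions: the sample is a list of (x, y) pairs with y in {0, 1};
# points not seen in training receive the default label 0.

def memorize(sample):
    """Return a predictor that replays the training labels.

    A point that appeared in the training sample gets its (last seen)
    training label; any other point gets the default label 0.
    """
    table = {x: y for x, y in sample}
    return lambda x: table.get(x, 0)


if __name__ == "__main__":
    # Tiny usage example over a countable domain (the natural numbers).
    S = [(1, 1), (2, 0), (5, 1)]
    h = memorize(S)
    print(h(1), h(2), h(7))  # -> 1 0 0
```

Parts 1 and 2 of the exercise supply the probabilistic ingredient that makes this memorization strategy consistent over a countable domain: as the sample grows, the training set eventually covers every point whose probability mass is not negligible.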