is consistent (with respect to the hypothesis class of all functions from $\mathbb{R}^d$ to $\{0,1\}$). A good presentation of the analysis is given in the book of Devroye et al. (1996). Here, we give a finite sample guarantee that explicitly underscores the prior assumption on the distribution. See Section 7.4 for a discussion on consistency results. Finally, Gottlieb, Kontorovich & Krauthgamer (2010) derived another finite sample bound for NN that is more similar to VC bounds.
19.6 Exercises
In this exercise we will prove the following theorem for the $k$-NN rule.
theorem 19.5 Let $\mathcal{X} = [0,1]^d$, $\mathcal{Y} = \{0,1\}$, and $\mathcal{D}$ be a distribution over $\mathcal{X} \times \mathcal{Y}$ for which the conditional probability function, $\eta$, is a $c$-Lipschitz function. Let $h_S$ denote the result of applying the $k$-NN rule to a sample $S \sim \mathcal{D}^m$, where $k \ge 10$. Let $h^\star$ be the Bayes optimal hypothesis. Then,
$$\mathbb{E}_{S}\left[L_{\mathcal{D}}(h_S)\right] \;\le\; \left(1 + \sqrt{\frac{8}{k}}\right) L_{\mathcal{D}}(h^\star) + \left(6c\sqrt{d} + k\right) m^{-1/(d+1)}.$$
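To get a feel for how the bound behaves, here is a short Python sketch that simply evaluates the right-hand side of the theorem; the parameter values are made up for illustration. Note how the $m^{-1/(d+1)}$ term makes the guarantee deteriorate rapidly as the dimension grows (the curse of dimensionality discussed earlier in the chapter).

```python
import math

def knn_error_bound(bayes_error, c, d, k, m):
    """Right-hand side of Theorem 19.5: (1 + sqrt(8/k)) * L_D(h*)
    plus (6c*sqrt(d) + k) * m^(-1/(d+1))."""
    return ((1 + math.sqrt(8 / k)) * bayes_error
            + (6 * c * math.sqrt(d) + k) * m ** (-1 / (d + 1)))

# Illustrative (made-up) values: with k = 10 and c = 1, the second term
# blows up as d grows, even with a million samples.
for d in (1, 2, 10, 20):
    print(d, knn_error_bound(bayes_error=0.05, c=1.0, d=d, k=10, m=10**6))
```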
- Prove the following lemma.
lemma 19.6 Let $C_1, \ldots, C_r$ be a collection of subsets of some domain set, $\mathcal{X}$. Let $S$ be a sequence of $m$ points sampled i.i.d. according to some probability distribution $\mathcal{D}$ over $\mathcal{X}$. Then, for every $k \ge 2$,
$$\mathbb{E}_{S \sim \mathcal{D}^m}\left[\sum_{i \,:\, |C_i \cap S| < k} P[C_i]\right] \;\le\; \frac{2rk}{m}.$$
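Before proving the lemma, it can be instructive to check it empirically. The Python sketch below estimates the left-hand side by Monte Carlo and compares it to $2rk/m$; the particular setup (a uniform distribution on $[0,1]$ and $r$ equal cells as the $C_i$, so $P[C_i] = 1/r$) is an illustrative choice, not part of the lemma.

```python
import random

def lhs_estimate(r, m, k, trials, seed=0):
    """Monte Carlo estimate of E_S[ sum over {i : |C_i ∩ S| < k} of P[C_i] ]
    when C_1, ..., C_r partition [0,1] into equal-length cells and D is
    uniform, so P[C_i] = 1/r for every i."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        counts = [0] * r
        for _ in range(m):
            counts[min(int(rng.random() * r), r - 1)] += 1
        # add up P[C_i] = 1/r over the cells that received fewer than k points
        total += sum(1 / r for c in counts if c < k)
    return total / trials

r, m, k = 20, 200, 3
lhs = lhs_estimate(r, m, k, trials=300)
rhs = 2 * r * k / m
print(f"estimated LHS = {lhs:.4f}  <=  2rk/m = {rhs:.4f}")
```

In this regime each cell receives about $m/r = 10$ points on average, so cells with fewer than $k = 3$ points are rare and the estimate comes out far below the bound, as the lemma predicts.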
Hints:
- Show that
$$\mathbb{E}_{S}\left[\sum_{i \,:\, |C_i \cap S| < k} P[C_i]\right] = \sum_{i=1}^{r} P[C_i]\, \mathbb{P}_{S}\left[|C_i \cap S| < k\right].$$
- Fix some $i$ and suppose that $k < P[C_i]\,m/2$. Use Chernoff's bound to show that
$$\mathbb{P}_{S}\left[|C_i \cap S| < k\right] \le \mathbb{P}_{S}\left[|C_i \cap S| < P[C_i]\,m/2\right] \le e^{-P[C_i]\,m/8}.$$
- Use the inequality $\max_{a} a\,e^{-ma} \le \frac{1}{me}$ to show that for such $i$ we have
$$P[C_i]\,\mathbb{P}_{S}\left[|C_i \cap S| < k\right] \le P[C_i]\, e^{-P[C_i]\,m/8} \le \frac{8}{me}.$$
- Conclude the proof by using the fact that for the case $k \ge P[C_i]\,m/2$ we clearly have
$$P[C_i]\,\mathbb{P}_{S}\left[|C_i \cap S| < k\right] \le P[C_i] \le \frac{2k}{m}.$$
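The inequalities in the hints can also be sanity-checked numerically. The Python sketch below, with arbitrary illustrative parameters, compares the exact binomial tail $\mathbb{P}\left[\mathrm{Bin}(m, P[C_i]) < P[C_i]m/2\right]$ with the Chernoff bound $e^{-P[C_i]m/8}$, scans $a\,e^{-ma}$ against $\frac{1}{me}$, and confirms that for $k \ge 2$ the first-case bound $\frac{8}{me}$ is itself at most $\frac{2k}{m}$ (since $k \ge 4/e$), so every term in the sum is bounded by $\frac{2k}{m}$ and summing over the $r$ sets gives the lemma's $\frac{2rk}{m}$.

```python
import math

m, p = 400, 0.1  # illustrative sample size m and cell probability p = P[C_i]

# Second hint: exact tail P[Bin(m, p) < pm/2], summed term by term,
# versus the Chernoff bound e^{-pm/8}.
exact_tail = sum(math.comb(m, j) * p**j * (1 - p)**(m - j)
                 for j in range(math.ceil(p * m / 2)))
chernoff = math.exp(-p * m / 8)
assert exact_tail <= chernoff

# Third hint: max_a a*e^{-ma} <= 1/(me); the maximum is attained at a = 1/m.
grid_max = max(a * math.exp(-m * a) for a in (i / 10**5 for i in range(1, 10**5)))
assert grid_max <= 1 / (m * math.e) + 1e-12

# Conclusion: for every k >= 2, the first-case bound 8/(me) is at most 2k/m,
# so both cases yield a per-term bound of 2k/m.
for k in range(2, 50):
    assert 8 / (m * math.e) <= 2 * k / m

print("all inequalities verified")
```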