Proof From the linearity of expectation, we can rewrite:
$$\mathop{\mathbb{E}}_{S}\Bigg[\sum_{i:\,C_i\cap S=\emptyset}\mathbb{P}[C_i]\Bigg] \;=\; \sum_{i=1}^{r}\mathbb{P}[C_i]\,\mathop{\mathbb{E}}_{S}\Big[\mathbb{1}_{[C_i\cap S=\emptyset]}\Big].$$
Next, for each $i$ we have
$$\mathop{\mathbb{E}}_{S}\Big[\mathbb{1}_{[C_i\cap S=\emptyset]}\Big] \;=\; \mathop{\mathbb{P}}_{S}[C_i\cap S=\emptyset] \;=\; (1-\mathbb{P}[C_i])^{m} \;\le\; e^{-\mathbb{P}[C_i]\,m}.$$
Combining the preceding two equations we get
$$\mathop{\mathbb{E}}_{S}\Bigg[\sum_{i:\,C_i\cap S=\emptyset}\mathbb{P}[C_i]\Bigg] \;\le\; \sum_{i=1}^{r}\mathbb{P}[C_i]\,e^{-\mathbb{P}[C_i]\,m} \;\le\; r\,\max_{i}\,\mathbb{P}[C_i]\,e^{-\mathbb{P}[C_i]\,m}.$$
Finally, by standard calculus, $\max_{a} a\,e^{-ma} \le \frac{1}{me}$, and this concludes the proof.
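The calculus fact can be verified directly by maximizing $f(a) = a\,e^{-ma}$ over $a \ge 0$:
$$f'(a) = e^{-ma}(1 - ma) = 0 \;\Longrightarrow\; a = \tfrac{1}{m}, \qquad f\!\left(\tfrac{1}{m}\right) = \tfrac{1}{m}\,e^{-1} = \tfrac{1}{me},$$
and since $f'(a) > 0$ for $a < 1/m$ and $f'(a) < 0$ for $a > 1/m$, the point $a = 1/m$ is indeed the maximizer, so $\max_{a\ge 0} a\,e^{-ma} = \frac{1}{me}$.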
Equipped with the preceding lemmas we are now ready to state and prove the main result of this section – an upper bound on the expected error of the 1-NN learning rule.

Theorem 19.3 Let $\mathcal{X} = [0,1]^d$, $\mathcal{Y} = \{0,1\}$, and let $\mathcal{D}$ be a distribution over $\mathcal{X}\times\mathcal{Y}$ for which the conditional probability function, $\eta$, is a $c$-Lipschitz function. Let $h_S$ denote the result of applying the 1-NN rule to a sample $S \sim \mathcal{D}^m$. Then,
$$\mathop{\mathbb{E}}_{S\sim \mathcal{D}^m}[L_{\mathcal{D}}(h_S)] \;\le\; 2\,L_{\mathcal{D}}(h^\star) \;+\; 4\,c\,\sqrt{d}\,m^{-\frac{1}{d+1}}.$$
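Before turning to the proof, it may help to recall concretely what the 1-NN rule computes. The following is a minimal sketch (the function names and the use of NumPy are illustrative choices, not part of the text): given a sample $S = ((x_1,y_1),\dots,(x_m,y_m))$, the predictor $h_S$ labels a query point with the label of its nearest sample point in Euclidean distance.

```python
import numpy as np

def one_nn_predict(X_train, y_train, x):
    """1-NN rule h_S: return the label of the sample point closest to x
    in Euclidean distance (ties broken by the first minimizer)."""
    dists = np.linalg.norm(X_train - x, axis=1)   # ||x - x_i|| for every i
    return y_train[np.argmin(dists)]              # label of pi_1(x)

# Example on X = [0,1]^d with d = 2:
rng = np.random.default_rng(0)
X_train = rng.random((100, 2))                    # m = 100 points in [0,1]^2
y_train = (X_train[:, 0] > 0.5).astype(int)       # some labeling of the sample
print(one_nn_predict(X_train, y_train, np.array([0.9, 0.3])))
```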
Proof Fix some $\epsilon = 1/T$, for some integer $T$, let $r = T^d$, and let $C_1,\dots,C_r$ be the cover of the set $\mathcal{X}$ using boxes of length $\epsilon$: namely, for every $(\alpha_1,\dots,\alpha_d) \in [T]^d$, there exists a set $C_i$ of the form $\{x : \forall j,\; x_j \in [(\alpha_j - 1)/T,\, \alpha_j/T]\}$. An illustration for $d = 2$, $T = 5$, and the set corresponding to $\alpha = (2,4)$ is given in the following.

[Figure: the unit square $[0,1]^2$ partitioned into a $5\times 5$ grid of boxes of side $1/5$, with the box corresponding to $\alpha = (2,4)$ highlighted.]

For each $x, x'$ in the same box we have $\|x - x'\| \le \sqrt{d}\,\epsilon$. Otherwise, $\|x - x'\| \le \sqrt{d}$.
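To spell out these two bounds: if $x$ and $x'$ lie in the same box, then each coordinate differs by at most $\epsilon = 1/T$, while for arbitrary points of $[0,1]^d$ each coordinate differs by at most $1$; hence
$$\|x - x'\| = \Bigg(\sum_{j=1}^{d} (x_j - x'_j)^2\Bigg)^{1/2} \;\le\;
\begin{cases}
\sqrt{d\,\epsilon^2} = \sqrt{d}\,\epsilon, & \text{if } x, x' \text{ lie in the same box},\\[2pt]
\sqrt{d\cdot 1} = \sqrt{d}, & \text{in any case.}
\end{cases}$$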
Therefore,
$$\mathop{\mathbb{E}}_{x,S}\big[\|x - x_{\pi_1(x)}\|\big] \;\le\; \mathop{\mathbb{E}}_{S}\Bigg[\mathbb{P}\Big[\bigcup_{i:\,C_i\cap S=\emptyset} C_i\Big]\,\sqrt{d} \;+\; \mathbb{P}\Big[\bigcup_{i:\,C_i\cap S\neq\emptyset} C_i\Big]\,\epsilon\sqrt{d}\Bigg],$$
and by combining Lemma 19.2 with the trivial bound $\mathbb{P}\big[\bigcup_{i:\,C_i\cap S\neq\emptyset} C_i\big] \le 1$ we get that
$$\mathop{\mathbb{E}}_{x,S}\big[\|x - x_{\pi_1(x)}\|\big] \;\le\; \sqrt{d}\left(\frac{r}{me} + \epsilon\right).$$
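As a sanity check on Lemma 19.2 and on the box cover used here, the following sketch estimates $\mathbb{E}_S\big[\sum_{i:\,C_i\cap S=\emptyset}\mathbb{P}[C_i]\big]$ by Monte Carlo and compares it with the bound $r/(me)$; the choice of the uniform distribution on $[0,1]^2$, the parameter values, and all names are illustrative assumptions, not part of the text.

```python
import numpy as np

def empty_box_mass(T, m, trials=2000, seed=1):
    """Monte Carlo estimate of E_S[ sum_{i: C_i ∩ S = ∅} P[C_i] ] for the
    uniform distribution on [0,1]^2 and the cover by T^2 boxes of side 1/T."""
    rng = np.random.default_rng(seed)
    r = T * T
    p_box = 1.0 / r                      # each box has mass 1/r under the uniform distribution
    total = 0.0
    for _ in range(trials):
        S = rng.random((m, 2))           # sample S ~ D^m, D uniform on [0,1]^2
        idx = np.minimum((S * T).astype(int), T - 1)   # box coordinates of each point
        hit = np.unique(idx[:, 0] * T + idx[:, 1])     # boxes intersecting S
        total += (r - len(hit)) * p_box  # probability mass of the boxes missed by S
    return total / trials

T, m = 10, 200
print(f"estimate = {empty_box_mass(T, m):.4f}  <=  r/(m*e) = {T*T/(m*np.e):.4f}")
```

With these parameters the estimated mass of the missed boxes is roughly $(1 - 1/r)^m \approx e^{-2} \approx 0.135$, comfortably below the bound $r/(me) \approx 0.184$, as the lemma predicts.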