27 Covering Numbers
In this chapter we describe another way to measure the complexity of sets, which
is called covering numbers.
27.1 Covering
definition27.1 (Covering) LetA⊂Rmbe a set of vectors. We say thatA
isr-covered by a setA′, with respect to the Euclidean metric, if for alla∈A
there existsa′∈A′with‖a−a′‖ ≤r. We define byN(r,A) the cardinality of
the smallestA′thatr-coversA.
Example 27.1(Subspace) Suppose thatA⊂Rm, letc= maxa∈A‖a‖, and as-
sume thatAlies in ad-dimensional subspace ofRm. Then,N(r,A)≤(2c
√
d/r)d.
To see this, letv 1 ,...,vdbe an orthonormal basis of the subspace. Then, any
a∈Acan be written asa=
∑d
i=1αiviwith‖α‖∞≤ ‖α‖^2 =‖a‖^2 ≤c. Let
∈Rand consider the set
A′=
{d
∑
i=1
α′ivi:∀i,α′i∈{−c,−c+,−c+ 2,...,c}
}
Givena∈As.t.a=
∑d
i=1αiviwith‖α‖∞≤c, there existsa′∈A′such that
‖a−a′‖^2 =‖
∑
i
(α′i−αi)vi‖^2 ≤^2
∑
i
‖vi‖^2 ≤^2 d.
Choose=r/
√
d; then‖a−a′‖≤rand thereforeA′is anr-cover ofA. Hence,
N(r,A)≤|A′|=
(
2 c
)d
=
(
2 c
√
d
r
)d
27.1.1 Properties
The following lemma is immediate from the definition.
lemma27.2 For anyA⊂Rm, scalarc > 0 , and vectora 0 ∈Rm, we have
∀r > 0 , N(r,{ca+a 0 :a∈A})≤N(cr,A).
Understanding Machine Learning,©c2014 by Shai Shalev-Shwartz and Shai Ben-David
Published 2014 by Cambridge University Press.
Personal use only. Not for distribution. Do not post.
Please link tohttp://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning