Understanding Machine Learning: From Theory to Algorithms

(Jeff_L) #1

330 Dimensionality Reduction


Then, with probability of at least 1 −δover a choice of a random matrixW∈Rn,d
such that each element ofWis distributed normally with zero mean and variance
of 1 /nwe have

sup
x∈Q

∣∣

∣∣‖Wx‖

2
‖x‖^2

− 1

∣∣

∣∣< .

Proof Combining Lemma 23.3 and the union bound we have that for every
∈(0,3):

P

[

sup
x∈Q

∣∣

∣∣‖Wx‖

2
‖x‖^2 −^1

∣∣

∣∣> 

]

≤ 2 |Q|e−

(^2) n/ 6
Letδdenote the right-hand side of the inequality; thus we obtain that
=



6 log(2|Q|/δ)
n

Interestingly, the bound given in Lemma 23.4 does not depend on the original
dimension ofx. In fact, the bound holds even ifxis in an infinite dimensional
Hilbert space.

23.3 Compressed Sensing


Compressed sensing is a dimensionality reduction technique which utilizes a prior
assumption that the original vector is sparse in some basis. To motivate com-
pressed sensing, consider a vectorx∈Rdthat has at mostsnonzero elements.
That is,
‖x‖ 0 def=|{i:xi 6 = 0}|≤s.
Clearly, we can compressxby representing it usings(index,value) pairs. Fur-
thermore, this compression is lossless – we can reconstructxexactly from thes
(index,value) pairs. Now, lets take one step forward and assume thatx=Uα,
whereαis a sparse vector,‖α‖ 0 ≤s, andUis a fixed orthonormal matrix. That
is,xhas a sparse representation in another basis. It turns out that many nat-
ural vectors are (at least approximately) sparse in some representation. In fact,
this assumption underlies many modern compression schemes. For example, the
JPEG-2000 format for image compression relies on the fact that natural images
are approximately sparse in a wavelet basis.
Can we still compressxinto roughlysnumbers? Well, one simple way to do
this is to multiplyxbyU>, which yields the sparse vectorα, and then represent
αby itss(index,value) pairs. However, this requires us first to “sense”x, to
store it, and then to multiply it byU>. This raises a very natural question: Why
go to so much effort to acquire all the data when most of what we get will be
thrown away? Cannot we just directly measure the part that will not end up
being thrown away?
Free download pdf