Understanding Machine Learning: From Theory to Algorithms

C.4 Singular Value Decomposition (SVD)

Finally, we show that if A has rank r then it has r orthonormal singular vectors.


Lemma C.5 Let A ∈ ℝ^{m,n} with rank r. Define the following vectors:

\[
v_1 = \operatorname*{argmax}_{v \in \mathbb{R}^n : \|v\| = 1} \|Av\|
\]
\[
v_2 = \operatorname*{argmax}_{\substack{v \in \mathbb{R}^n : \|v\| = 1 \\ \langle v, v_1 \rangle = 0}} \|Av\|
\]
\[
\vdots
\]
\[
v_r = \operatorname*{argmax}_{\substack{v \in \mathbb{R}^n : \|v\| = 1 \\ \forall i < r,\ \langle v, v_i \rangle = 0}} \|Av\|
\]

Then, v_1, ..., v_r is an orthonormal set of right singular vectors of A.
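As a numerical illustration (not part of the original text), the greedy maximization in the lemma can be carried out by noting that each constrained maximizer of ‖Av‖ is the top eigenvector of A^⊤A projected onto the orthogonal complement of the vectors found so far. The sketch below, assuming NumPy is available, compares the vectors it produces with the right singular vectors returned by `numpy.linalg.svd`:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 5, 3
# Random rank-r matrix A in R^{m x n}.
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Greedy construction: v_{t+1} maximizes ||Av|| over unit vectors
# orthogonal to v_1, ..., v_t.  Each step takes the top eigenvector
# of A^T A projected onto the orthogonal complement of span(V).
V = []
G = A.T @ A
for _ in range(r):
    # Projector onto the orthogonal complement of span(V).
    P = np.eye(n)
    for v in V:
        P -= np.outer(v, v)
    # P G P is symmetric PSD and vanishes on span(V), so its top
    # eigenvector lies in the complement and maximizes v^T G v there.
    w, U = np.linalg.eigh(P @ G @ P)
    V.append(U[:, -1])

# Compare with the right singular vectors from the SVD (up to sign).
_, s, Vt = np.linalg.svd(A)
for t, v in enumerate(V):
    assert np.isclose(abs(v @ Vt[t]), 1.0)
    # ||A v_t|| equals the t-th singular value.
    assert np.isclose(np.linalg.norm(A @ v), s[t])
```

The deflation by the projector P is one standard way to realize the orthogonality constraints; it mirrors the proof's observation that the constrained problem reduces to an eigenproblem on the remaining coordinates.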


Proof First note that since the rank of A is r, the range of A is a subspace of dimension r, and therefore it is easy to verify that for all i = 1, ..., r, ‖Av_i‖ > 0. Let W ∈ ℝ^{n,n} be an orthonormal matrix obtained from the eigenvalue decomposition of A^⊤A, namely, A^⊤A = WDW^⊤, with D being a diagonal matrix with D_{1,1} ≥ D_{2,2} ≥ ⋯ ≥ 0. We will show that v_1, ..., v_r are eigenvectors of A^⊤A that correspond to nonzero eigenvalues; hence, using Lemma C.4, it follows that these are also right singular vectors of A. The proof is by induction. For the basis of the induction, note that any unit vector v can be written as v = Wx for x = W^⊤v, and note that ‖x‖ = 1. Therefore,


\[
\|Av\|^2 = \|AWx\|^2 = x^\top W^\top A^\top A W x = x^\top W^\top W D W^\top W x = x^\top D x = \sum_{i=1}^{n} D_{i,i} x_i^2 .
\]

Therefore,


\[
\max_{v : \|v\| = 1} \|Av\|^2 = \max_{x : \|x\| = 1} \sum_{i=1}^{n} D_{i,i} x_i^2 .
\]

The right-hand side is maximized by setting x = (1, 0, ..., 0), which implies that v_1 is the first column of W, i.e., the first eigenvector of A^⊤A. Since ‖Av_1‖ > 0 it follows that D_{1,1} > 0, as required. For the induction step, assume that the claim holds for some 1 ≤ t ≤ r − 1. Then, any v which is orthogonal to v_1, ..., v_t can be written as v = Wx with all of the first t elements of x being zero. It follows that


\[
\max_{v : \|v\| = 1,\ \forall i \le t,\ v^\top v_i = 0} \|Av\|^2 = \max_{x : \|x\| = 1} \sum_{i=t+1}^{n} D_{i,i} x_i^2 .
\]

The right-hand side is maximized by the vector x that is all zeros except x_{t+1} = 1. This implies that v_{t+1} is the (t+1)th column of W. Finally, since ‖Av_{t+1}‖ > 0 it follows that D_{t+1,t+1} > 0, as required. This concludes our proof.
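As a quick numerical sanity check of the identities used in the proof (an illustration assuming NumPy, not part of the text), one can verify that the eigenvalues of A^⊤A are the squared singular values of A, and that for any unit vector v = Wx the identity ‖Av‖² = Σᵢ D_{i,i} x_i² holds:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))

# Eigendecomposition A^T A = W D W^T, eigenvalues sorted descending.
evals, W = np.linalg.eigh(A.T @ A)
order = np.argsort(evals)[::-1]
evals, W = evals[order], W[:, order]

# The eigenvalues of A^T A are the squared singular values of A.
s = np.linalg.svd(A, compute_uv=False)
assert np.allclose(evals, s**2)

# For any unit v = Wx:  ||A v||^2 = sum_i D_{i,i} x_i^2.
v = rng.standard_normal(3)
v /= np.linalg.norm(v)
x = W.T @ v
assert np.isclose(np.linalg.norm(A @ v) ** 2, np.sum(evals * x**2))
```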
