Factor Analysis and Principal Components Analysis 257
we can represent the variance of the ith data time series $X_i$ as a weighted
sum of the eigenvalues (each eigenvalue is equal to the variance of the
corresponding principal component) as follows:
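$$
\begin{aligned}
\operatorname{var}(X_1) &= \lambda_1 V_{11}^2 + \cdots + \lambda_8 V_{18}^2 \\
&\;\;\vdots \\
\operatorname{var}(X_i) &= \lambda_1 V_{i1}^2 + \cdots + \lambda_8 V_{i8}^2 \\
&\;\;\vdots \\
\operatorname{var}(X_8) &= \lambda_1 V_{81}^2 + \cdots + \lambda_8 V_{88}^2
\end{aligned}
\tag{12.17}
$$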
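The decomposition in (12.17) can be checked numerically. The following is a minimal sketch using NumPy on simulated data; the data, the random seed, and the variable names are illustrative assumptions and are not the series used in the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 observations of 8 correlated series
# (simulated for illustration; not the data of the illustration in the text).
T, N = 500, 8
X = rng.standard_normal((T, N)) @ rng.standard_normal((N, N))
X = X - X.mean(axis=0)

# Sample covariance matrix and its eigendecomposition.
# eigh returns the eigenvalues in ascending order and the
# eigenvectors as the columns of V.
cov = np.cov(X, rowvar=False)
lam, V = np.linalg.eigh(cov)

# Right-hand side of (12.17): sum over j of lambda_j * V[i, j]**2.
var_from_eigen = (V ** 2) @ lam

# Left-hand side: the sample variance of each series.
var_direct = X.var(axis=0, ddof=1)

print(np.allclose(var_from_eigen, var_direct))
```

The check works because the covariance matrix equals V diag(λ) V', so its ith diagonal entry is exactly the weighted sum of eigenvalues in (12.17).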
Step 5: Using Only Principal Components with Largest Variances. From equation
(12.16), we see that in our illustration the largest eigenvalue is more than
two orders of magnitude (a factor of more than 100) greater than the smallest,
and that the magnitude of the eigenvalues decays rapidly after the first three
eigenvalues. Therefore, we can represent the data approximately using only a
reduced number of principal components that have the largest variances.
Equivalently, this means using only those principal components that
correspond to the largest eigenvalues.
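One common way to decide how many components to keep is to look at the cumulative share of total variance carried by the largest eigenvalues. The sketch below assumes a 95% threshold and uses made-up eigenvalues; the helper name `n_components_for_share`, the threshold, and the eigenvalues are all hypothetical and are not the values in equation (12.16).

```python
import numpy as np

def n_components_for_share(eigenvalues, share=0.95):
    """Return the smallest number of components whose eigenvalues
    account for at least `share` of the total variance, together
    with the cumulative variance shares (largest eigenvalue first)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(lam) / lam.sum()
    # Index of the first cumulative share that reaches the threshold.
    k = int(np.searchsorted(cumulative, share)) + 1
    return min(k, lam.size), cumulative

# Hypothetical eigenvalues spanning more than two orders of magnitude
# (illustrative only; not taken from the text).
lam = [0.02, 0.05, 0.1, 0.2, 0.4, 1.5, 3.0, 6.0]
k, cumulative = n_components_for_share(lam, share=0.95)
print(k, np.round(cumulative, 3))
```

With these assumed eigenvalues, the first three components fall just short of the threshold and a fourth is needed. The threshold is a judgment call, not a rule implied by the text.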
Suppose we use only four principal components. We can write the following
approximate representation:
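$$
\begin{aligned}
X_1 &\approx PC_5 V_{15} + \cdots + PC_8 V_{18} \\
&\;\;\vdots \\
X_i &\approx PC_5 V_{i5} + \cdots + PC_8 V_{i8} \\
&\;\;\vdots \\
X_8 &\approx PC_5 V_{85} + \cdots + PC_8 V_{88}
\end{aligned}
\tag{12.18}
$$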
or
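$$
\begin{aligned}
X_1 &= PC_5 V_{15} + \cdots + PC_8 V_{18} + e_1 \\
&\;\;\vdots \\
X_i &= PC_5 V_{i5} + \cdots + PC_8 V_{i8} + e_i \\
&\;\;\vdots \\
X_8 &= PC_5 V_{85} + \cdots + PC_8 V_{88} + e_8
\end{aligned}
\tag{12.19}
$$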
where $e_i$ represents the approximation error for the ith series. The error
terms are linear combinations of the first four principal components.
Therefore, they are orthogonal to the last four principal components but, in
general, they will be mutually correlated. To see this point, consider, for
example,
$$
X_1 = PC_5 V_{15} + \cdots + PC_8 V_{18} + e_1
\quad\text{and}\quad
X_8 = PC_5 V_{85} + \cdots + PC_8 V_{88} + e_8
$$
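The properties of the error terms can also be checked numerically. The sketch below uses simulated data (an assumption; not the series of the illustration) and keeps the last four components returned by `numpy.linalg.eigh`, which are the ones with the largest variances because that routine orders eigenvalues from smallest to largest.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 1,000 observations of 8 correlated series.
T, N, k = 1000, 8, 4
X = rng.standard_normal((T, N)) @ rng.standard_normal((N, N))
X = X - X.mean(axis=0)

# Eigenvalues come back in ascending order; columns of PC are PC_1..PC_8.
lam, V = np.linalg.eigh(np.cov(X, rowvar=False))
PC = X @ V

# Approximation using only the k components with the largest variances.
X_approx = PC[:, -k:] @ V[:, -k:].T
E = X - X_approx  # columns are the error terms e_1, ..., e_8

# The errors are built from the discarded components only.
E_from_discarded = PC[:, :-k] @ V[:, :-k].T

# Sample covariances between each error term and each retained component.
cross = E.T @ PC[:, -k:] / (T - 1)

# Covariance matrix of the error terms themselves.
err_cov = np.cov(E, rowvar=False)

print(np.allclose(E, E_from_discarded))         # errors use discarded PCs
print(np.allclose(cross, 0.0, atol=1e-8))       # orthogonal to retained PCs
print(np.round(err_cov[0, -1], 4))              # cov(e_1, e_8), generally nonzero
```

In this sketch the cross-covariances vanish up to rounding, while the off-diagonal entries of `err_cov` do not, because every error term loads on the same four discarded components.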