252 The Basics of Financial Econometrics
and therefore are observable, while factors are generally nonobservable
variables. There are other differences that will be made clear after describing
PCA.
Step-by-Step PCA
The key idea of PCA is to find linear combinations of the data, called prin-
cipal components, that are mutually orthogonal and have maximum vari-
ance. The use of the term “orthogonal” requires explanation. Earlier in this
chapter we defined orthogonal factors as uncorrelated factors. But in defin-
ing PCA we do not, as mentioned, assume a statistical model for the data;
principal components are vectors of data without any associated probability
distribution. We say that two different principal components are orthogonal
if their scalar product as vectors is equal to zero.
Given two vectors $x = [x_1, \ldots, x_N]'$ and $y = [y_1, \ldots, y_N]'$ of the same
length N and such that the average of their components is zero, their scalar
product is defined as:

$$(x, y) = [x_1, \ldots, x_N] \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix} = \sum_{i=1}^{N} x_i y_i \tag{12.12}$$
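As a minimal sketch of this definition, the scalar product and the orthogonality check can be written in a few lines of Python. The vectors below are hypothetical illustrative values, not the chapter's data, and the helper names `demean` and `scalar_product` are introduced here only for illustration.

```python
def demean(v):
    """Subtract the average so the components of v sum to zero."""
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def scalar_product(x, y):
    """Scalar product of two vectors of the same length: sum of x_i * y_i."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


# Hypothetical zero-mean vectors of length N = 4 (illustrative values only).
x = demean([1.0, -1.0, 1.0, -1.0])
y = demean([1.0, 1.0, -1.0, -1.0])
z = demean([2.0, -2.0, 2.0, -2.0])

xy = scalar_product(x, y)  # zero: x and y are orthogonal
xz = scalar_product(x, z)  # nonzero: x and z are not orthogonal

print("(x, y) =", xy)
print("(x, z) =", xz)
```

Because the vectors have zero average, dividing the scalar product by N gives their sample covariance, so orthogonal zero-mean vectors are also uncorrelated in the sample.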
The reason we want to find principal components that are orthogonal
and have maximum variance is that, as will become clearer later in this sec-
tion, we want to represent data as linear combinations of a small number of
principal components. This objective is better reached if principal compo-
nents are orthogonal and have maximum variance.
Perhaps the simplest way to describe this process is to illustrate it
through a step-by-step example using the same data that were used to illus-
trate factor analysis. Let’s therefore consider the standardized data X and its
covariance matrix $\Sigma_X$ in equation (12.9).
Step 1: Compute Eigenvalues and Eigenvectors of the Covariance Matrix of Data
In Appendix D we review the basics of matrix algebra. There we explain that
the eigenvalues and eigenvectors of the matrix $\Sigma_X$ are those vectors $V_i$ and
those numbers $\lambda_i$ that satisfy the condition $\Sigma_X V_i = \lambda_i V_i$. In general, an N × N
covariance matrix has N distinct, real-valued eigenvalues and N corresponding
eigenvectors. We can therefore form a matrix V whose columns are the
eigenvectors and a diagonal matrix D with the eigenvalues on the main diagonal.
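The construction of V and D can be sketched with NumPy as follows. The 3 × 3 matrix `sigma_x` is a hypothetical stand-in, not the covariance matrix of equation (12.9), and the choice of `numpy.linalg.eigh` (a routine for symmetric matrices) is an assumption about tooling rather than the package used in the text.

```python
import numpy as np

# Hypothetical covariance matrix of standardized data (illustrative values,
# NOT the matrix from equation (12.9)).
sigma_x = np.array([
    [1.0, 0.5, 0.2],
    [0.5, 1.0, 0.3],
    [0.2, 0.3, 1.0],
])

# eigh is meant for symmetric matrices; it returns real eigenvalues in
# ascending order and the eigenvectors as the columns of the second output.
eigenvalues, V = np.linalg.eigh(sigma_x)

# Reorder so the largest eigenvalue comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
V = V[:, order]

# Diagonal matrix D with the eigenvalues on the main diagonal.
D = np.diag(eigenvalues)

# Each column V_i satisfies sigma_x V_i = lambda_i V_i, i.e. sigma_x V = V D.
satisfies_condition = np.allclose(sigma_x @ V, V @ D)
# The eigenvectors of a symmetric matrix are mutually orthogonal.
columns_orthonormal = np.allclose(V.T @ V, np.eye(3))

print("eigenvalues:", eigenvalues)
print("sigma_x V = V D holds:", satisfies_condition)
print("columns of V orthonormal:", columns_orthonormal)
```

Sorting by decreasing eigenvalue is a convenience assumed here so that the first column of V corresponds to the largest eigenvalue.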
Eigenvalues and eigenvectors are computed using a statistical or math-
ematical package. Let’s compute the eigenvalues and eigenvectors of the