matrix. PCA per se does not assume any probabilistic model for the
data. Factor models, in contrast, assume a statistical model for the data.
To appreciate the difference, suppose we split our data into two parts,
one to estimate the model and one to test it. If the data follow a factor
model, then the same model applies to both the estimation and the test data.
There is, however, no reason to assume that PCA works with similar
parameters for both sets of data.
- Principal components are observable time series. In our example, they
are portfolios, while factors might be nonobservable.
- Residuals of PCA will not, in general, be uncorrelated and therefore
equation (12.7) does not hold for PCA.
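The point about PCA residuals can be checked numerically. The following is a minimal sketch, assuming data simulated from a strict factor model with uncorrelated residuals; the dimensions, the seed, and the helper name `pca_residual_covariance` are illustrative assumptions and do not come from the text. It removes the first k principal components and inspects the covariance matrix of what is left.

```python
import numpy as np

def pca_residual_covariance(X, k):
    """Covariance of the data after removing the first k principal components."""
    Xc = X - X.mean(axis=0)                      # center each series
    S = np.cov(Xc, rowvar=False)                 # sample covariance (N x N)
    eigval, eigvec = np.linalg.eigh(S)           # eigenvalues in ascending order
    V = eigvec[:, ::-1][:, :k]                   # eigenvectors of the k largest eigenvalues
    resid = Xc - Xc @ V @ V.T                    # part not explained by the k components
    return np.cov(resid, rowvar=False)

# Hypothetical simulated data: T observations of N series driven by k factors,
# with residuals that are truly uncorrelated but have unequal variances.
rng = np.random.default_rng(0)
T, N, k = 500, 6, 2
B = rng.normal(size=(N, k))                      # factor loadings
f = rng.normal(size=(T, k))                      # factors
psi = rng.uniform(0.5, 2.0, size=N)              # residual variances
eps = rng.normal(size=(T, N)) * np.sqrt(psi)     # uncorrelated residuals
X = f @ B.T + eps

R = pca_residual_covariance(X, k)
off_diag = R - np.diag(np.diag(R))

print("largest |off-diagonal| entry:", np.abs(off_diag).max())
print("rank of PCA residual covariance:", np.linalg.matrix_rank(R, tol=1e-8))
```

In this sketch the PCA residual covariance has rank N - k because the residuals are confined to the space orthogonal to the retained eigenvectors. A matrix of that kind cannot in general be diagonal, even though the simulated residuals themselves are uncorrelated.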
We might ask if principal components are an estimate of the factors of
factor models. Recall that thus far we have considered data sets that are
finite in both the time dimension (i.e., time series are formed by a finite num-
ber of time points) and the number of time series. Under these assumptions,
it has been demonstrated that principal components are a consistent estimate
of factors only in the case of scalar models (i.e., only if all residuals
have the same variance).^2 If the variances of residuals are not
all equal, then principal components are not a consistent estimate of factors.
Scalar models are the only case where finite factor models and PCA
coincide; in this case, we can estimate factors with principal components.
In all other cases (1) principal components analysis gives results similar to
but not identical with factor analysis and (2) principal components are not
consistent estimates of factors, though they might approximate factors quite
well. In the next section, we will see that principal components do approxi-
mate factors well in large factor models.
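The role of the scalar assumption can be seen directly at the level of the population covariance matrix. The sketch below is an illustration under stated assumptions, not a proof of the consistency result: the loadings `B`, the residual variances, and the helper names `largest_principal_angle` and `top_eigenvectors` are hypothetical. It compares the space spanned by the loadings with the space spanned by the leading eigenvectors of the covariance matrix, once with equal residual variances and once with unequal ones.

```python
import numpy as np

def largest_principal_angle(A, B):
    """Largest principal angle (radians) between the column spaces of A and B."""
    Qa, _ = np.linalg.qr(A)                      # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)                      # orthonormal basis of span(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return float(np.arccos(np.clip(s.min(), -1.0, 1.0)))

def top_eigenvectors(Sigma, k):
    """Eigenvectors associated with the k largest eigenvalues of Sigma."""
    eigval, eigvec = np.linalg.eigh(Sigma)       # eigenvalues in ascending order
    return eigvec[:, ::-1][:, :k]

rng = np.random.default_rng(1)
N, k = 8, 2
B = rng.normal(size=(N, k))                      # hypothetical factor loadings

# Scalar model: every residual has the same variance.
Sigma_scalar = B @ B.T + 0.5 * np.eye(N)
# Non-scalar model: residual variances differ across series.
Sigma_general = B @ B.T + np.diag(np.linspace(0.5, 2.0, N))

angle_scalar = largest_principal_angle(B, top_eigenvectors(Sigma_scalar, k))
angle_general = largest_principal_angle(B, top_eigenvectors(Sigma_general, k))

print("scalar model angle:    ", angle_scalar)
print("non-scalar model angle:", angle_general)
```

With equal residual variances, adding a multiple of the identity leaves the eigenvectors unchanged, so the leading eigenvectors span the same space as the loadings and the angle is zero up to rounding. With unequal variances the two spaces no longer coincide, which is why principal components can approximate the factors without being a consistent estimate of them.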
Approximate (Large) Factor Models
Thus far we have considered factor models where residuals are uncorre-
lated, and therefore the covariance between the data is due only to factors.
In this case, equation (12.7), Σ = BB' + Ψ (where Ψ is a diagonal matrix),
holds. We can now ask if we can relax this assumption and accept that Ψ
is a nondiagonal matrix. This question is suggested by the fact that, in practice,
large factor models, for example factor models of returns on realistic
data sets, do not yield a diagonal covariance matrix of residuals. Even after
estimating any reasonable number of factors, residuals still exhibit cross correlations.
(^2) See Hans Schneeweiss and Hans Mathes, “Factor Analysis and Principal
Components,” Journal of Multivariate Analysis 55 (1995): 105–124.