Factor Analysis and Principal Components Analysis 263
finite number of eigenvalues grows without bounds, all other eigenvalues
are bounded and residuals are correlated.
Note that APT can be rigorously defined only for an infinite market.
The theory of approximate factor models has been extended to allow a
more general setting where residuals and factors can be autocorrelated.^6
Approximate Factor Models and PCA
A key finding of the theory of approximate factor models is that factors of
approximate factor models are unique and can be estimated and identified
with principal components. This result requires a limit structure, because
principal components of an infinite market cannot be defined directly.
Instead, one defines a limit process such that the limit of the principal
components of growing markets coincides, in a suitable sense, with the limit
of the factors. Hence we can say that, in infinite approximate factor
models, factors are unique and can be estimated with principal components.
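The convergence of principal components to factors can be illustrated with a small simulation. The following is a minimal sketch, not a method taken from the text: it assumes i.i.d. Gaussian factors, loadings, and residuals, and the function names (`simulate_factor_returns`, `pca_factors`, `spanning_r2`) are hypothetical. Because principal components can recover the factors only up to an invertible linear transformation, the sketch measures how well the estimated components span the true factors by regression rather than comparing them one by one.

```python
import numpy as np

def simulate_factor_returns(n_assets, n_obs, n_factors, seed=0):
    """Simulate returns X = F B' + E (assumed i.i.d. Gaussian F, B, and E)."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n_obs, n_factors))
    loadings = rng.standard_normal((n_assets, n_factors))
    noise = rng.standard_normal((n_obs, n_assets))
    returns = factors @ loadings.T + noise
    return returns, factors

def pca_factors(returns, n_factors):
    """Estimate factors as the first principal components of demeaned returns."""
    centered = returns - returns.mean(axis=0)
    # SVD of the T x N data matrix; singular values come in decreasing order
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_factors] * s[:n_factors]

def spanning_r2(true_factors, estimated_factors):
    """R^2 from regressing each true factor on the estimated factors."""
    f = true_factors - true_factors.mean(axis=0)
    g = estimated_factors - estimated_factors.mean(axis=0)
    coef, *_ = np.linalg.lstsq(g, f, rcond=None)
    resid = f - g @ coef
    return 1.0 - resid.var(axis=0) / f.var(axis=0)

# As the cross-section grows, the principal components should span
# the true factor space more and more closely (R^2 approaching 1).
for n_assets in (10, 100, 1000):
    returns, factors = simulate_factor_returns(n_assets, n_obs=1000, n_factors=3)
    estimated = pca_factors(returns, n_factors=3)
    print(n_assets, np.round(spanning_r2(factors, estimated), 3))
```

Under these assumptions, the regression fit should improve as the number of assets increases, which is the finite-sample counterpart of the limit statement above.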
How can we realistically apply the theory of approximate factor
models, for example to stock returns, given that any market is finite? The
answer is that the theory is a good approximation for large
factor models with a large number of long time series. For example, it is not
unusual at major investment management firms to work with a universe of
stocks that might include more than 1,000 return processes, each with more
than 1,000 daily returns.
When working with large models, global factors are associated with large
eigenvalues and local factors with small eigenvalues. The separation between
large and small eigenvalues is not as clear cut as the theoretical separation
between infinite and bounded eigenvalues. However, criteria have been
proposed to make the distinction highly reasonable. Some criteria are essentially
model selection criteria. Model selection criteria choose the optimal number
of factors as the optimal compromise between reducing the magnitude of the
residuals and the complexity of the model, that is, the number of parameters
to estimate. This is the strategy adopted in Bai and Ng.^7 Other criteria are
based on the distribution of the eigenvalues of large matrices, as in Onatski.^8
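The trade-off between residual magnitude and model complexity can be sketched with one of the information criteria proposed by Bai and Ng, commonly written as IC_p2(k) = ln V(k) + k ((N + T)/(NT)) ln min(N, T), where V(k) is the average squared residual after removing the first k principal components. The code below is an illustrative sketch under stated assumptions: the function name `bai_ng_icp2` is hypothetical, the returns are demeaned before estimation, and the simulated data (three Gaussian factors) are invented purely for the example.

```python
import numpy as np

def bai_ng_icp2(returns, max_factors):
    """Pick the number of factors that minimizes the IC_p2 criterion.

    returns: T x N array of returns (rows are dates, columns are assets).
    """
    n_obs, n_assets = returns.shape
    centered = returns - returns.mean(axis=0)
    # Squared singular values are the eigenvalues of X'X, in decreasing order
    eig = np.linalg.svd(centered, compute_uv=False) ** 2
    total = eig.sum()
    # Penalty per factor: grows with model complexity, shrinks as N and T grow
    penalty = (n_assets + n_obs) / (n_assets * n_obs) * np.log(min(n_assets, n_obs))
    criteria = []
    for k in range(max_factors + 1):
        # Average squared residual after removing the first k components
        v_k = (total - eig[:k].sum()) / (n_assets * n_obs)
        criteria.append(np.log(v_k) + k * penalty)
    criteria = np.array(criteria)
    return int(np.argmin(criteria)), criteria

# Hypothetical data: 200 assets, 300 dates, 3 true factors
rng = np.random.default_rng(1)
n_obs, n_assets, true_k = 300, 200, 3
factors = rng.standard_normal((n_obs, true_k))
loadings = rng.standard_normal((n_assets, true_k))
returns = factors @ loadings.T + rng.standard_normal((n_obs, n_assets))

k_hat, criteria = bai_ng_icp2(returns, max_factors=8)
print("selected number of factors:", k_hat)
```

Each additional component lowers ln V(k), but the penalty term grows linearly in k, so the minimum marks the point where a further factor no longer reduces the residuals enough to justify the extra parameters. Criteria based on the eigenvalue distribution, such as Onatski's, follow a different logic and are not shown here.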
(^6) Jushan Bai, “Inferential Theory for Factor Models of Large Dimensions,”
Econometrica 71 (2003): 135–171.
(^7) Jushan Bai and Serena Ng, “Determining the Number of Factors in Approximate
Factor Models,” Econometrica 70 (2002): 191–221.
(^8) Alexei Onatski, “Determining the Number of Factors from Empirical Distribution
of Eigenvalues,” Review of Economics and Statistics 92 (2010): 1004–1016.