least amount of human inputs. A good example is the machine learning
method that uses computerized algorithms to discover the knowledge (pat-
tern or rule) inherent in data. Advances in modeling technology such as
artificial intelligence, neural networks, and genetic algorithms fit into this
category. The beauty of this approach is its vast degree of freedom. As
explained in the previous chapter, there are none of the restrictions that are
often explicitly specified in traditional, linear, stationary models.
Of course, researchers should not rely excessively on the power of the
method itself. Learning is impossible without knowledge. Even if a researcher
wants to simply throw data into a financial econometric model and expect it
to spit out the answer, he or she needs to provide some background knowl-
edge, such as the justification and types of input variables. There are still
numerous occasions that require researchers to make justifiable decisions.
For example, a typical way of modeling stock returns is using the following
linear form:
$$R_{it} = a + b_{1t} F_{1it} + b_{2t} F_{2it} + \cdots + b_{nt} F_{nit} + \varepsilon_{it} \qquad (15.1)$$
where $R_{it}$ = excess return (over a benchmark return) for the ith security in period t
$F_{jit}$ = jth factor return value for the ith security in period t
$b_{jt}$ = the market-wide payoff for factor j in period t
$\varepsilon_{it}$ = error (idiosyncratic) term for the ith security in period t
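As a rough illustration of how equation (15.1) might be estimated for a single period, the sketch below fits the intercept and the factor payoffs by ordinary least squares across securities. It assumes a cross-sectional reading of the equation (one regression per period t), which the text does not spell out. The function name `estimate_factor_payoffs`, the simulated data, and the "true" coefficients are all hypothetical and serve only to make the example runnable.

```python
import numpy as np


def estimate_factor_payoffs(returns, factors):
    """Cross-sectional OLS fit of equation (15.1) for one period t.

    returns : array of shape (N,), excess returns R_it for N securities
    factors : array of shape (N, n), factor values F_jit (column j = factor j)

    Returns the intercept a and the vector of payoffs (b_1t, ..., b_nt).
    """
    n_securities = returns.shape[0]
    # Prepend a column of ones so the first coefficient is the intercept a.
    design = np.column_stack([np.ones(n_securities), factors])
    coef, *_ = np.linalg.lstsq(design, returns, rcond=None)
    return coef[0], coef[1:]


# Simulated data for one period (assumption: illustrative values only).
rng = np.random.default_rng(0)
N, n = 500, 3
F = rng.normal(size=(N, n))                       # factor values F_jit
true_a = 0.01                                     # hypothetical intercept
true_b = np.array([0.5, -0.2, 0.1])               # hypothetical payoffs b_jt
R = true_a + F @ true_b + rng.normal(scale=0.05, size=N)

a_hat, b_hat = estimate_factor_payoffs(R, F)
resid = R - (a_hat + F @ b_hat)                   # estimated error terms

print("intercept:", a_hat)
print("payoffs:  ", b_hat)
```

Repeating this fit period by period would give a time series of estimated payoffs for each factor. Which factors belong in the `factors` array is precisely the decision the researcher has to justify.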
Trade-Off between Better Estimations and Prediction Errors
Undoubtedly, in testing and estimating equation (15.1), the first task is to
decide which and how many explanatory variables should be included. This
decision poses no problem when the test is justified by an ex ante
hypothesis based on financial economics. Theories in financial economics,
however, are often developed with abstract concepts that need to be
measured by alternative proxies. The choice of proper proxies, while getting
dangerously close to data snooping, makes the determination of both the
type and the number of explanatory variables an art rather than a science.
The choice of a particular proxy based on the rationale “Because it works!”
is not sufficient unless it is first backed up by the theory.
One rule of thumb is to be parsimonious. A big model is not necessarily
better, especially in the context of predictable risk-adjusted excess return.
While the total power of explanation increases with the number of vari-
ables (size) in the model, the marginal increase of explanatory power drops
quickly after some threshold. Whenever a new variable is introduced, what