forecasts are never performed on the training data, clearly the methodology
is tested on the same data on which the models are learned.
Other problems with data snooping stem from the psychology of modeling. A key precept that helps to avoid biases is the following: Modeling
hunches should be based on theoretical reasoning and not on looking at
the data. This statement might seem inimical to an empirical enterprise, an
example of the danger of “clear reasoning” mentioned above. Still, it is true
that by looking at data too long one might develop hunches that are sample-specific. There is some tension between looking at empirical data to discover
how they behave and avoiding capturing the idiosyncratic behavior of the
available data.
Clearly simplicity (i.e., having only a small number of parameters to
calibrate) is a virtue in modeling. A simple model that works well should
be favored over a complex model that might produce unpredictable results.
Nonlinear models in particular are always subject to the danger of unpredictable, chaotic behavior.
Model Risk
As we have seen, any model choice might result in biases and poor performance. In other words, any model selection process is subject to model risk.
One might well ask if it is possible to mitigate model risk. In statistics, there
is a long tradition, initiated by the eighteenth-century English mathematician Thomas Bayes, of treating as uncertain not only individual outcomes but also the probability distribution itself. It is therefore natural to ask whether ideas from Bayesian statistics and related concepts could be applied to mitigate model risk.
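One standard way to make this concrete is Bayesian model averaging; the notation below is ours, not the book's. Each candidate model M_k, for k = 1, ..., K, is assigned a posterior probability given the observed data D, and the predictive distribution of a future outcome y is the posterior-weighted mixture

\[
p(y \mid D) = \sum_{k=1}^{K} p(y \mid M_k, D)\, p(M_k \mid D),
\qquad
p(M_k \mid D) = \frac{p(D \mid M_k)\, p(M_k)}{\sum_{j=1}^{K} p(D \mid M_j)\, p(M_j)} .
\]

In this way no single model is trusted completely, and model uncertainty is carried through to the forecasts.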
A simple idea that is widely used in practice is to take the average of
different models. This idea can take different forms. There are two principal
reasons for applying this form of model risk mitigation. First, we might be uncertain as to which model is best, and so mitigate risk by diversification. Second, and perhaps more cogent, we might believe that different models will perform
differently under different circumstances. By averaging, the modeler hopes
to reduce the volatility of the forecasts. It should be clear that averaging model results and working to produce an average model (i.e., averaging the coefficients) are two different techniques. The level of difficulty involved is
also different.
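To make the distinction concrete, the following is a minimal sketch in Python; it is not taken from the text, and the data, variable names, and the choice of ordinary least squares and ridge regression as the two estimation techniques are purely illustrative. The same linear specification is fitted with two techniques and then combined once by averaging the forecasts and once by averaging the coefficients into a single average model.

import numpy as np

# Illustrative data: n observations of k predictors and a target
# (think of lagged factors used to forecast next-period returns).
rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.standard_normal((n, k))
y = X @ np.array([0.5, -0.2, 0.1]) + 0.1 * rng.standard_normal(n)

# Add an intercept column to the design matrix.
Z = np.column_stack([np.ones(n), X])

# Model 1: the specification estimated by ordinary least squares.
beta_ols, *_ = np.linalg.lstsq(Z, y, rcond=None)

# Model 2: the same specification estimated with a different technique,
# here ridge regression (least squares with an L2 penalty; for simplicity
# the penalty is applied to all coefficients, including the intercept).
lam = 1.0
beta_ridge = np.linalg.solve(Z.T @ Z + lam * np.eye(k + 1), Z.T @ y)

# A new observation for which a forecast is needed.
z_new = np.concatenate([[1.0], rng.standard_normal(k)])

# Technique 1: average the forecasts of the individual models.
forecast_avg = 0.5 * (z_new @ beta_ols) + 0.5 * (z_new @ beta_ridge)

# Technique 2: average the coefficients to obtain a single "average model".
# This requires the models to share the same parameterization, which is
# one reason it is generally the harder of the two approaches.
beta_avg = 0.5 * (beta_ols + beta_ridge)
forecast_from_avg_model = z_new @ beta_avg

# For linear models combined with equal weights the two routes coincide;
# for nonlinear or structurally different models only forecast averaging
# remains straightforward.
print(forecast_avg, forecast_from_avg_model)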
Averaging results is a simple matter. One estimates different models with
different techniques, makes forecasts, and then averages the forecasts. This
simple idea can be extended to different contexts. For example, in a financial
econometric model developed for rating stocks, the modeler might want to
apply exponential averaging to past ratings, so that the proposed rating