170 Forecast Combination and Encompassing
be attributed to sampling variability, or whether the differences are statistically
significant once sampling variability has been taken into account. If we assume
that the forecast errors are zero mean, normally distributed and serially uncorrelated
(implying one-step-ahead forecasts), the Morgan–Granger–Newbold test is
uniformly most powerful unbiased. A test of equal accuracy that dispenses with the
restrictive assumptions that underpin this test, including that of squared-error loss,
is due to Diebold and Mariano (1995). Although assessing the relative accuracy of
forecasts is fundamental to forecast evaluation, from a practical or operational perspective
the most useful way of doing this is not to test the null of equal accuracy
but to test whether one set of forecasts encompasses the rival set. A set of forecasts
is said to encompass a rival set if the rival set of forecasts does not contribute to a
statistically significant reduction in forecast loss when used in combination with
the original set of forecasts. Forecast encompassing is due to Chong and Hendry
(1986) and is an application of the principle of encompassing (see, e.g., Mizon and
Richard, 1986; Hendry and Richard, 1989) to the evaluation of forecasts, although
it is formally equivalent to the earlier notion of conditional efficiency of Nelson
(1972) and Granger and Newbold (1973).
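A test of equal accuracy of the kind mentioned above can be sketched as follows. This is a minimal illustration in the spirit of Diebold and Mariano (1995), not a reproduction of their procedure: it assumes one-step-ahead forecasts, so the loss differential is treated as serially uncorrelated and its sample variance stands in for the long-run variance. The function name dm_test_one_step, the default squared-error loss and the normal approximation for the p-value are assumptions made for the example.

```python
import math

import numpy as np


def dm_test_one_step(e1, e2, loss=np.square):
    """Equal-accuracy statistic for two sets of one-step-ahead forecast errors.

    Assumption: the loss differential is serially uncorrelated (one-step
    forecasts), so no autocovariance correction is applied.
    """
    e1 = np.asarray(e1, dtype=float)
    e2 = np.asarray(e2, dtype=float)
    d = loss(e1) - loss(e2)  # loss differential, period by period
    n = d.size
    # Mean loss differential divided by its estimated standard error.
    stat = float(d.mean() / math.sqrt(d.var(ddof=1) / n))
    # Two-sided p-value from the standard normal approximation.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value
```

A negative statistic indicates that the first set of forecasts has the lower average loss. Any other loss can be passed through the loss argument, for example np.abs for absolute-error loss.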
The importance of testing for forecast encompassing (as opposed to equal accuracy)
is that forecast combination is often found to improve forecast accuracy. That
is, a linear combination of two or more forecasts may often yield more accurate
forecasts than using a single forecast. If one is prepared to take a combination
of forecasts rather than requiring that only one is selected, then it matters little
whether one forecast is more or less accurate than another. Regression-based tests
of forecast encompassing are a way of testing whether ex post a linear combination
of forecasts results in a statistically significant reduction in (say) mean squared
forecast error relative to using either individual forecast. Such tests can also be used
as an indicator of when combinations might be useful ex ante. If neither set of
forecasts encompasses the other, then subsequent forecasts should be constructed
as a combination of those of the two individual sets.
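One common way to set up a regression-based encompassing test is sketched below. It assumes a combined forecast of the form (1 - lam) * f1 + lam * f2, so that the first forecast error e1 satisfies e1 = lam * (e1 - e2) + error, and the null that the first set encompasses the second is lam = 0. The function name encompassing_test, the omission of an intercept, the homoskedastic standard error and the normal approximation for the p-value are all assumptions of this illustration rather than the specification used in the text.

```python
import math

import numpy as np


def encompassing_test(actual, f1, f2):
    """Test whether forecasts f1 encompass forecasts f2 under squared-error loss.

    Regresses e1 on (e1 - e2) without an intercept and tests lam = 0.
    Assumption: regression errors are homoskedastic and serially uncorrelated.
    """
    actual = np.asarray(actual, dtype=float)
    e1 = actual - np.asarray(f1, dtype=float)  # errors of the first forecast
    e2 = actual - np.asarray(f2, dtype=float)  # errors of the rival forecast
    x = e1 - e2                                # equals f2 - f1
    sxx = float(np.sum(x * x))
    lam = float(np.sum(x * e1) / sxx)          # OLS estimate of the weight on f2
    resid = e1 - lam * x                       # errors of the combined forecast
    s2 = float(np.sum(resid * resid) / (e1.size - 1))
    t_stat = lam / math.sqrt(s2 / sxx)
    # Two-sided p-value from the standard normal approximation.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return lam, t_stat, p_value
```

Calling encompassing_test(actual, f1, f2) and then encompassing_test(actual, f2, f1) checks both directions. If both nulls are rejected, neither set encompasses the other, which is the case in which the text recommends combining the two sets.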
If the goal is forecasting (as opposed to some other econometric endeavor, such as
building a model that describes or quantifies the relationships between economic
variables), then we suspect it would seldom be the case that the forecaster would be
unwilling to take a combination of the available forecasts and would instead insist
on selecting the single best forecasting model, or set of forecasts. Granger and Jeon
(2004) present forecast combination as an example of “thick modeling,” whereby
the investigator pools the values of estimates of interest (parameter estimates,
impulse responses, or forecasts, etc.) from a number of alternative specifications
rather than seeking to select a single specification. Forecast combination has a long
history; Granger and Jeon (2004) advocate the principle of thick modeling more
generally. The principle of encompassing would suggest forecast encompassing be
used to improve the forecasting model in the sense of re-specifying the model in
an attempt to match the process that generated the data as closely as possible.
However, it has been established that, even if this goal were attainable, it would
not necessarily be beneficial from a forecasting perspective when there are structural breaks
(see, e.g., Clements and Hendry, 2006).