Stephen G. Hall and James Mitchell 227
the KLIC, such as the Berkowitz (2001) LR test, the KLIC can also be minimized
by searching for those weights that minimize $LR_B$. For other goodness-of-fit tests
the relevant test statistic can again be minimized, but since the direct link with
the KLIC is lost, the weights that deliver this minimum cannot be interpreted as
KLIC minimizing. This is because KLIC minimization using tests based on the pits
is only as good as the underlying goodness-of-fit test.
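The weight search described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the outcomes and the two normal component forecasts are hypothetical, the function names are invented for the example, and the statistic is a simplified likelihood ratio test of the transformed pits against iid N(0, 1) that omits the autoregressive term of the Berkowitz (2001) test.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # used to map pits onto the real line


def combined_pit(y, weights, components):
    """pit of outcome y under the weighted combination of the component densities."""
    return sum(w * c.cdf(y) for w, c in zip(weights, components))


def lr_stat(z):
    """LR statistic for H0: z ~ iid N(0, 1) against N(mu, sigma^2).

    Simplified variant (an assumption of this sketch): no autoregressive term.
    """
    T = len(z)
    mu = sum(z) / T
    var = sum((v - mu) ** 2 for v in z) / T
    return -T * math.log(var) - T + sum(v * v for v in z)


def search_weights(outcomes, components, steps=100):
    """Grid search over the weight placed on the first of two components."""
    best_w, best_stat = None, math.inf
    for k in range(steps + 1):
        w = [k / steps, 1.0 - k / steps]
        # Inverse-normal transform of the pits of the combined density forecast.
        z = [STD_NORMAL.inv_cdf(combined_pit(y, w, components)) for y in outcomes]
        stat = lr_stat(z)
        if stat < best_stat:
            best_w, best_stat = w, stat
    return best_w, best_stat


# Hypothetical realisations and two hypothetical component density forecasts.
outcomes = [0.1, -0.4, 0.3, 1.2, -0.8, 0.5, 0.0, -0.2, 1.5, -1.1]
components = [NormalDist(0.0, 0.5), NormalDist(0.0, 2.0)]
best_w, best_stat = search_weights(outcomes, components)
```

A grid is used only because it is transparent with two components; with more components a numerical optimizer over the unit simplex would be the natural replacement.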
This methodology for combining density forecasts is designed to mimic
the optimal combination of point forecasts. It is motivated by the desire to obtain
the most “accurate” density forecast, in a statistical sense, as measured by the KLIC.
The KLIC minimizing weights, $w^{*}$, are the maximum likelihood (ML) estimates
of the weights in (5.40). These ML estimates, which require iteration via the
EM algorithm, are given as (see Hamilton, 1994, p. 688):
\[
w_i^{*} = \frac{1}{T} \sum_{t=1}^{T} \frac{g(y_t \mid \Omega_{i,t-h})\, w_i}{p(y_t \mid \Omega_{t-h})}, \tag{5.47}
\]
where $w_i$ is the probability at the previous iteration that the data are generated
by the $i$th density. This ML interpretation may help in moving from inspection
of combination weights to tests of their statistical significance, by accounting
for their uncertainty using the inverse of the Hessian matrix. This might facilitate
tests for “conditional efficiency” (encompassing) of forecast $i$ relative to its
competitors, tests which have yet to be applied to density forecasts, although a
definition introduced by Clemen, Murphy and Winkler (1995) has been discussed
by Timmermann (2006, p. 176).
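The iteration in (5.47) can be sketched as follows. This is a minimal illustration, assuming the component densities have already been evaluated at the realised outcomes; the data, the two normal forecasters, the equal starting weights and the function names are hypothetical choices made for the example.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of N(mean, sd^2) evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def combine(weights, row):
    """Combined density: weighted sum of the component densities at one outcome."""
    return sum(w * g for w, g in zip(weights, row))


def em_weights(dens, n_iter=500, tol=1e-10):
    """Iterate (5.47). dens[t][i] is component density i evaluated at outcome t."""
    T, N = len(dens), len(dens[0])
    w = [1.0 / N] * N  # equal starting weights (an assumption of this sketch)
    for _ in range(n_iter):
        new_w = [0.0] * N
        for row in dens:
            p = combine(w, row)
            for i in range(N):
                # Probability, given current weights, that outcome t came from density i.
                new_w[i] += row[i] * w[i] / p
        new_w = [v / T for v in new_w]
        converged = max(abs(a - b) for a, b in zip(new_w, w)) < tol
        w = new_w
        if converged:
            break
    return w


def avg_log_score(dens, weights):
    """Average logarithmic score of the combined density."""
    return sum(math.log(combine(weights, row)) for row in dens) / len(dens)


# Hypothetical outcomes and two hypothetical forecasters, N(0, 1) and N(2, 1).
outcomes = [0.1, -0.4, 0.3, 1.2, -0.8, 0.5, 0.0, -0.2]
dens = [[normal_pdf(y, 0.0, 1.0), normal_pdf(y, 2.0, 1.0)] for y in outcomes]
w_star = em_weights(dens)
```

Each pass keeps the weights non-negative and summing to one, and the average logarithmic score of the combination does not fall from one iteration to the next, which is why no explicit constraint handling is needed in the loop.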
5.5.4.1 Bayesian Model Averaging (BMA)
From a Bayesian perspective, the KLIC minimizing weights, (5.46), based on the
logarithmic score, have some superficial similarities with a BMA approach. Geweke
and Amisano (2008) explain the differences. Hoeting et al. (1999), Koop (2003,
Ch. 11) and Geweke and Whiteman (2006) provide recent general discussions of
BMA methods.
BMA offers a conceptually elegant means of dealing with model uncertainty.
BMA is an application of Bayes’ theorem; model uncertainty is incorporated into
the theorem by treating the set of models $S$ as an additional parameter and then
integrating over $S$, where $S \equiv \{S_i, i = 1, \ldots, N\}$ and the models $S_i$ are defined as
continuous density functions $g(y_t \mid \Omega_{i,t-h})$ for the variable of interest $y_t$. BMA
methods, especially approximate ones, also remain feasible even for large $N$,
unlike iterative methods such as (5.46).
Specifically, $p(y_t \mid \Omega_{t-h})$ can be interpreted as the posterior density of $y_t$ given
“data” $\Omega_{t-h}$ and written like (5.40) as:
\[
p_{\mathrm{BMA}}(y_t \mid \Omega_{t-h}) = \sum_{i=1}^{N} w_{i,\mathrm{BMA}}\, g(y_t \mid \Omega_{i,t-h}), \tag{5.48}
\]
where $g(y_t \mid \Omega_{i,t-h}) = \Pr(y_t \mid S_i, \Omega_{t-h})$, and the weights $w_{i,\mathrm{BMA}}$ are the models’
posterior probabilities. As shown by Draper (1995) and Hoeting et al. (1999), these