228 Recent Developments in Density Forecasting
weights are given as:
$$w_{i,\mathrm{BMA}} = \Pr(S_i \mid \Omega_{t-h}) = \frac{\Pr(\Omega_{t-h} \mid S_i)\,\Pr(S_i)}{\sum_{i=1}^{N} \Pr(\Omega_{t-h} \mid S_i)\,\Pr(S_i)}, \tag{5.49}$$
where all probabilities are implicitly conditional on the set of all models $S$ under
consideration.
The posterior probabilities, $w_{i,\mathrm{BMA}}$, provide a natural means of ranking the $N$
models, which relates to the discussion above about the comparison of alternative
density forecasts. The weight $w_{i,\mathrm{BMA}}$ indicates the probability that model $i$ is the best model in
a KLIC sense (see, e.g., Fernandez-Villaverde and Rubio-Ramirez, 2004).
Equal weights combination (see section 5.5.3) attaches equal (prior) weight to
each model with no updating of the weights based on the “data.”
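As a minimal sketch of how the posterior weights in (5.49) might be computed, the snippet below works in logs to avoid numerical underflow. The function name `bma_weights` and the log marginal likelihood values are hypothetical and purely illustrative; the default prior is the equal-weights prior, and all priors are assumed to be strictly positive.

```python
import math

def bma_weights(log_marginal_liks, priors=None):
    """Posterior model probabilities in the spirit of (5.49), computed in logs."""
    n = len(log_marginal_liks)
    if priors is None:
        priors = [1.0 / n] * n  # equal prior weight on each model
    # log of Pr(data | S_i) * Pr(S_i) for each model (priors must be > 0)
    log_joint = [ll + math.log(p) for ll, p in zip(log_marginal_liks, priors)]
    # subtract the maximum before exponentiating to avoid underflow
    m = max(log_joint)
    unnorm = [math.exp(v - m) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical log marginal likelihoods for three models (illustrative only)
log_ml = [-520.3, -518.9, -525.0]
weights = bma_weights(log_ml)

# Rank the models from most to least probable a posteriori
ranking = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
```

If every model has the same marginal likelihood, the data carry no information for discriminating between models and the posterior weights coincide with the equal prior weights.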
A relationship between (5.47) and (5.49) is apparent when (i) $\Pr(S_i) = w_i$, and (ii)
$\Pr(\Omega_{t-1} \mid S_i) = g(y_{t-1} \mid \Omega_{i,t-2})$, so that in both the “no parameters” and univariate
cases the log density of $\Omega_{t-1}$, conditional on model $i$, equals the logarithmic score.^10
More generally, $\Pr(\Omega_{t-1} \mid S_i)$ is specified only up to unknown parameters (in
forecasting model $i$) and the logarithmic integrated likelihood can now be viewed as
the relevant scoring rule. Further, as discussed by Andersson and Karlsson (2007),
when combining forecasts from different multivariate models, in-sample measures
of fit based on the marginal likelihood for the system differ from measures of
forecasting performance based on the logarithmic score for the variable of interest.
Geweke and Amisano (2008) discuss how the properties of $w_{i,\mathrm{BMA}}$ differ from those
of $w_i^*$, considering the case when the “true” model is not in the set of $N$ models
under consideration. Unlike $w_{i,\mathrm{BMA}}$, the $w_i^*$ do not then necessarily tend to zero or one
asymptotically.
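The contrast can be illustrated with a small simulation. The sketch below assumes that $w_i^*$ denotes the weights that maximize the logarithmic score of the linear pool, and that the BMA weights use equal priors with each model's likelihood given by the product of its predictive densities (the “no parameters” case). The data-generating process, the two candidate models and all function names are hypothetical choices made for illustration; the log score is maximized with the standard EM fixed-point update for mixture weights, which is one of several possible optimizers.

```python
import math
import random

def normal_pdf(y, mean):
    """Density of N(mean, 1) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2) / math.sqrt(2.0 * math.pi)

def pool_log_score(dens, w):
    """Logarithmic score of the linear pool with weights w."""
    return sum(math.log(sum(wi * g for wi, g in zip(w, row))) for row in dens)

def optimal_pool_weights(dens, n_iter=200):
    """Log-score-maximizing pool weights via the EM update for mixture weights."""
    n = len(dens[0])
    w = [1.0 / n] * n
    for _ in range(n_iter):
        new_w = [0.0] * n
        for row in dens:
            mix = sum(wi * g for wi, g in zip(w, row))
            for i in range(n):
                new_w[i] += w[i] * row[i] / mix
        w = [v / len(dens) for v in new_w]
    return w

def bma_weights(dens):
    """Equal-prior BMA weights from cumulative log predictive densities."""
    n = len(dens[0])
    log_ml = [sum(math.log(row[i]) for row in dens) for i in range(n)]
    m = max(log_ml)
    u = [math.exp(v - m) for v in log_ml]
    total = sum(u)
    return [v / total for v in u]

# Hypothetical setup: data come from an equal mixture of N(-1, 1) and N(1, 1),
# while the model set contains only the two individual normal densities.
rng = random.Random(0)
ys = [rng.gauss(rng.choice([-1.0, 1.0]), 1.0) for _ in range(2000)]
dens = [[normal_pdf(y, -1.0), normal_pdf(y, 1.0)] for y in ys]

w_star = optimal_pool_weights(dens)
w_bma = bma_weights(dens)
print("pool weights:", w_star)
print("BMA weights: ", w_bma)
```

In a setup of this kind the pool weights are expected to stay in the interior of the unit interval, whereas the BMA weights are driven by the cumulative difference in log scores, which typically grows in magnitude with the sample size.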
In practice, when the density forecasts are model-based, approximate Bayesian
methods based on information criteria are often used to proxy $w_{i,\mathrm{BMA}}$ (see Garratt
et al., 2003; Garratt et al., 2006, Ch. 7; Kapetanios, Labhard and Price, 2008). These
methods measure the fit of the models, corrected in line with their parsimony,
such that:
$$w_{i,\mathrm{BMA}} = \frac{\exp(\Delta_i)}{\sum_{i=1}^{N} \exp(\Delta_i)} \quad (i = 1, \ldots, N), \tag{5.50}$$
where $\Delta_i = IC_i - \max_j(IC_j)$ and $IC_i = \sum_{t=h+1}^{T} \ln g(y_t \mid \Omega_{i,t-h}) - K_i$ is the
information criterion for model $i$, such that $\sum_{t=h+1}^{T} \ln g(y_t \mid \Omega_{i,t-h})$ is the maximized
value of the log-likelihood (or logarithmic score) and $K_i$ is a penalty term for
overparameterization. Therefore $\Delta_i = 0$ for the best density and is negative for the other
density forecasts; the more negative $\Delta_i$, the less plausible is density $i$ as the best density.
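The weights in (5.50) can be sketched as follows, taking the cumulative log scores and the penalty terms $K_i$ as given. The function name `ic_weights` and the numerical inputs are hypothetical. With $\Delta_i = IC_i - \max_j(IC_j)$, the best model has $\Delta_i = 0$ and every other model has $\Delta_i \le 0$, so the exponentials cannot overflow.

```python
import math

def ic_weights(log_scores, penalties):
    """Information-criterion weights in the spirit of (5.50)."""
    # IC_i = cumulative log score minus the penalty K_i (larger is better)
    ic = [ls - k for ls, k in zip(log_scores, penalties)]
    best = max(ic)
    # Delta_i = IC_i - max_j IC_j: zero for the best model, negative otherwise
    delta = [v - best for v in ic]
    unnorm = [math.exp(d) for d in delta]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical cumulative log scores and penalty terms for three models
log_scores = [-210.4, -208.1, -207.9]
penalties = [2.0, 4.0, 9.0]
weights = ic_weights(log_scores, penalties)
best_model = max(range(len(weights)), key=lambda i: weights[i])
```

In this illustration the third model has the highest log score but also the largest penalty, so it does not receive the largest weight.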
Popular choices are to set $K_i$ equal to the number of freely estimated parameters in
model $i$ ($k_i$), so that $IC_i$ equals the Akaike criterion, or to set $K_i = (k_i/2)\ln(T)$, so
that $IC_i$ equals the Schwarz Bayesian information criterion. The Schwarz weights
are asymptotically optimal when the true model lies in the set of $N$ models under
consideration; otherwise the Akaike weights are likely to perform better. $w_{i,\mathrm{BMA}}$ in
(5.50) can be interpreted as the probability that the model is the best approximation