218 Recent Developments in Density Forecasting
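$z^*_{t|t-1}$ following an AR(1) process: $z^*_{t|t-1} = \mu + \rho z^*_{t-1|t-2} + \varepsilon_t$, where $\varepsilon_t \sim N(0, \sigma^2)$.
The test statistic $LR_B$ is computed as:
$$LR_B = -2\left[L(0, 1, 0) - L(\hat{\mu}, \hat{\sigma}^2, \hat{\rho})\right], \tag{5.26}$$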
where $L(\hat{\mu}, \hat{\sigma}^2, \hat{\rho})$ is the value of the exact log-likelihood of a Gaussian AR(1) model
(e.g., see Hamilton, 1994, p. 119). Under the null, $LR_B \sim \chi^2_3$. The test can be
readily generalized to higher-order AR models; squared (and higher power) lagged
values of $z^*_{t|t-1}$ can also be included in the model in an attempt to pick up
non-linear dependence. The test abstracts from parameter uncertainty, but is expected
to perform well in the small samples typical of macroeconomics; in contrast, the
nonparametric goodness-of-fit tests discussed above rely on larger samples.
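To make the mechanics of the three degrees-of-freedom LR test concrete, the following is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the implementation used by any of the cited authors: the function names (`ar1_exact_loglik`, `profile_mu_sigma2`, `fit_ar1`, `chi2_sf_3df`, `berkowitz_lr`) are hypothetical, the likelihood is written for the stationary case $|\rho| < 1$, and the maximization is done by a simple grid refinement over $\rho$ (with $\mu$ and $\sigma^2$ profiled out) rather than by a numerical optimizer.

```python
import math
import random


def ar1_exact_loglik(z, mu, sigma2, rho):
    """Exact Gaussian AR(1) log-likelihood; requires |rho| < 1."""
    var0 = sigma2 / (1.0 - rho ** 2)   # unconditional variance of z
    mean0 = mu / (1.0 - rho)           # unconditional mean of z
    # The first observation is drawn from the stationary distribution.
    ll = -0.5 * math.log(2 * math.pi * var0) - (z[0] - mean0) ** 2 / (2 * var0)
    # Each remaining observation is conditioned on its predecessor.
    for prev, curr in zip(z, z[1:]):
        resid = curr - mu - rho * prev
        ll += -0.5 * math.log(2 * math.pi * sigma2) - resid ** 2 / (2 * sigma2)
    return ll


def profile_mu_sigma2(z, rho):
    """For a fixed rho, the (mu, sigma2) that maximise the exact likelihood."""
    w = math.sqrt(1.0 - rho ** 2)
    # GLS-style transform: regressing y on x estimates the unconditional mean.
    y = [w * z[0]] + [curr - rho * prev for prev, curr in zip(z, z[1:])]
    x = [w] + [1.0 - rho] * (len(z) - 1)
    mean0 = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    sigma2 = sum((b - mean0 * a) ** 2 for a, b in zip(x, y)) / len(z)
    return mean0 * (1.0 - rho), sigma2


def fit_ar1(z, points=201, rounds=4):
    """Approximate the MLE by repeated grid refinement over rho (an assumption
    of this sketch; a numerical optimizer could be used instead)."""
    lo, hi = -0.999, 0.999
    best = None
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        for i in range(points):
            rho = lo + i * step
            mu, sigma2 = profile_mu_sigma2(z, rho)
            ll = ar1_exact_loglik(z, mu, sigma2, rho)
            if best is None or ll > best[0]:
                best = (ll, mu, sigma2, rho)
        # Narrow the search interval around the best rho found so far.
        lo, hi = max(-0.999, best[3] - step), min(0.999, best[3] + step)
    return best  # (log-likelihood, mu, sigma2, rho)


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-squared(3) variate (closed form)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def berkowitz_lr(z):
    """Return (LR_B, p-value) for the joint null mu = 0, sigma2 = 1, rho = 0."""
    ll_null = ar1_exact_loglik(z, 0.0, 1.0, 0.0)
    ll_alt = fit_ar1(z)[0]
    lr = max(0.0, -2.0 * (ll_null - ll_alt))  # guard against rounding below zero
    return lr, chi2_sf_3df(lr)


if __name__ == "__main__":
    rng = random.Random(0)
    z_good = [rng.gauss(0.0, 1.0) for _ in range(80)]  # simulated, well calibrated
    z_bad = [v + 1.0 for v in z_good]                  # simulated, biased
    print(berkowitz_lr(z_good))
    print(berkowitz_lr(z_bad))
```

The simulated series in the usage block are purely illustrative. In an application, `z` would be the inverse-normal transforms of the probability integral transforms of the realized outcomes, and the p-value relies on the asymptotic $\chi^2_3$ distribution stated above.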
When $h > 1$, recognizing that this LR test is not designed to deal with dependence,
Clements (2004) used a two degrees-of-freedom LR test, which drops the
test for autocorrelation, to evaluate the Bank of England’s year-ahead ($h$ = five
quarters) density forecasts. But this test still assumes independence in the
construction of the likelihood function; since the likelihood function is misspecified,
a robust Wald or LM test might be considered instead (see White, 1982). Alternatively,
Dowd (2008) suggests that the dependence be mopped up by first fitting an
ARMA process to $z^*_{t|t-h}$.
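The two degrees-of-freedom variant can be sketched in a few lines, because dropping the autocorrelation term leaves an i.i.d. Gaussian likelihood whose maximum is available in closed form. The sketch below is an assumption-laden illustration, not Clements' own code: the function name `lr_two_dof` is hypothetical, and the $\chi^2_2$ p-value is only valid if the independence assumed in the likelihood actually holds, which is exactly the caveat raised above for $h > 1$.

```python
import math


def lr_two_dof(z):
    """LR test of mu = 0, sigma2 = 1 for i.i.d. Gaussian z; returns (LR, p-value)."""
    n = len(z)
    mu_hat = sum(z) / n
    sigma2_hat = sum((v - mu_hat) ** 2 for v in z) / n  # ML (not unbiased) variance
    sum_sq = sum(v * v for v in z)
    # -2 * [L(0, 1) - L(mu_hat, sigma2_hat)]; the 2*pi terms cancel.
    lr = max(0.0, sum_sq - n * math.log(sigma2_hat) - n)
    # The chi-squared(2) upper tail has the closed form exp(-x / 2).
    return lr, math.exp(-lr / 2.0)


if __name__ == "__main__":
    # Illustrative values only: a series whose variance is too large.
    print(lr_two_dof([2.0, -2.0, 2.0, -2.0]))
```

A series with sample mean zero and ML variance one gives a statistic of exactly zero, which is a quick sanity check on the formula.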
A criticism of these LR tests is the maintained assumption of normality. They only
have power to detect non-normality through the first two moments. Consequently,
some authors, such as Clements and Smith (2000) and Hall and Mitchell (2004),
have supplemented the LR test with a nonparametric normality test, such as the
Doornik–Hansen test. But, as Bao, Lee and Saltoglu (2007) explain, one can still
construct a Berkowitz-type LR test without maintaining the normality assumption.
They let $\varepsilon_t$ follow a more general distribution, specifically a semi-nonparametric
density, which nests normality. Alternatively, Chen and Fan (2004) generalize
Berkowitz (2001) by proposing the use of copula functions to design tests which
have power against a wider range of alternative processes. Berkowitz also proposed
a censored version of the LR test which focuses on the tails of the forecast density.
Diks, Panchenko and van Dijk (2008) show that the censored LR test can be
biased when it is used to compare alternative density forecasts, rather than just
to test a given model for goodness-of-fit. Promising new joint tests, using autocontours,
which are robust to parameter uncertainty, have been developed by
Gonzalez-Rivera, Senyuz and Yoldas (2007).
The KLIC as the loss function
Despite the apparent choice over which distributional
test to apply to the pit’s, which explains the variety used in extant applied work,
these evaluation tests can all be related to the KLIC. In particular, following Bao,
Lee and Saltoglu (2007), we consider how one of the most popular tests, namely
the Berkowitz (2001) LR test, can be directly related to the KLIC. The KLIC can
therefore be interpreted as the loss function for density forecast evaluation (see
Lee, 2007). As argued by Mitchell and Hall (2005), it offers a unifying framework