B.D. McCullough 1305
28.5 Advanced tests
Look at the table of contents of an advanced econometrics text, or the list of func-
tions in any econometrics software package, and you will see many familiar names:
Kalman filtering, multinomial logit, ARMA, etc. Just as in the case of multivariate
GARCH, with which this chapter opened, different packages will give different
answers to the same problem, and no one has any idea which package, if any, is
correct. This is because there are no benchmarks for any of these procedures.
We have already alluded to different packages giving different answers to the
same FIML problem. In fact, a benchmark was worked out for this problem by
Calzolari and Panattoni (1988). Not many software developers were aware of this,
because it was not advertised as a benchmark. Silk (1996) recognized that it was
benchmark-quality work, and used it as a benchmark for his comparison of software
packages.
When Bollerslev (1986) published the first GARCH article, he did not completely
describe his method, leaving developers to guess at important details. Conse-
quently, while every package soon had a GARCH command, they all gave different
answers. McCullough and Renfro (1999) (henceforth MR) documented this fact,
but also sought to alleviate the problem. Fiorentini, Calzolari and Panattoni (1996)
published an article entitled “Analytic derivatives and the computation of GARCH
estimates.” Recognizing that two of the three authors had already written
benchmark-quality code, MR suspected that this might be code of a similar
quality. Indeed, it was, and MR offered the “FCP GARCH benchmark” on which
many developers quickly converged. MR analyzed seven packages, and found that
four of them could not estimate the FCP GARCH model, and only one of the
remaining three was reasonably accurate. When Brooks, Burke and Persand (2001)
reassessed the situation only two years later, they found that most packages could
estimate the model and do so with reasonable accuracy (but see McCullough and
Vinod, 2003b, for another example of many packages giving different answers to
the same GARCH problem). Developers will converge on a benchmark if one is
available.
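To make "reasonably accurate" concrete, agreement with a benchmark can be scored by the number of significant digits on which a package's estimate matches the benchmark value. The sketch below assumes the log relative error (LRE) as that score; the text above does not say which measure MR or Brooks, Burke and Persand used. The function name and all numbers are hypothetical and are not taken from any package or from the FCP benchmark.

```python
import math


def log_relative_error(estimate, benchmark):
    """Approximate number of significant digits on which estimate matches benchmark."""
    if estimate == benchmark:
        return float("inf")
    if benchmark == 0.0:
        # With a zero benchmark the relative error is undefined,
        # so fall back to the log absolute error.
        return -math.log10(abs(estimate))
    return -math.log10(abs(estimate - benchmark) / abs(benchmark))


# Hypothetical values for illustration only.
benchmark_value = 0.5
package_a = 0.5005  # differs from the benchmark in the fourth significant digit
package_b = 0.45    # differs from the benchmark in the second significant digit

lre_a = log_relative_error(package_a, benchmark_value)
lre_b = log_relative_error(package_b, benchmark_value)
print(f"package A LRE: {lre_a:.2f}")
print(f"package B LRE: {lre_b:.2f}")
```

A larger LRE means closer agreement, so a developer can track convergence toward a benchmark coefficient by coefficient.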
Bruno and De Bonis (2004) wrote a benchmark for a garden-variety panel data
estimator, for both fixed and random effects. They then gave the data to three
software packages. All packages agreed on the fixed effects estimation, but disagreed
on the random effects. Investigation of the matter required correspondence with
the developers because the user guide and reference manuals did not provide much
information on the algorithms employed (which is typical of econometric software
packages). As Bruno and De Bonis (2004, p. 281) discovered, “it is clear that all
the numerical differences produced by the random-effects estimates are caused by
the differences in the small-sample formulas for the computation of the
between-regression variance.” All three packages used consistent estimators, so there was no
theoretical reason to prefer one over the others. The literature provided no guidance,
so Bruno and De Bonis conducted a Monte Carlo study to determine which of the
three estimators had the best finite-sample properties.
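The following sketch illustrates how two consistent estimators of the between-regression variance can differ in a small sample. It is an assumption-laden illustration: the two divisors shown (the number of groups, and the number of groups less the number of estimated parameters) are common textbook choices, not the formulas actually used by the packages that Bruno and De Bonis examined, and the group means are invented.

```python
def between_residual_ss(group_means_x, group_means_y):
    """Residual sum of squares from an OLS regression of the group means
    of y on the group means of x, with an intercept (the between regression)."""
    n = len(group_means_x)
    mean_x = sum(group_means_x) / n
    mean_y = sum(group_means_y) / n
    sxx = sum((x - mean_x) ** 2 for x in group_means_x)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(group_means_x, group_means_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - intercept - slope * x) ** 2
               for x, y in zip(group_means_x, group_means_y))


def between_variance(ssr, n_groups, n_params, small_sample_correction):
    """Between-regression variance under two hypothetical divisor choices."""
    divisor = n_groups - n_params if small_sample_correction else n_groups
    return ssr / divisor


# Hypothetical group means for five cross-sectional units.
x_bar = [1.0, 2.0, 3.0, 4.0, 5.0]
y_bar = [1.1, 1.9, 3.2, 3.8, 5.0]

ssr = between_residual_ss(x_bar, y_bar)
var_uncorrected = between_variance(ssr, len(x_bar), 2, small_sample_correction=False)
var_corrected = between_variance(ssr, len(x_bar), 2, small_sample_correction=True)
print(var_uncorrected, var_corrected)
```

With five groups and two estimated parameters the two estimates differ by a factor of 5/3, yet the ratio of the divisors tends to one as the number of groups grows, which is why asymptotic theory alone cannot rank them.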
The Yule–Walker equations are often used to compute partial autocorrelation
coefficients; the method is presented in most econometrics texts, and most econo-
metric software packages offer the method. What is not presented is that it is the