B.D. McCullough 1317
authors pledge to provide their data and code to researchers wishing to replicate
the published results, due to the obvious incentive-compatibility problems. They
recommended a mandatory data/code archive, whereby authors would have to
deposit their data and code prior to publication. The American Economic Review,
which published the article, nevertheless adopted an honor system. McCullough
and Vinod (2003a) attempted to replicate every article in a single issue of the American Economic Review and discovered that half the authors would not honor their
pledges; the percentage of compliant authors at other journals with honor systems
was much lower.
Under the then editor, Ben Bernanke, in direct response to McCullough and
Vinod (2003a), the American Economic Review adopted a mandatory data/code
archive (Bernanke, 2004). Many journals followed suit: Econometrica, 2005; Review
of Economic Studies, 2005; Journal of Political Economy, 2006; Spanish Economic
Review, 2007; Canadian Journal of Economics, 2008; and the Review of Economics
and Statistics, 2009. More can be expected to follow. It should be noted that simply
having a mandatory data/code archive is no guarantee that replicable research will be
published (see McCullough, McGeary and Harrison, 2006, 2008, for details). The
topic of replication in economics, including its relation to software, is covered in
great detail in Anderson et al. (2008).
As more data, code and published results (read "inputs and alleged outputs")
become available, more and more code will be run on more than one econometric
software package, both uncovering discrepancies that need to be resolved and
verifying that two different programs give the same answer to the same problem
(increasing our confidence that both programs are correct). When discrepancies are
uncovered, software developers are generally willing to fix them. The
net result will be more accurate software, as evidenced by some cases we have
discussed here: Drukker and Guan (2003), Zeileis and Kleiber (2005) and Bruno
and De Bonis (2004).
While some journals have always been willing to publish software reviews that
address accuracy issues, software reviews carry little professional credit, and until
recently practically no journal would publish articles on accuracy. Computational
Statistics and Data Analysis, Computational Statistics, the International Journal of Forecasting and the Journal of Statistical Software have all published such articles in recent
years. So there are outlets for persons willing to do the computational work of cre-
ating benchmarks. And there will be a much greater need for it as progress on
the replication of economic research leads to the identification of fruitful areas for
developing such benchmarks.
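Such benchmarks often take a simple form: compute the same quantity two ways on data chosen to stress the algorithm, and count the digits of agreement against a known answer. As a minimal sketch (not from this chapter; the data and function names are illustrative), consider the classic case of the numerically unstable one-pass "textbook" variance formula, which loses digits when the data have a large mean relative to their spread:

```python
def variance_naive(xs):
    # One-pass "textbook" formula: (sum of squares - n*mean^2)/(n-1).
    # Subject to catastrophic cancellation when mean >> standard deviation.
    n = len(xs)
    s, s2 = 0.0, 0.0
    for x in xs:
        s += x
        s2 += x * x
    return (s2 - s * s / n) / (n - 1)

def variance_two_pass(xs):
    # Numerically stable two-pass formula: subtract the mean first.
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

# Benchmark-style check: shifting the data by a constant should not
# change the variance, but the naive formula loses essentially all
# significant digits on the shifted data.
base = [0.1 * i for i in range(10)]
shifted = [x + 1e8 for x in base]

exact = variance_two_pass(base)     # reference value from the easy case
naive = variance_naive(shifted)     # badly wrong due to cancellation
stable = variance_two_pass(shifted) # agrees with the reference value
```

A real benchmark, such as those in NIST's StRD, replaces the hand-computed reference value with certified values carried to many significant digits, so that any package's output can be graded by its count of correct digits.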
Acknowledgments
The author would like to thank Houston Stokes for useful comments on this chapter.
Notes
- Because using the StRD is much more interesting than, and not nearly so tedious as,
testing random number generators or statistical distributions, some of these authors
apply only the StRD.