The relation between implied and realised volatility 221
in historical volatility. The results are striking and provide strong evidence for both the
unbiasedness and efficiency of both IV forecasts. First of all, from the comparison of
univariate and encompassing regressions, the inclusion of historical volatility does not
improve the goodness of fit according to the adjusted R^2. In fact, the slope coefficient
of historical volatility is not significantly different from zero at the 10% level in
the encompassing regressions (3), indicating that both call and put IV subsume all
the information contained in historical volatility. The slope coefficients of both call
and put IV are not significantly different from one at the 10% level and the joint
test of information content and efficiency (γ = 0 and β = 1) does not reject the
null hypothesis, indicating that both IV estimates are efficient and unbiased after a
constant adjustment.
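The joint test described above can be illustrated with a minimal sketch. It assumes the encompassing regression has the form ln(σr) = α + β ln(σi) + γ ln(σh), where σh denotes historical volatility (the symbol and the exact specification are assumptions here, since equation (3) is not restated in this section). The data are synthetic, the helper names `ols` and `wald_stat` are hypothetical, and a plain Wald statistic stands in for whichever test statistic the reported results actually use.

```python
import numpy as np

def ols(y, X):
    """OLS with an intercept; returns coefficients and their covariance matrix."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, cov

def wald_stat(beta, cov, R, r):
    """Wald statistic for H0: R beta = r (asymptotically chi-square, len(r) dof)."""
    diff = R @ beta - r
    return float(diff @ np.linalg.solve(R @ cov @ R.T, diff))

# Synthetic log-volatilities (illustrative only, not the paper's data).
rng = np.random.default_rng(0)
n = 500
ln_iv = rng.normal(-1.5, 0.3, n)                      # implied volatility
ln_hv = 0.6 * ln_iv + rng.normal(-0.6, 0.2, n)        # historical volatility
ln_rv = -0.05 + 1.0 * ln_iv + rng.normal(0, 0.1, n)   # realised volatility

# Encompassing regression: coefficients ordered as [alpha, beta, gamma].
beta, cov = ols(ln_rv, np.column_stack([ln_iv, ln_hv]))

# Joint null hypothesis: beta = 1 and gamma = 0.
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
r = np.array([1.0, 0.0])
w = wald_stat(beta, cov, R, r)
print("alpha, beta, gamma:", beta)
print("Wald statistic for (beta = 1, gamma = 0):", w)
```

A small statistic relative to the chi-square critical value with two degrees of freedom means the joint null is not rejected, which is the pattern described in the text.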
In order to see if put IV has more predictive power than call IV, we test the joint
hypothesis γ = 0 and β = 1 in augmented regression (4). The joint test does not
reject the null hypothesis. We see that the slope coefficient of put IV is significantly
different from zero only at the 5% level, while the slope coefficient of call IV is not
significantly different from zero. As an additional test we regress ln(σc) on ln(σp)
(and, symmetrically, ln(σp) on ln(σc)) and retrieve the residuals. We then run
univariate regression (2) for ln(σc) (ln(σp)), adding as an explanatory variable the
residuals from the regression of ln(σc) on ln(σp) (ln(σp) on ln(σc)). The residuals are
significant only in the regression of ln(σr) on ln(σc), which suggests that put IV contains
slightly more information on future realised volatility than call IV.
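The residual-based check can be sketched as follows, under stated assumptions: the data are synthetic and constructed so that call IV equals put IV plus noise, and the helper `ols_fit` is hypothetical. The sketch follows the pairing described above, where the regression on ln(σc) is augmented with the residuals from regressing ln(σc) on ln(σp), and vice versa.

```python
import numpy as np

def ols_fit(y, X):
    """OLS with an intercept; returns coefficients, standard errors, residuals."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se, resid

# Synthetic data (illustrative only): put IV drives realised volatility,
# call IV is a noisy copy of put IV.
rng = np.random.default_rng(1)
n = 500
ln_put = rng.normal(-1.5, 0.3, n)
ln_call = ln_put + rng.normal(0, 0.05, n)
ln_rv = -0.05 + ln_put + rng.normal(0, 0.1, n)

# Residuals of call IV on put IV, added to the regression of RV on call IV.
_, _, resid_call_on_put = ols_fit(ln_call, ln_put)
b_c, se_c, _ = ols_fit(ln_rv, np.column_stack([ln_call, resid_call_on_put]))
t_in_call_reg = b_c[2] / se_c[2]

# Residuals of put IV on call IV, added to the regression of RV on put IV.
_, _, resid_put_on_call = ols_fit(ln_put, ln_call)
b_p, se_p, _ = ols_fit(ln_rv, np.column_stack([ln_put, resid_put_on_call]))
t_in_put_reg = b_p[2] / se_p[2]

print("t-stat of residuals in the call IV regression:", t_in_call_reg)
print("t-stat of residuals in the put IV regression:", t_in_put_reg)
```

In this construction the residuals matter only in the call IV regression, because only there do they carry information that the included IV series lacks. This mirrors the interpretation given in the text, though the magnitudes are artefacts of the synthetic data.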
A possible concern is the problem of data snooping, which occurs when the
properties of a data set influence the choice of the estimator or test statistic (see
e.g., [7]) and may arise in a multiple regression model, when a large number of
explanatory variables are compared and the selection of the candidate variables is
not based on a financial theory (e.g., in [20] 3654 models are compared to a given
benchmark, in [18] 291 explanatory variables are used in a multiple regression). This
is not the case in our regressions, since (i) we do not have any parameter to estimate,
(ii) we use only three explanatory variables: historical volatility, call IV and put IV,
that are compared pairwise in the regressions and (iii) the choice has been made on the
theory that IV, being derived from option prices, is a forward-looking measure of ex
post realised volatility and is regarded as the market’s expectation of future volatility.
Finally, in order to test the robustness of our results and see if IV has been measured
with errors, we adopt an instrumental variable procedure and run a two-stage least
squares regression. The Hausman [11] specification test, reported in the last column of Table 2,
indicates that the errors-in-variables problem is not significant in univariate regressions
(2), in encompassing regressions (3) or in augmented regression (4).^1 Therefore we
can trust the OLS regression results.
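A minimal sketch of the two-stage least squares procedure and a Hausman-type statistic for the slope coefficient is given below. Everything in it is an assumption made for illustration: the data are synthetic, the instrument is a hypothetical noisy proxy for the latent volatility (the instruments actually used are not named in this section), the function `hausman_slope` is hypothetical, and the OLS residual variance is used for both estimators so that the variance difference is non-negative.

```python
import numpy as np

def hausman_slope(y, x, z):
    """Return (OLS slope, 2SLS slope, Hausman statistic) for y = a + b*x, instrument z."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    Z = np.column_stack([np.ones(n), z])
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    # First stage: project the possibly mismeasured regressor on the instrument.
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    X_hat = np.column_stack([np.ones(n), x_hat])
    # Second stage: regress y on the fitted values.
    b_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    # Error variance estimated from OLS (consistent under the null of no
    # measurement error); sharing it keeps v_2sls - v_ols non-negative.
    resid = y - X @ b_ols
    sigma2 = resid @ resid / (n - 2)
    v_ols = sigma2 * np.linalg.inv(X.T @ X)[1, 1]
    v_2sls = sigma2 * np.linalg.inv(X_hat.T @ X_hat)[1, 1]
    stat = (b_2sls[1] - b_ols[1]) ** 2 / (v_2sls - v_ols)
    return b_ols[1], b_2sls[1], stat

# Synthetic data (illustrative only).
rng = np.random.default_rng(2)
n = 2000
true_iv = rng.normal(-1.5, 0.3, n)                # latent log IV
instrument = true_iv + rng.normal(0, 0.2, n)      # hypothetical instrument
ln_rv = -0.05 + true_iv + rng.normal(0, 0.1, n)   # log realised volatility

# Case 1: IV observed with measurement error, so OLS is attenuated.
noisy_iv = true_iv + rng.normal(0, 0.15, n)
ols_noisy, tsls_noisy, stat_noisy = hausman_slope(ln_rv, noisy_iv, instrument)

# Case 2: IV observed without error, so OLS and 2SLS should agree.
ols_clean, tsls_clean, stat_clean = hausman_slope(ln_rv, true_iv, instrument)

print("with measurement error:", ols_noisy, tsls_noisy, stat_noisy)
print("without measurement error:", ols_clean, tsls_clean, stat_clean)
```

The statistic is compared with a chi-square distribution with one degree of freedom. A small value, as in the second case, indicates that OLS and 2SLS estimates do not differ systematically, which is the situation in which OLS results can be relied upon.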
In our sample both IV forecasts obtain almost the same performance, with put IV
marginally better than call IV. These results are very different from the ones obtained
both in [3] and in [9]. The difference can possibly be attributed to the option exercise
feature, which in our case is European and not American, and to the underlying index
(^1) In augmented regression (4) the instrumental variables procedure is used for the variable
ln(σp).