14 The Basics of Financial Econometrics
−1 and 1 where the sign indicates the direction of the linear dependence.
So, for example, a correlation coefficient of −1 implies that all pairs (x,y)
are located perfectly on a line with negative slope. This is important for
modeling the regression of one variable on the other. The strength of the
dependence, however, is unaffected by the sign: in general, only the
absolute value of the correlation matters. That absolute value is what we
need to assess how useful it is to assume a linear relationship between the
two variables.
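The role of the sign and of the absolute value can be checked with a few lines of code. The sketch below assumes the standard sample (Pearson) correlation formula; the function name `pearson_r` and the small data sets are made up for illustration and are not taken from the book's tables.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Cross-products and squared deviations from the respective means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Illustrative data only: every pair (x, y) lies exactly on a straight line.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_down = [10.0 - 2.0 * a for a in x]  # line with negative slope
y_up = [3.0 + 0.5 * a for a in x]     # line with positive slope

r_down = pearson_r(x, y_down)
r_up = pearson_r(x, y_up)

print("negative slope:", r_down)
print("positive slope:", r_up)
print("same strength: ", math.isclose(abs(r_down), abs(r_up)))
```

Both lines describe a perfect linear dependence, so the two coefficients differ only in sign while their absolute values coincide.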
When dealing with regression analysis, a problem may arise from data
that appear to be correlated but actually are not. The apparent correlation
results from accidental comovements of the observed series. This effect is
referred to as a spurious regression and is discussed further in Chapter 10.
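One common way to see accidental comovement is to simulate series that are independent by construction and still look correlated. The sketch below is an illustration under stated assumptions, not the treatment given in Chapter 10: it uses independent Gaussian random walks (a choice made here), a hypothetical threshold of 0.5 for a "large" correlation, and 96 observations per series to mirror a monthly sample of eight years.

```python
import math
import random


def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def random_walk(n, rng):
    """Cumulative sum of n independent standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def changes(series):
    """Period-to-period changes of a series."""
    return [b - a for a, b in zip(series, series[1:])]


rng = random.Random(0)      # fixed seed so the run is reproducible
N_TRIALS, N_OBS = 200, 96   # assumed settings for this illustration
THRESHOLD = 0.5             # hypothetical cutoff for a "large" |r|

large_in_levels = 0
large_in_changes = 0
for _ in range(N_TRIALS):
    a = random_walk(N_OBS, rng)
    b = random_walk(N_OBS, rng)  # generated independently of a
    if abs(pearson_r(a, b)) > THRESHOLD:
        large_in_levels += 1
    if abs(pearson_r(changes(a), changes(b))) > THRESHOLD:
        large_in_changes += 1

print(f"|r| > {THRESHOLD} in levels:  {large_in_levels} of {N_TRIALS}")
print(f"|r| > {THRESHOLD} in changes: {large_in_changes} of {N_TRIALS}")
```

The levels of the two walks drift, so their sample correlation is often large in absolute value even though the series are unrelated, whereas the correlation of their changes rarely is.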
Stock return example
As an example, we consider monthly returns of the S&P 500 stock index
for the period January 31, 1996, through December 31, 2003. The data are
provided in Table 2.1. This time span includes 96 observations. To illustrate
the linear dependence between the index and individual stocks, we take the
monthly stock returns of an individual stock, General Electric (GE), cover-
ing the same period. The data are also given in Table 2.1. The correlation
coefficient of the two series is $r^{\text{monthly}}_{\text{S\&P 500, GE}} = 0.7125$, using the formula shown in
Appendix A. This indicates a fairly strong correlation in the same direction
between the stock index and GE. So, we can expect with some confidence that
GE’s stock tends to move in the same direction as the index. Typically, the
price movements of a stock are positively correlated with those of a stock index.
For comparison, we also compute the correlation between these two
series using weekly as well as daily returns from the same period. (The data
are not shown here.) In the first case, we have $r^{\text{weekly}}_{\text{S\&P 500, GE}} = 0.7616$, while in the
latter, we have $r^{\text{daily}}_{\text{S\&P 500, GE}} = 0.7660$. This difference in value is due to the fact
that while the true correlation is some value unknown to us, the correlation
coefficient as a statistic depends on the sample data.
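The dependence of the statistic on the sample can be illustrated with simulated data. The sketch below does not use the S&P 500 or GE returns: it draws pairs from a bivariate normal distribution with a hypothetical true correlation of 0.75, and the sample sizes 96, 416, and 2016 are rough stand-ins assumed here for monthly, weekly, and daily observations over eight years. It only varies the sample; it does not model how the return horizon itself changes with the frequency.

```python
import math
import random


def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def correlated_sample(n, rho, rng):
    """Draw n pairs of standard normals whose true correlation is rho."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        xs.append(x)
        # y mixes x with independent noise so that corr(x, y) = rho.
        ys.append(rho * x + math.sqrt(1.0 - rho ** 2) * e)
    return xs, ys


TRUE_RHO = 0.75  # hypothetical "true" correlation, not an estimate from the book
SAMPLE_SIZES = {"monthly": 96, "weekly": 416, "daily": 2016}  # assumed counts

rng = random.Random(42)  # fixed seed so the run is reproducible
estimates = {}
for label, n in SAMPLE_SIZES.items():
    xs, ys = correlated_sample(n, TRUE_RHO, rng)
    estimates[label] = pearson_r(xs, ys)
    print(f"{label:8s} n={n:5d}  r={estimates[label]:.4f}")
```

Each sample yields a somewhat different estimate even though the underlying correlation is the same in all three cases, and the estimates scatter around the true value rather than reproducing it exactly.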
REGRESSION MODEL: LINEAR FUNCTIONAL RELATIONSHIP
BETWEEN TWO VARIABLES
So far, we have dealt with cross-sectional bivariate data understood as being
coequal variables, x and y. Now we will present the idea of treating one vari-
able as a reaction to the other where the other variable is considered to be
exogenously given. That is, y as the dependent variable depends on the real-
ization of the explanatory variable, x, also referred to as the independent