
82 The Basics of Financial Econometrics


variable and let all other independent variables be equal to zero. Now, estimate
a regression with only this particular independent variable and see whether
its regression coefficient seems unreasonable. If its sign is counterintuitive
or its value appears too small or too large, one may want to consider removing
that independent variable from the regression. The reason may very well be
attributable to multicollinearity. Technically, multicollinearity is caused by
independent variables in the regression model that contain common information.
The independent variables are highly intercorrelated; that is, they have too
much linear dependence. Hence, the presence of multicollinear independent
variables prevents us from obtaining insight into the true contribution to the
regression from each independent variable.
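A quick way to see whether independent variables contain common information is to inspect their pairwise correlations. The following is a minimal sketch using NumPy on simulated data; the variable names, sample size, and noise scale are illustrative assumptions and do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is reproducible
n = 500                          # hypothetical sample size

x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # built to share most of its information with x1
x3 = rng.normal(size=n)              # generated independently of x1 and x2

# Rows are observations, columns are the independent variables.
X = np.column_stack([x1, x2, x3])

# rowvar=False tells corrcoef that each column is a variable.
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 2))
```

An off-diagonal entry close to one in absolute value flags a pair of variables that are highly intercorrelated. Pairwise correlations can miss a dependence that involves three or more variables jointly, so this is only a first check.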
Formally, the notion of perfect collinearity, which means that one or
more independent variables are a linear combination of the other indepen-
dent variables, can be expressed by the following relationship:


$$\operatorname{rank}(X^{T}X) < k + 1 \tag{4.1}$$


where the matrix X was defined in equation (3.4) in Chapter 3. Equation
(4.1) can be interpreted as X now consisting of vectors $X_i$, i = 1, ..., k + 1,
that are not linearly independent.
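The rank condition can be checked numerically. The sketch below assumes a design matrix with a leading column of ones and k = 3 simulated regressors, one of which is built as an exact linear combination of the other two; all names and numbers are hypothetical. Because the rank of X^T X equals the rank of X, the check is applied to X directly, which tends to be numerically better behaved.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 3                      # hypothetical: 50 observations, k = 3 regressors

x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 * x1 - x2                # exact linear combination of x1 and x2

# Design matrix with a leading column of ones, so it has k + 1 columns.
X = np.column_stack([np.ones(n), x1, x2, x3])

# rank(X^T X) = rank(X), so it is enough to compute the rank of X.
rank = np.linalg.matrix_rank(X)
perfectly_collinear = rank < k + 1
print(rank, k + 1, perfectly_collinear)
```

In this setup the rank should fall short of k + 1, signalling perfect collinearity.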
In a very extreme case, two or more variables may be perfectly correlated
(i.e., their pairwise correlations are equal to one), which would imply that
some vectors of observations of these variables are merely linear combinations
of others. As a result, some variables are fully explained by others and, thus,
provide no additional information. In most problems in finance, however, the
independent data vectors are not perfectly correlated but may be correlated to
a high degree. In either case, the result is that, roughly speaking, the
regression estimation procedure is confused by the ambiguous information in the
data such that it cannot produce distinct regression coefficients for the
variables involved. When the collinearity is perfect, the $\beta_i$,
i = 1, ..., k, cannot be identified; hence, an infinite number of possible
values for the regression coefficients can serve as a solution. This can be
very frustrating in building a reliable regression model.
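To illustrate how highly (but not perfectly) correlated independent variables blur the individual coefficients, the following sketch repeatedly simulates data from a hypothetical model and records how much the ordinary least squares estimate of one slope varies from sample to sample. The function name `slope_spread`, the true model, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def slope_spread(noise_scale, n=200, n_sims=300, seed=0):
    """Std. dev. across simulated samples of the OLS estimate of the first slope."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_sims):
        x1 = rng.normal(size=n)
        # A small noise_scale makes x2 almost a copy of x1 (high correlation).
        x2 = x1 + noise_scale * rng.normal(size=n)
        y = 1.0 + x1 + x2 + rng.normal(size=n)   # hypothetical true model
        X = np.column_stack([np.ones(n), x1, x2])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        estimates.append(beta[1])
    return float(np.std(estimates))

spread_high_corr = slope_spread(noise_scale=0.01)  # correlation close to one
spread_low_corr = slope_spread(noise_scale=1.0)    # moderate correlation
print(spread_high_corr, spread_low_corr)
```

Under these assumptions, the spread of the estimated slope should be far larger when the two variables are nearly identical, which is one way to see why the individual contributions become hard to pin down.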
We can demonstrate the problem with an example. Consider a regression
model with three independent variables: $X_1$, $X_2$, and $X_3$. Also assume
the following regarding these three independent variables:


$$X_1 = 2X_2 = 4X_3$$

such that there is, effectively, just one independent variable, either $X_1$, $X_2$,
or $X_3$. Now, suppose all three independent variables are erroneously used to
model the following regression:


$$y = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 = 4\beta_1 X_3 + 2\beta_2 X_3 + \beta_3 X_3 \tag{4.2}$$
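With X1 = 2X2 = 4X3, the regression collapses to y = (4β1 + 2β2 + β3)X3, so only the combination 4β1 + 2β2 + β3 matters for the fit. The sketch below checks this numerically with simulated values for X3 and two hypothetical coefficient vectors that share the same combined value; the data and the specific coefficients are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
x3 = rng.normal(size=100)        # hypothetical observations of X3
x2 = 2.0 * x3                    # X1 = 2*X2 = 4*X3 implies X2 = 2*X3 ...
x1 = 4.0 * x3                    # ... and X1 = 4*X3
X = np.column_stack([x1, x2, x3])

# Two different coefficient vectors with the same value of 4*b1 + 2*b2 + b3.
beta_a = np.array([1.0, 1.0, 1.0])    # 4 + 2 + 1 = 7
beta_b = np.array([0.0, 0.0, 7.0])    # 0 + 0 + 7 = 7

y_a = X @ beta_a
y_b = X @ beta_b
same_fit = np.allclose(y_a, y_b)      # do both vectors give the same y?
rank = np.linalg.matrix_rank(X)       # number of linearly independent columns
print(same_fit, rank)
```

Both coefficient vectors should reproduce the same y, and the three columns should have rank one. Any other vector with 4β1 + 2β2 + β3 = 7 would do equally well, which is the sense in which the individual coefficients cannot be identified.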
