
Simple Linear Regression 17


over which we do not have any direct control, for example the returns of an
individual stock or of some stock index.
The error terms (or residuals) in equation (2.2) are assumed to be independently
and identically distributed (denoted by i.i.d.). The concept of independent and
identical distribution means the following: First, independence guarantees that
each error assumes a value that is unaffected by any of the other errors. So,
each error is unpredictable from knowledge of the other errors. Second, the
distributions of all errors are the same. Consequently, for each pair (x, y), the
error term assumes its value independently of the other errors and follows the
same distribution as every other error. The i.i.d. assumption is important
if we want to claim that all information is contained in equation (2.1)
and that deviations from equation (2.1) are purely random. In other words, the
residuals are statistical noise that cannot be predicted from other
quantities. If the errors do not seem to comply with the i.i.d. requirement,
then something would appear to be wrong with the model. Moreover, in
that case, many of the estimation results would be unreliable.
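One symptom of errors that violate independence is that each error can be partly predicted from the one before it. The following minimal sketch illustrates this with simulated data. The normal distribution, the carry-over factor of 0.8, the sample size, and the helper name `lag1_autocorrelation` are all assumptions chosen for illustration; none of them come from the text.

```python
import random
import statistics


def lag1_autocorrelation(values):
    """Sample correlation between each value and its predecessor."""
    mean = statistics.fmean(values)
    num = sum((a - mean) * (b - mean) for a, b in zip(values[:-1], values[1:]))
    den = sum((v - mean) ** 2 for v in values)
    return num / den


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical i.i.d. errors: independent draws from one normal distribution.
iid_errors = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Hypothetical dependent errors: each one carries over 0.8 of its predecessor.
dependent_errors = [0.0]
for _ in range(4999):
    dependent_errors.append(0.8 * dependent_errors[-1] + rng.gauss(0.0, 1.0))

iid_rho = lag1_autocorrelation(iid_errors)
dependent_rho = lag1_autocorrelation(dependent_errors)

print(f"lag-1 autocorrelation, i.i.d. errors:    {iid_rho:.3f}")
print(f"lag-1 autocorrelation, dependent errors: {dependent_rho:.3f}")
```

A value near zero is consistent with independence, while a value well away from zero indicates that the errors are predictable from their own past. This check covers only one form of dependence and says nothing about whether the errors share the same distribution.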
The distribution common to all residuals is assumed to have zero mean
and constant variance, such that the mean and variance of y conditional on
x are, respectively,


$$
\begin{aligned}
\mu_{y|x} &= f(x) = \alpha + \beta x \\
\sigma^2_{y|x} &= \sigma^2_e
\end{aligned}
\qquad (2.4)
$$

In words, once a value of x is given, we assume that, on average, y
will be exactly equal to the functional relationship. The only variation
in equation (2.4) stems from the residual term. This is demonstrated in
Figure 2.1. We can see the ideal line given by the linear function. Additionally,
the disturbance terms are shown taking on values along the dash-dotted
lines for each pair x and y. For each value of x, ε has the mean of
its distribution located on the line α + β ∙ x above x. This means that, on
average, the error term will have no influence on the value of y, $\bar{y} = \overline{f(x)}$,
where the bar above a term denotes the average. The x is either exogenous
and, hence, known such that $\overline{f(x)} = f(x)$, or x is some endogenous variable
and thus $\overline{f(x)}$ is the expected value of f(x).^1
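The two lines of equation (2.4) can be traced back to the assumptions on the error term. The short derivation below is a sketch that assumes equation (2.2), which is not reproduced on this page, has the form y = α + β ∙ x + ε, and that x is treated as given, so that the error has zero mean and variance σ²_e conditional on x.

```latex
\begin{align*}
\mu_{y|x}
  &= E[\alpha + \beta x + \varepsilon \mid x]
   = \alpha + \beta x + E[\varepsilon \mid x]
   = \alpha + \beta x = f(x), \\
\sigma^2_{y|x}
  &= \operatorname{Var}(\alpha + \beta x + \varepsilon \mid x)
   = \operatorname{Var}(\varepsilon \mid x)
   = \sigma^2_e .
\end{align*}
```

Once x is fixed, α + β ∙ x is a constant: it shifts the mean of y but contributes nothing to its variance, which is why all variation in y comes from the error term.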
^1 Exogenous and endogenous variables are classified relative to a specific causal
model. In regression analysis, a variable is said to be endogenous when it is
correlated with the error term. An exogenous variable is a variable whose value is
determined outside the model and that is uncorrelated with the error term.
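The conditional mean and variance in equation (2.4) can also be checked by simulation. The sketch below holds x fixed at a few values, draws many observations of y, and compares the sample mean and variance with α + β ∙ x and the error variance. The parameter values, the normally distributed errors, and the helper name `draw_y` are hypothetical choices made for this illustration only.

```python
import random
import statistics

ALPHA, BETA, SIGMA_E = 1.0, 0.5, 2.0  # hypothetical parameter values

rng = random.Random(7)  # fixed seed so the sketch is reproducible


def draw_y(x):
    """One draw of y = alpha + beta * x + e, with e ~ N(0, SIGMA_E^2) i.i.d."""
    return ALPHA + BETA * x + rng.gauss(0.0, SIGMA_E)


results = {}
for x in (0.0, 4.0, 10.0):
    # Many draws of y at the same x approximate the distribution of y given x.
    ys = [draw_y(x) for _ in range(20000)]
    results[x] = (statistics.fmean(ys), statistics.pvariance(ys))
    mean_y, var_y = results[x]
    print(f"x={x:5.1f}  f(x)={ALPHA + BETA * x:5.2f}  "
          f"mean(y|x)={mean_y:5.2f}  var(y|x)={var_y:5.2f}")
```

Under these assumptions, the sample mean at each x should lie close to the line α + β ∙ x, and the sample variance should be roughly the same at every x and close to the squared value of `SIGMA_E`.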
