The correlation is therefore
Conditions under Which the Maximum Likelihood Is Equivalent to Minimizing Sum of Squares
The substitution of the logarithm of the likelihood criterion with the sum of the squared errors hinges on a key assumption: that the errors follow a normal distribution with zero mean. Under this assumption, each error value $e_i$ can be assigned a probability of occurrence:

$$p(e_i) = \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\!\left(-\frac{e_i^2}{2\sigma^2}\right)$$

We now make another assumption, that the errors are independent of each other. Then the probability (likelihood) of obtaining the error sequence $e_1, e_2, \ldots, e_N$ is the product of these probabilities:

$$p(\mathrm{error}) = \prod_{i=1}^{N} p(e_i)$$

Taking the logarithm of both sides of the above equation, we have an expression for the logarithm of the likelihood:

$$\log(\mathrm{likelihood}) = \log\!\left[\prod_{i=1}^{N} p(e_i)\right] = \sum_{i=1}^{N} \log p(e_i) = -\frac{N}{2}\log\!\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{N} e_i^2$$

Let us examine our motivation for taking logarithms. If we arrange a sequence of numbers in ascending or descending order and take their logarithms in sequence, the logarithms are guaranteed to be in ascending or descending order, as the case may be. We might say that transforming a set of numbers into their logarithms preserves their ranks. Therefore, maximizing the log-likelihood is equivalent to maximizing the likelihood itself. The log-likelihood is also simpler to deal with: in the expression above, the first term does not depend on the errors, and the second is the sum of squared errors with a negative sign, so maximizing the log-likelihood amounts to minimizing $\sum_{i=1}^{N} e_i^2$.
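The equivalence named in this section's title can be checked numerically. The sketch below is illustrative only (the data, the grid of candidate slopes, and all variable names are my own choices): it fits the slope of a one-parameter model by grid search and confirms that the slope minimizing the sum of squared errors is exactly the one maximizing the Gaussian log-likelihood, since the two criteria differ only by a constant and a negative scale factor.

```python
import math
import random

random.seed(0)

# Synthetic data: y = 2*x + Gaussian noise with zero mean.
N = 200
xs = [random.uniform(0, 10) for _ in range(N)]
ys = [2.0 * x + random.gauss(0.0, 1.0) for x in xs]

sigma2 = 1.0  # assumed (known) error variance

def sse(theta):
    # Sum of squared errors for a candidate slope theta.
    return sum((y - theta * x) ** 2 for x, y in zip(xs, ys))

def log_likelihood(theta):
    # log L = -(N/2) log(2*pi*sigma^2) - SSE/(2*sigma^2)
    return -0.5 * N * math.log(2 * math.pi * sigma2) - sse(theta) / (2 * sigma2)

# Grid search over candidate slopes from 1.5 to 2.5.
grid = [1.5 + 0.001 * k for k in range(1001)]
best_sse = min(grid, key=sse)
best_ll = max(grid, key=log_likelihood)

# The minimizer of the SSE and the maximizer of the log-likelihood coincide.
print(best_sse, best_ll)
```

Because the log-likelihood is just a decreasing affine function of the SSE, the two grid searches pick the same slope, and it lands close to the true value of 2.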
$$\operatorname{corr}(y_t, y_{t-1}) = \frac{\operatorname{cov}(y_t, y_{t-1})}{\operatorname{var}(y_t)} = \frac{\alpha \operatorname{var}(y_{t-1})}{\operatorname{var}(y_{t-1})} = \alpha$$

(using the fact that for a stationary series $\operatorname{var}(y_t) = \operatorname{var}(y_{t-1})$).
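As a numerical check of this result, the following sketch simulates an AR(1) series $y_t = \alpha y_{t-1} + e_t$ (assuming, as the derivation suggests, an AR(1) model; the value of $\alpha$ and the sample size here are arbitrary choices of mine) and verifies that the sample lag-1 autocorrelation comes out close to $\alpha$.

```python
import random

random.seed(1)

# Simulate a stationary AR(1) process: y_t = alpha * y_{t-1} + e_t.
alpha = 0.6
N = 20000
y = [0.0]
for _ in range(N - 1):
    y.append(alpha * y[-1] + random.gauss(0.0, 1.0))

# Sample lag-1 autocorrelation: cov(y_t, y_{t-1}) / var(y_t).
mean = sum(y) / N
var = sum((v - mean) ** 2 for v in y) / N
cov1 = sum((y[t] - mean) * (y[t - 1] - mean) for t in range(1, N)) / N
corr1 = cov1 / var
print(round(corr1, 2))  # close to alpha
```

With a long enough series, the estimate converges on the autoregressive coefficient, as the derivation predicts.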
α
Time Series 35