Robert V. Hogg, Joseph W. McKean, Allen T. Craig

356 Maximum Likelihood Methods

As in Chapter 4, our point estimator of $\theta$ is $\hat\theta = \hat\theta(X_1,\ldots,X_n)$, where $\hat\theta$ maximizes the function $L(\theta)$. We call $\hat\theta$ the maximum likelihood estimator (mle) of $\theta$. In Section 4.1, several motivating examples were given, including the binomial and normal probability models. Later we give several more examples, but first we offer a theoretical justification for considering the mle. Let $\theta_0$ denote the true value of $\theta$. Theorem 6.1.1 shows that the maximum of $L(\theta)$ asymptotically separates the true model at $\theta_0$ from models at $\theta \neq \theta_0$. To prove this theorem, certain assumptions, called regularity conditions, are required.
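As a minimal numerical sketch (not from the text), consider a Bernoulli model, the simplest case of the binomial examples mentioned above. Maximizing the log-likelihood $\log L(\theta) = \sum_i [x_i \log\theta + (1-x_i)\log(1-\theta)]$ over a grid recovers the familiar closed-form mle, the sample proportion; the parameter value 0.3 and the grid resolution are assumptions of this illustration.

```python
import random
import math

# Hypothetical illustration: for Bernoulli(theta) data, the theta that
# maximizes the log-likelihood numerically should match the closed-form
# mle, the sample proportion.  theta_true = 0.3 is an assumed value.
random.seed(0)
theta_true = 0.3
x = [1 if random.random() < theta_true else 0 for _ in range(1000)]

def log_likelihood(theta, data):
    # log L(theta) = sum_i log f(x_i; theta) for Bernoulli data
    return sum(xi * math.log(theta) + (1 - xi) * math.log(1 - theta)
               for xi in data)

# Grid search over the interior of the parameter space (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
theta_hat = max(grid, key=lambda t: log_likelihood(t, x))

print(theta_hat)            # numerically maximizing value
print(sum(x) / len(x))      # closed-form mle: the sample proportion
```

Since the sample proportion for 1000 observations falls exactly on the grid, the two printed values coincide; in practice one would use an optimizer rather than a grid, but the grid makes the "maximize $L(\theta)$" idea concrete.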


Assumptions 6.1.1 (Regularity Conditions). Regularity conditions (R0)–(R2) are as follows:


(R0) The cdfs are distinct; i.e., $\theta \neq \theta' \Rightarrow F(x_i;\theta) \neq F(x_i;\theta')$.

(R1) The pdfs have common support for all $\theta$.


(R2) The point $\theta_0$ is an interior point in $\Omega$.


The first assumption states that the parameter identifies the pdf. The second assumption implies that the support of $X_i$ does not depend on $\theta$. This is restrictive, and some examples and exercises cover models in which (R1) is not true.
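A standard example in which (R1) fails (an assumption of this sketch, not a model treated on this page) is the Uniform$(0,\theta)$ family: the support $[0,\theta]$ depends on $\theta$. The likelihood is $\theta^{-n}$ for $\theta \geq \max_i x_i$ and 0 otherwise, so it is maximized at the sample maximum, a boundary of the data rather than a smooth interior optimum:

```python
import random

# Hypothetical non-regular example: Uniform(0, theta) has support that
# depends on theta, so (R1) fails.  L(theta) = theta^(-n) for
# theta >= max(x_i) and 0 otherwise, so the mle is the sample maximum.
random.seed(1)
theta_true = 2.0  # assumed true value for this illustration
x = [random.uniform(0, theta_true) for _ in range(500)]

theta_hat = max(x)  # mle for Uniform(0, theta)

print(theta_hat)    # slightly below theta_true, never above it
```

Note that $\hat\theta \leq \theta_0$ always holds here, so this mle is biased downward; none of the smooth-likelihood arguments that follow apply to this model.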


Theorem 6.1.1. Assume that $\theta_0$ is the true parameter and that $E_{\theta_0}[f(X_i;\theta)/f(X_i;\theta_0)]$ exists. Under assumptions (R0) and (R1),

\[
\lim_{n\to\infty} P_{\theta_0}[L(\theta_0,\mathbf{X}) > L(\theta,\mathbf{X})] = 1, \quad \text{for all } \theta \neq \theta_0. \tag{6.1.3}
\]

Proof: By taking logs, the inequality $L(\theta_0,\mathbf{X}) > L(\theta,\mathbf{X})$ is equivalent to

\[
\frac{1}{n}\sum_{i=1}^{n} \log\left[\frac{f(X_i;\theta)}{f(X_i;\theta_0)}\right] < 0.
\]

Since the summands are iid with finite expectation and the function $\phi(x) = -\log(x)$ is strictly convex, it follows from the Law of Large Numbers (Theorem 5.1.1) and Jensen's inequality (Theorem 1.10.5) that, when $\theta_0$ is the true parameter,

\[
\frac{1}{n}\sum_{i=1}^{n} \log\left[\frac{f(X_i;\theta)}{f(X_i;\theta_0)}\right] \stackrel{P}{\rightarrow} E_{\theta_0}\left[\log\frac{f(X_1;\theta)}{f(X_1;\theta_0)}\right] < \log E_{\theta_0}\left[\frac{f(X_1;\theta)}{f(X_1;\theta_0)}\right].
\]

But

\[
E_{\theta_0}\left[\frac{f(X_1;\theta)}{f(X_1;\theta_0)}\right] = \int \frac{f(x;\theta)}{f(x;\theta_0)}\, f(x;\theta_0)\,dx = 1.
\]

Because log 1 = 0, the theorem follows. Note that common support is needed to
obtain the last equalities.

Theorem 6.1.1 says that asymptotically the likelihood function is maximized at the true value $\theta_0$. So in considering estimates of $\theta_0$, it seems natural to consider the value of $\theta$ that maximizes the likelihood.
