364 Maximum Likelihood Methods

Remark 6.2.1. Note that the information is the weighted mean of either
\[
\left[\frac{\partial \log f(x;\theta)}{\partial \theta}\right]^2
\quad \text{or} \quad
-\frac{\partial^2 \log f(x;\theta)}{\partial \theta^2},
\]
where the weights are given by the pdf $f(x;\theta)$. That is, the greater these derivatives are on the average, the more information we get about $\theta$. Clearly, if they were equal to zero [so that $\theta$ would not be in $\log f(x;\theta)$], there would be zero information about $\theta$. The important function
\[
\frac{\partial \log f(x;\theta)}{\partial \theta}
\]
is called the \textit{score function}. Recall that it determines the estimating equations for the mle; that is, the mle $\hat{\theta}$ solves
\[
\sum_{i=1}^{n} \frac{\partial \log f(x_i;\theta)}{\partial \theta} = 0
\]
for $\theta$.
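As an illustrative sketch (not from the text), the estimating equation can be solved numerically when no closed form is convenient. The example below assumes an exponential model $f(x;\theta)=\theta e^{-\theta x}$, whose score is $1/\theta - x$, and finds the root of the summed score by bisection; the model choice and function names are ours, chosen because the mle has the known closed form $1/\bar{x}$ for checking.

```python
import numpy as np

def score_sum(theta, x):
    # Score of Exponential(theta), f(x;theta) = theta*exp(-theta*x):
    #   d/dtheta [log theta - theta*x] = 1/theta - x,
    # summed over the sample.
    return np.sum(1.0 / theta - x)

def mle_by_bisection(x, lo=1e-8, hi=1e8, tol=1e-12):
    # The summed score is strictly decreasing in theta, so the
    # estimating equation sum_i score(theta; x_i) = 0 has a unique
    # root, bracketed by [lo, hi] and found by bisection.
    while hi - lo > tol * max(1.0, lo):
        mid = 0.5 * (lo + hi)
        if score_sum(mid, x) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = np.array([0.5, 1.2, 0.3, 2.0, 0.8])
theta_hat = mle_by_bisection(x)
# For the exponential model the mle also has the closed form 1/xbar,
# which the numerical root should match.
print(theta_hat, 1.0 / x.mean())
```

The same bisection scheme applies to any one-parameter model with a monotone summed score; only `score_sum` changes.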
Example 6.2.1 (Information for a Bernoulli Random Variable). Let $X$ be Bernoulli $b(1,\theta)$. Thus
\begin{align*}
\log f(x;\theta) &= x\log\theta + (1-x)\log(1-\theta) \\
\frac{\partial \log f(x;\theta)}{\partial \theta} &= \frac{x}{\theta} - \frac{1-x}{1-\theta} \\
\frac{\partial^2 \log f(x;\theta)}{\partial \theta^2} &= -\frac{x}{\theta^2} - \frac{1-x}{(1-\theta)^2}.
\end{align*}
Clearly,
\begin{align*}
I(\theta) &= -E\left[-\frac{X}{\theta^2} - \frac{1-X}{(1-\theta)^2}\right]
= \frac{\theta}{\theta^2} + \frac{1-\theta}{(1-\theta)^2}
= \frac{1}{\theta} + \frac{1}{1-\theta}
= \frac{1}{\theta(1-\theta)},
\end{align*}
which is larger for $\theta$ values close to zero or one.
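As a quick sanity check (a sketch of ours, not part of the text), the identity $I(\theta) = E[(\partial \log f(X;\theta)/\partial\theta)^2] = 1/(\theta(1-\theta))$ can be verified exactly in code: since $X$ takes only the values 0 and 1, the expectation is a two-term sum.

```python
def bernoulli_score(x, theta):
    # Score of b(1, theta): d/dtheta log f(x;theta)
    #   = x/theta - (1-x)/(1-theta)
    return x / theta - (1 - x) / (1 - theta)

def information_from_score(theta):
    # I(theta) = E[(score)^2]; X = 0 with probability 1-theta
    # and X = 1 with probability theta, so the expectation is
    # a sum over those two points.
    return sum(
        p * bernoulli_score(x, theta) ** 2
        for x, p in [(0, 1 - theta), (1, theta)]
    )

for theta in [0.1, 0.3, 0.5, 0.9]:
    exact = 1.0 / (theta * (1 - theta))
    print(theta, information_from_score(theta), exact)
```

Running this shows the two columns agree for every $\theta$, and that the information grows as $\theta$ approaches 0 or 1.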
Example 6.2.2 (Information for a Location Family). Consider a random sample $X_1,\ldots,X_n$ such that
\[
X_i = \theta + e_i, \quad i = 1,\ldots,n, \tag{6.2.7}
\]
where $e_1, e_2, \ldots, e_n$ are iid with common pdf $f(x)$ and with support $(-\infty,\infty)$. Then the common pdf of $X_i$ is $f_X(x;\theta) = f(x-\theta)$. We call model (6.2.7) a \textit{location model}. Assume that $f(x)$ satisfies the regularity conditions. Then the information is
\[
I(\theta) = \int_{-\infty}^{\infty} \left(\frac{f'(x-\theta)}{f(x-\theta)}\right)^2 f(x-\theta)\,dx
= \int_{-\infty}^{\infty} \left(\frac{f'(z)}{f(z)}\right)^2 f(z)\,dz, \tag{6.2.8}
\]