William Greene 491
To obtain the posterior mean (Bayesian estimator), we assume a non-informative,
flat (improper) prior for $\theta$:
$$p(\theta) \propto 1.$$
By Bayes’ theorem, the posterior density would be:
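$$
p(\theta \mid d, W)
= \frac{p(d \mid W, \theta)\, p(\theta)}{\int_{\theta} p(d \mid W, \theta)\, p(\theta)\, d\theta}
= \frac{\prod_{i=1}^{n} [\Phi(w_i'\theta)]^{d_i} [1 - \Phi(w_i'\theta)]^{1 - d_i}\,(1)}
       {\int_{\theta} \prod_{i=1}^{n} [\Phi(w_i'\theta)]^{d_i} [1 - \Phi(w_i'\theta)]^{1 - d_i}\,(1)\, d\theta},
$$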
and the estimator would be the posterior mean:
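$$
\hat{\theta}_{BAYESIAN} = E[\theta \mid d, W]
= \frac{\int_{\theta} \theta \prod_{i=1}^{n} [\Phi(w_i'\theta)]^{d_i} [1 - \Phi(w_i'\theta)]^{1 - d_i}\, d\theta}
       {\int_{\theta} \prod_{i=1}^{n} [\Phi(w_i'\theta)]^{d_i} [1 - \Phi(w_i'\theta)]^{1 - d_i}\, d\theta}.
$$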
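For a scalar $\theta$ the ratio of integrals can be approximated on a grid, which makes concrete what the estimator computes. The sketch below is only an illustration under stated assumptions: $K = 1$ with no constant term, a hypothetical data set `w`, `d`, and hypothetical helper names `likelihood` and `posterior_mean_grid`. With $K$ parameters the same approach would need a $K$-dimensional grid.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF, the Phi in the likelihood


def likelihood(theta, w, d):
    """Probit likelihood for a scalar theta (K = 1 is an assumption)."""
    value = 1.0
    for wi, di in zip(w, d):
        p = PHI(wi * theta)
        value *= p if di == 1 else 1.0 - p
    return value


def posterior_mean_grid(w, d, lo=-10.0, hi=10.0, steps=4001):
    """Approximate the ratio of integrals by sums over an even grid."""
    h = (hi - lo) / (steps - 1)
    num = den = 0.0
    for j in range(steps):
        theta = lo + j * h
        like = likelihood(theta, w, d)  # flat prior contributes the factor 1
        num += theta * like
        den += like
    return num / den  # the grid spacing h cancels in the ratio


# Hypothetical data, chosen so that no value of theta separates the outcomes
# (otherwise the flat-prior posterior would not be proper).
w = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
d = [0, 0, 1, 0, 1, 0, 1, 1]
theta_hat = posterior_mean_grid(w, d)
print(theta_hat)
```

The grid limits and the number of steps are arbitrary choices here; the likelihood is negligible near the limits for this data set.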
Evaluation of the integrals in $\hat{\theta}_{BAYESIAN}$ is hopelessly complicated, but a solution
using the Gibbs sampler and the technique of data augmentation, pioneered by
Albert and Chib (1993), is surprisingly simple. We begin by treating the unobserved
$d_i^*$s as unknowns to be estimated, along with $\theta$. Thus, the $(K+n) \times 1$ parameter vector
is $\alpha = (\theta, d^*)$. We now construct a Gibbs sampler. Consider, first, $p(\theta \mid d^*, d, W)$. If
$d_i^*$ is known, then $d_i$ is known. It follows that:
$$p(\theta \mid d^*, d, W) = p(\theta \mid d^*, W).$$
This posterior comes from a linear regression model with normally distributed
disturbances and known $\sigma^2 = 1$ (see equation (11.4) above). This is the standard
case for Bayesian analysis of the normal linear model with an uninformative prior
for the slopes and known $\sigma^2$ (see, e.g., Koop, 2003; Greene, 2008a, sec. 18.3.1),
with the additional simplification that $\sigma^2 = 1$. It follows that:
$$p(\theta \mid d^*, d, W) = N[q^*, (W'W)^{-1}],$$
where:
$$q^* = (W'W)^{-1} W'd^*.$$
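As an illustration of this conditional draw, the sketch below computes $q^*$ and samples $\theta$ from $N[q^*, (W'W)^{-1}]$. It assumes $K = 2$ (a constant and one regressor) so that the inverse and the Cholesky factor can be written out by hand; the function name `draw_theta` and the numbers in `W` and `d_star` are hypothetical.

```python
import math
import random


def draw_theta(W, d_star, rng):
    """Return (draw, q_star) with draw ~ N[q*, (W'W)^{-1}], assuming K = 2."""
    # Elements of the symmetric 2x2 matrix W'W and of the vector W'd*.
    a = sum(w[0] * w[0] for w in W)
    b = sum(w[0] * w[1] for w in W)
    c = sum(w[1] * w[1] for w in W)
    g0 = sum(w[0] * ds for w, ds in zip(W, d_star))
    g1 = sum(w[1] * ds for w, ds in zip(W, d_star))
    det = a * c - b * b
    # Covariance (W'W)^{-1} and mean q* = (W'W)^{-1} W'd*.
    s00, s01, s11 = c / det, -b / det, a / det
    q_star = (s00 * g0 + s01 * g1, s01 * g0 + s11 * g1)
    # Cholesky factor L of the covariance, so that q* + L z has covariance L L'.
    l00 = math.sqrt(s00)
    l10 = s01 / l00
    l11 = math.sqrt(s11 - l10 * l10)
    z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    draw = (q_star[0] + l00 * z0, q_star[1] + l10 * z0 + l11 * z1)
    return draw, q_star


# Hypothetical inputs: five observations, a constant and one regressor.
W = [(1.0, -2.0), (1.0, -1.0), (1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
d_star = [-1.5, -0.2, 0.3, 0.9, 2.1]
theta_draw, q_star = draw_theta(W, d_star, random.Random(0))
print(q_star, theta_draw)
```

For general $K$ the same two steps apply (solve for $q^*$, then add a Cholesky factor of $(W'W)^{-1}$ times a standard normal vector), usually with a linear algebra library rather than hand-written formulas.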
For $d_i^*$, ignoring $d_i$ for the moment, it would follow immediately from
equation (11.4) that:
$$p(d_i^* \mid \theta, W) = N[w_i'\theta, 1].$$
However, $d_i$ is informative about $d_i^*$. If $d_i$ equals one, we know that $d_i^* > 0$ and, if
$d_i$ equals zero, then $d_i^* < 0$. The implication is that, conditioned on $\theta$, $W$, and $d$, $d_i^*$
has a truncated (above or below zero) normal distribution. The standard notation
for this is:
$$p(d_i^* \mid \theta, d_i = 1, w_i) = N^{+}[w_i'\theta, 1]$$
$$p(d_i^* \mid \theta, d_i = 0, w_i) = N^{-}[w_i'\theta, 1].$$
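One common way to draw from these truncated normals is to invert the CDF: map a uniform draw into the admissible part of the probability scale and apply the normal quantile function. The sketch below is a minimal version of that idea, not necessarily the method used in the sources cited here; the names `lower_tail_draw` and `draw_d_star` are hypothetical, and `mu` stands for $w_i'\theta$.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def lower_tail_draw(m, rng):
    """Draw x ~ N[m, 1] conditional on x < 0 by inverting the CDF."""
    p0 = 0.5 * math.erfc(m / math.sqrt(2.0))  # Prob(x < 0) = Phi(-m)
    u = 1.0 - rng.random()                    # uniform on (0, 1]
    p = min(u * p0, 1.0 - 1e-12)              # inv_cdf requires p < 1
    return m + STD_NORMAL.inv_cdf(p)


def draw_d_star(mu, d, rng):
    """Draw d* from N-[mu, 1] if d == 0, or from N+[mu, 1] if d == 1."""
    if d == 0:
        return lower_tail_draw(mu, rng)
    # If x ~ N[mu, 1] given x > 0, then -x ~ N[-mu, 1] given -x < 0.
    return -lower_tail_draw(-mu, rng)


rng = random.Random(0)
positive = [draw_d_star(0.5, 1, rng) for _ in range(5)]
negative = [draw_d_star(0.5, 0, rng) for _ in range(5)]
print(positive, negative)
```

Reflecting the $d_i = 1$ case onto the lower tail keeps the uniform draw away from probabilities close to one, where the quantile function loses precision.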
These results set up the components for a Gibbs sampler that we can use to estimate
the posterior means $E[\theta \mid d, W]$ and $E[d^* \mid d, W]$.
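Putting the two conditional distributions together, the sampler alternates between drawing $d^*$ given $\theta$ and drawing $\theta$ given $d^*$, then averages the retained draws of $\theta$. The following is a minimal sketch under simplifying assumptions: a scalar $\theta$ ($K = 1$, no constant term), simulated data with a hypothetical true value of 1.0, and arbitrary choices for the number of draws and the burn-in. The names `truncated_draw` and `gibbs_probit` are illustrative.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def truncated_draw(mu, d, rng):
    """d* ~ N[mu, 1] restricted to d* > 0 if d == 1, or d* < 0 if d == 0."""
    sign = 1.0 if d == 1 else -1.0
    m = -sign * mu  # reflect so that a lower tail (x < 0) is always sampled
    p0 = 0.5 * math.erfc(m / math.sqrt(2.0))  # Prob(x < 0) for x ~ N[m, 1]
    p = min((1.0 - rng.random()) * p0, 1.0 - 1e-12)
    return -sign * (m + STD_NORMAL.inv_cdf(p))


def gibbs_probit(w, d, draws=2000, burn_in=500, seed=0):
    """Estimate E[theta | d, W] for scalar theta by data augmentation."""
    rng = random.Random(seed)
    sww = sum(wi * wi for wi in w)  # W'W, a scalar when K = 1
    sd = 1.0 / math.sqrt(sww)       # square root of (W'W)^{-1}
    theta, kept = 0.0, []
    for r in range(burn_in + draws):
        # Step 1: d*_i given theta and d_i is truncated N[w_i * theta, 1].
        d_star = [truncated_draw(wi * theta, di, rng) for wi, di in zip(w, d)]
        # Step 2: theta given d* is N[q*, (W'W)^{-1}], q* = (W'W)^{-1} W'd*.
        q_star = sum(wi * ds for wi, ds in zip(w, d_star)) / sww
        theta = rng.gauss(q_star, sd)
        if r >= burn_in:
            kept.append(theta)
    return sum(kept) / len(kept)


# Simulated data; true_theta is a hypothetical value used only to generate d.
data_rng = random.Random(42)
true_theta = 1.0
w = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
d = [1 if wi * true_theta + data_rng.gauss(0.0, 1.0) > 0 else 0 for wi in w]
theta_hat = gibbs_probit(w, d)
print(theta_hat)
```

Averaging the retained `d_star` vectors in the same loop would give the corresponding estimate of $E[d^* \mid d, W]$.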