John DiNardo

[Figure 3.1 Different priors using the Beta distribution. Horizontal axis: θ, from 0 to 1. Curves shown: α = 1, δ = 1; α = 0.5, δ = 0.5; α = 2, δ = 10; α = 10, δ = 2; α = 15, δ = 15; α = 95, δ = 95.]

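
The shapes of the priors in Figure 3.1 can be checked numerically. The following is a minimal sketch, not the author's code: it evaluates the Beta(α, δ) density at a few values of θ for the six parameter pairs in the figure legend. The function names `beta_pdf` and `density_table` and the choice of grid points are illustrative assumptions.

```python
import math


def beta_pdf(theta, alpha, delta):
    """Density of Beta(alpha, delta) at theta, for 0 < theta < 1."""
    # log B(alpha, delta) = lgamma(alpha) + lgamma(delta) - lgamma(alpha + delta)
    log_b = math.lgamma(alpha) + math.lgamma(delta) - math.lgamma(alpha + delta)
    log_density = (
        (alpha - 1) * math.log(theta)
        + (delta - 1) * math.log(1 - theta)
        - log_b
    )
    return math.exp(log_density)


# The six (alpha, delta) pairs listed in the legend of Figure 3.1.
FIGURE_3_1_PRIORS = [(1, 1), (0.5, 0.5), (2, 10), (10, 2), (15, 15), (95, 95)]


def density_table(grid=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Map each (alpha, delta) pair to its density at every grid point."""
    return {p: [beta_pdf(t, *p) for t in grid] for p in FIGURE_3_1_PRIORS}


if __name__ == "__main__":
    for (a, d), values in density_table().items():
        print(f"alpha={a}, delta={d}: " + ", ".join(f"{v:.3f}" for v in values))
```

Working with log-gamma rather than the gamma function directly keeps the α = 95, δ = 95 case from overflowing.
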
It is sometimes helpful to think of α − 1 as the number of heads “previously”
observed, δ − 1 as the number of tails, and α + δ − 2 as the total number of coin
flips previously observed from the experiment. On the other hand, it is not clear
how someone could verify that a particular choice of prior was a good or bad
description of one’s beliefs.

- In the third step, we merely plug our prior and our likelihood into Bayes’ rule
 and what we come up with is^33

\[
\frac{1}{B\bigl(\alpha + h,\; \delta + (N - h)\bigr)}\,
\theta^{\alpha + h - 1}\,(1 - \theta)^{\delta + (N - h) - 1}. \tag{3.8}
\]

Given the usual caveats, equation (3.8) is a statement of your personal beliefs
about the value of θ, modified in light of the observed coin tosses. The beta distribution
is a nice example because it is easier than usual to characterize the resulting
“beliefs.”
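
The update can be sketched in a few lines. With a Beta(α, δ) prior and h heads in N tosses, the posterior is again a Beta distribution, with parameters α + h and δ + (N − h). The code below is an illustrative sketch under that standard conjugacy result. The function names are assumptions, and the data (7 heads in 10 tosses) are hypothetical, chosen only to show the mechanics.

```python
import math


def posterior_params(alpha, delta, heads, n_tosses):
    """Beta(alpha, delta) prior plus h heads in N tosses gives Beta(alpha + h, delta + N - h)."""
    if not 0 <= heads <= n_tosses:
        raise ValueError("heads must lie between 0 and n_tosses")
    return alpha + heads, delta + (n_tosses - heads)


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def posterior_pdf(theta, alpha, delta, heads, n_tosses):
    """Posterior density at theta (0 < theta < 1), as in equation (3.8)."""
    a, d = posterior_params(alpha, delta, heads, n_tosses)
    log_density = (a - 1) * math.log(theta) + (d - 1) * math.log(1 - theta) - log_beta(a, d)
    return math.exp(log_density)


def posterior_mean(alpha, delta, heads, n_tosses):
    """Mean of the posterior Beta distribution."""
    a, d = posterior_params(alpha, delta, heads, n_tosses)
    return a / (a + d)


if __name__ == "__main__":
    # Hypothetical example: Beta(2, 2) prior, 7 heads observed in 10 tosses.
    print(posterior_params(2, 2, 7, 10))
    print(posterior_mean(2, 2, 7, 10))
```

Because the posterior stays in the Beta family, "characterizing the resulting beliefs" reduces to reading off two updated parameters, which is what makes this example easier than usual.
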
The left panel of Figure 3.2 shows two different prior distributions: one labeled
“less informative” and the other “very informative.” The first prior distribution
considered here, the very informative one, corresponds to Beta(199,1), and the
second, the less informative one, to Beta(2,2). A convenient fiction to
appreciate these prior beliefs is to imagine that, in the first case, you have previously
observed 200 observations, 199 of which were heads. In the second case, you have
previously observed four observations, two of which were heads. The first case
corresponds to having “more prior information” than the second.
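
One way to make "more prior information" concrete is to compare the mean and spread of the two priors. The sketch below uses the standard formulas for the mean and variance of a Beta(α, δ) distribution. The function names are illustrative assumptions, and treating α + δ as the number of "previous" observations follows the counts of 200 and four given in the text.

```python
import math


def beta_mean(alpha, delta):
    """Mean of Beta(alpha, delta): alpha / (alpha + delta)."""
    return alpha / (alpha + delta)


def beta_variance(alpha, delta):
    """Variance of Beta(alpha, delta)."""
    total = alpha + delta
    return alpha * delta / (total ** 2 * (total + 1))


def summarize(alpha, delta):
    """Return (pseudo-observations, mean, standard deviation) for a Beta prior."""
    return alpha + delta, beta_mean(alpha, delta), math.sqrt(beta_variance(alpha, delta))


if __name__ == "__main__":
    for label, (a, d) in {"very informative": (199, 1), "less informative": (2, 2)}.items():
        n_obs, mean, sd = summarize(a, d)
        print(f"{label}: Beta({a},{d}), pseudo-observations={n_obs}, mean={mean:.3f}, sd={sd:.4f}")
```

The Beta(199,1) prior is concentrated tightly near θ = 1, while Beta(2,2) is centered at 0.5 with a much larger standard deviation, so the same data would move the second prior far more than the first.
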
