A STATISTICS PRIMER A–13
We compare the two likelihoods, and then use a probabilistic rule
(see https://en.wikipedia.org/wiki/Metropolis_Hastings_algorithm) that tells us whether to
keep the first value for the node’s age and discard the second, or to do the reverse.
We record the age that is retained, then repeat the process. After thousands (or
even millions) of repetitions, the distribution of ages that we retained will very
closely resemble the posterior probability distribution for the node’s age. We can
use the distribution of retained values to estimate the node’s age (the age that has
the greatest probability) and the confidence interval for that estimate.
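The sampling loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure used by any particular phylogenetics program. The function log_posterior is a hypothetical stand-in (a bell-shaped curve centered on 10 million years) for the real calculation, which would combine the prior with the likelihood of the sequence data. The names metropolis, summarize, step_size, and burn_in are likewise assumptions made for the example. Because new ages are proposed symmetrically around the current age, the acceptance rule reduces to the simpler Metropolis special case of the Metropolis-Hastings rule.

```python
import math
import random


def log_posterior(age):
    """Hypothetical log(prior x likelihood) for a node age in millions of years.

    A real analysis would compute this from the sequence data; here it is an
    unnormalized bell curve (mean 10, standard deviation 2) used for illustration.
    """
    if age <= 0:
        return -math.inf  # negative ages are impossible
    return -0.5 * ((age - 10.0) / 2.0) ** 2


def metropolis(log_target, start, n_steps, step_size, seed=1):
    """Return the list of retained ages, one per repetition."""
    rng = random.Random(seed)
    current = start
    current_lp = log_target(current)
    retained = []
    for _ in range(n_steps):
        # Propose a second value near the current one (symmetric proposal).
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_target(proposal)
        # Keep the proposal with probability min(1, ratio of the two values).
        log_ratio = proposal_lp - current_lp
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current, current_lp = proposal, proposal_lp
        # Record whichever age was retained, then repeat.
        retained.append(current)
    return retained


def summarize(samples, burn_in):
    """Median and central 95% interval of the retained ages after burn-in."""
    kept = sorted(samples[burn_in:])
    n = len(kept)
    return kept[n // 2], kept[int(0.025 * n)], kept[int(0.975 * n)]


if __name__ == "__main__":
    ages = metropolis(log_posterior, start=5.0, n_steps=20000, step_size=2.0)
    median, lower, upper = summarize(ages, burn_in=2000)
    print(f"estimated age: {median:.2f} (95% interval {lower:.2f} to {upper:.2f})")
```

The sketch reports the median of the retained ages as a simple point summary. Estimating the most probable age, as in the text, would require an extra step such as building a histogram of the retained values and locating its peak.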
This method is one of several that collectively are called Markov Chain Monte
Carlo, often abbreviated as MCMC. In this example, the aim is to estimate a single
quantity (the age of a node). In practice, the method is typically used to do much
more ambitious jobs, such as simultaneously estimating the branching pattern of
the phylogeny, the ages of all its nodes, and the rates of sequence evolution.
Futuyma Kirkpatrick Evolution, 4e
Sinauer Associates
[Figure A.13: three panels, (A), (B), and (C), each showing the prior, likelihood, and posterior curves. Horizontal axis: value of p1 (ticks at 0.2, 0.4, 0.6, 0.8, 1). Vertical axis: probability density of p1.]
FIGURE A.13 Bayesian estimates for the frequency of allele A1 in a second
population of platypuses. The likelihood function from the first population
(Figure A.12) is used for the prior distribution. The actual frequency in the
second population is p1 = 0.4 (the red circle). (A) A sample of just four alleles
from the second population has one copy of A1 and three copies of A2. The
resulting likelihood function is quite flat. The posterior distribution
(proportional to the product of the prior distribution and the likelihood
function) is very similar to the prior distribution. The peak in the posterior
distribution is p1 = 0.21 (the black diamond). (B) With a sample of 20 alleles,
we have 8 copies of A1 and 12 copies of A2. The likelihood function is more
strongly peaked because of the larger sample size. The posterior distribution
now estimates that the frequency of A1 is p1 = 0.3. (C) With a sample of 100
alleles, we have 37 copies of A1 and 63 copies of A2. The posterior distribution
is now even more strongly peaked, and nearly centered on the true allele
frequency of p1 = 0.4. Our estimate for the frequency of A1 is now p1 = 0.34.
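The arithmetic behind the three panels can be sketched as follows. The sketch multiplies a prior by a binomial likelihood on a grid of p1 values and reports where the product peaks. One input is an assumption and not a value given in this caption: the first population's sample is taken, hypothetically, to be 4 copies of A1 among 20 alleles, chosen because it yields posterior peaks matching those quoted above. The function names and the grid resolution are also illustrative choices.

```python
def likelihood(p, k, n):
    """Binomial likelihood (up to a constant) of k copies of A1 among n alleles."""
    return p ** k * (1.0 - p) ** (n - k)


def posterior_mode(k, n, prior_k=4, prior_n=20, steps=10000):
    """Value of p1 at which prior x likelihood peaks, found on an even grid.

    prior_k and prior_n describe the first population's sample; the defaults
    (4 of 20) are a hypothetical choice, not taken from the caption.
    """
    best_p, best_value = 0.0, -1.0
    for i in range(steps + 1):
        p = i / steps
        # Prior = likelihood function from the first population's sample.
        prior = likelihood(p, prior_k, prior_n)
        # Unnormalized posterior = prior x likelihood of the new sample.
        value = prior * likelihood(p, k, n)
        if value > best_value:
            best_p, best_value = p, value
    return best_p


if __name__ == "__main__":
    for label, k, n in [("A", 1, 4), ("B", 8, 20), ("C", 37, 100)]:
        print(f"panel ({label}): posterior peak at p1 = {posterior_mode(k, n):.2f}")
```

Under the assumed prior, the peaks come out at about 0.21, 0.30, and 0.34 for panels (A), (B), and (C). Dividing by a normalizing constant would rescale the curve but would not move its peak, so the sketch skips that step.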