PHYLOGENY: THE UNITY AND DIVERSITY OF LIFE
lutionary pattern as the first, and so it also lends statistical
support to tree A. But the third base is different: because
of parallel evolution from A to G in the recent ancestors of
species 2 and species 3, this base suggests that tree B is
the most likely.
After combining the data from all three sites, we find
that the likelihood is greatest for tree A, with t₁ = 3.4 million
years and t₂ = 0.6 million years (FIGURE 16.A3). Thus the
estimate for the age of the MRCA of species 1 and species
2 is t₂ = 0.6 Mya, and the MRCA of all three species (that is,
the root of the tree) is t₁ + t₂ = 4.0 Mya.
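The combination step can be written out explicitly. The following is a sketch that assumes the three sites evolve independently of one another, an assumption not restated in this passage. The symbol L_i, the likelihood contributed by site i, is introduced here only for illustration.

```latex
\begin{align*}
L(\text{tree}, t_1, t_2) &= \prod_{i=1}^{3} L_i(\text{tree}, t_1, t_2), \\
\ln L(\text{tree}, t_1, t_2) &= \sum_{i=1}^{3} \ln L_i(\text{tree}, t_1, t_2), \\
\text{age of the root} &= \hat{t}_1 + \hat{t}_2 = 3.4 + 0.6 = 4.0 \ \text{million years}.
\end{align*}
```

Under that assumption, a site that favors tree B lowers the total for tree A but need not overturn it when the other two sites favor tree A more strongly.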
This example is, of course, tremendously simplified. In
practice, we would typically have data from many DNA
bases. We would also make more realistic (but complicated)
assumptions about evolution (for example, that rates of
substitution differ among the four DNA bases, and among
different sites in the genome), and we would also use the
data to estimate those rates (rather than assuming their
values). Last, rather than assuming what the bases were
in the ancestor, we would account for uncertainty in their
states, for example by averaging over the probabilities that
the ancestor had any one of the four bases at each of the
three sites.
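Averaging over the unknown ancestral bases can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the calculation behind Equation 16.A1: it substitutes the Jukes-Cantor model (all substitutions equally likely), gives each base probability 1/4 at the root, and uses hypothetical function names (`jc_prob`, `site_likelihood_tree_a`).

```python
import math
from itertools import product

BASES = "ACGT"

def jc_prob(x, y, t, rate):
    """Jukes-Cantor probability that base x is base y after time t.

    `rate` is the total substitution rate per site. This model is an
    assumption made for the sketch, not necessarily the box's own model.
    """
    decay = math.exp(-4.0 * rate * t / 3.0)
    return 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay

def site_likelihood_tree_a(tips, t1, t2, rate):
    """Likelihood of one site on tree A, ((species 1, species 2), species 3).

    t2 is the age of the MRCA of species 1 and 2; the root is t1 + t2 old.
    The loop sums over every possible base at the root and at the inner
    node, weighting each root base by 1/4, instead of assuming the bases.
    """
    s1, s2, s3 = tips
    total = 0.0
    for root, inner in product(BASES, repeat=2):
        total += (0.25
                  * jc_prob(root, inner, t1, rate)      # root -> inner node
                  * jc_prob(inner, s1, t2, rate)        # inner node -> species 1
                  * jc_prob(inner, s2, t2, rate)        # inner node -> species 2
                  * jc_prob(root, s3, t1 + t2, rate))   # root -> species 3
    return total
```

For example, `site_likelihood_tree_a(("G", "G", "A"), t1=3.4, t2=0.6, rate=0.3)` gives the probability of a site at which species 1 and 2 share G and species 3 has A. The site pattern here is made up for illustration and is not taken from Figure 16.12.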
But even in those more complex settings, the basic
approach is the same: we make assumptions about how
evolution works, derive a function that gives the probability
of our data given any specific phylogeny, and finally
determine which phylogeny is most likely. Comparing
these results with those we obtained from parsimony in
the main text, we see some of the advantages and disadvantages
of the two approaches. Among them, likelihood
is able to estimate the ages of nodes in the phylogeny,
but it requires us to make explicit assumptions about the
evolutionary process and to carry out some moderately
complicated calculations.
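The three steps (assume a model, write the probability of the data for a given phylogeny, search for the best phylogeny) can be sketched as follows. Everything specific here is an assumption made for illustration: the Jukes-Cantor model stands in for Equation 16.A1, the three site patterns are invented rather than taken from Figure 16.12, the search is a coarse grid rather than a proper optimizer, and each tree is identified by the pair of species it treats as sisters (species 1 and 2 for tree A).

```python
import math
from itertools import product

BASES = "ACGT"
RATE = 0.3  # substitutions per site per million years (value from the Figure 16.A2 caption)

def jc_prob(x, y, t):
    """Assumed Jukes-Cantor probability that base x is base y after time t."""
    decay = math.exp(-4.0 * RATE * t / 3.0)
    return 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay

def site_likelihood(pair, outgroup, t1, t2):
    """Probability of one site when `pair` are sisters with an MRCA t2 old
    and the root is t1 + t2 old, averaged over unknown ancestral bases."""
    total = 0.0
    for root, inner in product(BASES, repeat=2):
        total += (0.25 * jc_prob(root, inner, t1)
                  * jc_prob(inner, pair[0], t2)
                  * jc_prob(inner, pair[1], t2)
                  * jc_prob(root, outgroup, t1 + t2))
    return total

def log_likelihood(sites, sisters, t1, t2):
    """Sum of per-site log-likelihoods; -inf if any site is impossible."""
    i, j = sisters
    k = ({0, 1, 2} - {i, j}).pop()  # index of the remaining species
    total = 0.0
    for site in sites:
        p = site_likelihood((site[i], site[j]), site[k], t1, t2)
        if p <= 0.0:
            return -math.inf
        total += math.log(p)
    return total

def best_tree(sites, grid):
    """Try every topology and every (t1, t2) on the grid.

    Returns (log-likelihood, sisters, t1, t2) for the best combination.
    """
    candidates = [(log_likelihood(sites, sisters, t1, t2), sisters, t1, t2)
                  for sisters in [(0, 1), (1, 2), (0, 2)]
                  for t1 in grid
                  for t2 in grid]
    return max(candidates)

# Hypothetical data: bases in species 1, 2, 3 at three sites.
# Two sites group species 1 with 2; the third groups species 2 with 3.
SITES = [("G", "G", "A"), ("C", "C", "T"), ("A", "G", "G")]
GRID = [0.2 * k for k in range(41)]  # 0 to 8 million years in steps of 0.2
```

Calling `log_l, sisters, t1, t2 = best_tree(SITES, GRID)` returns the preferred sister pair together with the branch lengths, and `t1 + t2` is then the estimated age of the root. Parsimony applied to the same data would return a tree shape but no ages.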
Futuyma Kirkpatrick Evolution, 4e
Sinauer Associates
[FIGURE 16.A2 graphic: plot with axes t₁ and t₂; shading legend labeled "Increasingly likely" and "Maximum".]
FIGURE 16.A2 Contour plot of the likelihood for different
branch lengths (t₁ and t₂) of tree A, based on Equation 16.A1
and using a substitution rate of λ = 0.3/million years. Combinations
of t₁ and t₂ values in the darker regions of the plot
yield the lowest likelihoods; those in the lightest region yield
the highest likelihoods. The trees in the four corners show
the tree shapes for the corresponding values of t₁ and t₂. The
maximum value of the likelihood occurs when t₁ = 2.7 million
years and t₂ = 0 (indicated by the red circle).
FIGURE 16.A3 Using all the data in Figure 16.12, maximum
likelihood estimates that species 1 and species 2 are most
closely related, that their MRCA lived 0.6 Mya, and that the
MRCA of all three species lived 4 Mya.
[FIGURE 16.A3 graphic: plot with axes t₁ and t₂, and the estimated tree for Sp 1, Sp 2, and Sp 3 with nodes labeled 0.6 Mya and 4.0 Mya.]
BOX 16A
Estimating Trees with Likelihood (continued)