Telling the Evolutionary Time: Molecular Clocks and the Fossil Record

Nucleotide models

The Jukes-Cantor (JC69) model was the first nucleotide substitution model proposed and
is perhaps the simplest matrix-based model of distance correction (Jukes and Cantor 1969).
It assumes that the four bases have equal frequencies and that all substitutions are equally
likely. This model was extended in two directions: to allow different substitution rates for
transitions and transversions (K2P; Kimura 1980), and to allow unequal base frequencies
(F81; Felsenstein 1981). These two extensions were combined in the HKY85 model (Hasegawa et
al. 1985). The general time-reversible (GTR) model further extended these models to allow
all six pairs of substitutions to have differing rates (Rodríguez et al. 1990). The LogDet
transformation (Lockhart et al. 1994) is a para-linear process and was designed to circumvent
the problem of variable base composition among sequences. It often provides a better fit
to data when sequences differ in base composition, but has the disadvantage that
LogDet-based models cannot be compared with other models for goodness of fit using
likelihood ratio tests (LRTs). Furthermore, LogDet methods provide branch length
estimates that are not interpretable as distances. ML models such as T92+GC (Galtier
and Gouy 1998) and N2 (Yang and Roberts 1995) do not assume fixed nucleotide
composition values for each taxon but allow them to vary from branch to branch.
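
The two simplest distance corrections described above can be sketched in a few lines of Python. This is a minimal illustration, not code from any published package: the function names are hypothetical, sites containing gaps or ambiguity codes are simply skipped, and a comparison is reported as infinite when the observed differences are too numerous for the correction to be defined (saturation).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def proportions(seq1, seq2):
    """Return (P, Q): proportions of sites showing transitional and
    transversional differences between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes
        n += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            tv += 1  # purine <-> pyrimidine
    if n == 0:
        raise ValueError("no comparable sites")
    return ts / n, tv / n


def jc69_distance(p):
    """JC69 correction: d = -(3/4) ln(1 - 4p/3), p = proportion of differing sites."""
    arg = 1.0 - 4.0 * p / 3.0
    if arg <= 0.0:
        return math.inf  # saturated: correction undefined
    return -0.75 * math.log(arg)


def k2p_distance(P, Q):
    """K2P correction: d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q)."""
    a = 1.0 - 2.0 * P - Q
    b = 1.0 - 2.0 * Q
    if a <= 0.0 or b <= 0.0:
        return math.inf  # saturated: correction undefined
    return -0.5 * math.log(a) - 0.25 * math.log(b)
```

When one third of the observed differences are transitions (P = p/3, Q = 2p/3), the K2P formula reduces algebraically to the JC69 formula, which reflects the nesting of the simpler model within the more general one.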


Amino acid models

In theory, amino acid-based substitution models should give more satisfactory results when
estimating deep divergence times (e.g. Cambrian) than nucleotide substitution models since
amino acids evolve more slowly than nucleotides. Additionally, because there are some
twenty amino acids in comparison with four nucleotides, saturation is less likely to be
encountered. However, amino acid substitution matrices are generally considered to be
much less sophisticated than their nucleotide counterparts, as less is known about the factors
involved in amino acid substitution (i.e. secondary and tertiary structure and physico-chemical
constraints). An amino acid substitution model analogous to the GTR model in its
time reversibility is the Dayhoff model (Dayhoff 1978). Amino acid replacement is modelled
here with an empirical rate matrix estimated from replacements observed among closely
related proteins, so that it implicitly reflects the physico-chemical similarities of different
amino acid groups. The Dayhoff model has been extended many times (Jones et al. 1992; Müller and
Vingron 2000). Recent work by Lió and Goldman (2002) has provided a model which takes
specified secondary structure information into account when creating rate matrices based
on mitochondrial amino acid sequence data.
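
Time reversibility, the property shared by the GTR and Dayhoff models, can be made concrete with a short sketch. A common way to build a reversible rate matrix is to multiply a symmetric table of exchangeabilities by the equilibrium frequency of the target state; the matrix then satisfies detailed balance, pi_i q_ij = pi_j q_ji. The function names and the example values below are hypothetical and chosen only for illustration; they are not taken from any published matrix.

```python
def exchangeability_count(n_states):
    """Number of free symmetric exchangeabilities for n states: n(n-1)/2."""
    return n_states * (n_states - 1) // 2


def reversible_rate_matrix(exch, freqs):
    """Build Q with q_ij = s_ij * pi_j (i != j) from a symmetric
    exchangeability table `exch` and equilibrium frequencies `freqs`.
    Rows sum to zero; Q is scaled to one expected substitution per unit time."""
    n = len(freqs)
    for i in range(n):
        for j in range(n):
            if exch[i][j] != exch[j][i]:
                raise ValueError("exchangeabilities must be symmetric")
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                Q[i][j] = exch[i][j] * freqs[j]
        Q[i][i] = -sum(Q[i])  # diagonal is still 0.0 when the sum is taken
    mu = -sum(freqs[i] * Q[i][i] for i in range(n))  # mean rate at equilibrium
    return [[q / mu for q in row] for row in Q]


def is_time_reversible(Q, freqs, tol=1e-12):
    """Check detailed balance: pi_i * q_ij == pi_j * q_ji for all i, j."""
    n = len(freqs)
    return all(
        abs(freqs[i] * Q[i][j] - freqs[j] * Q[j][i]) <= tol
        for i in range(n)
        for j in range(n)
    )


# Illustrative four-state (nucleotide) example with made-up values.
example_freqs = [0.1, 0.2, 0.3, 0.4]
example_exch = [
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 0.0, 4.0, 5.0],
    [2.0, 4.0, 0.0, 6.0],
    [3.0, 5.0, 6.0, 0.0],
]
example_Q = reversible_rate_matrix(example_exch, example_freqs)
```

The same construction applies to any number of states. With four nucleotides there are six exchangeabilities, matching the six substitution pairs of the GTR model; with twenty amino acids there are 190, which is one reason amino acid matrices are usually estimated once from large databases rather than from each data set.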


Codon models

Although amino acid substitution models have provided an improvement over nucleotide
models when determining deep divergences, they are less realistic in that evolutionary
change occurs at the level of DNA sequences rather than at the level of amino acids. Models of
evolution based on codons (61 states, the sense codons of the standard genetic code) rather
than amino acids (20 states) are more
representative of reality (Goldman and Yang 1994; Muse and Gaut 1994). These models are
computationally expensive but are becoming increasingly sophisticated. For example,
Pedersen et al. (1998) designed a model which takes depressed CpG levels into account.
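
The state space of a codon model can be illustrated by enumerating the standard genetic code and classifying single-nucleotide changes between sense codons. The sketch below is an assumption-laden simplification, not the formulation of any of the papers cited above: it assumes equal codon frequencies, forbids changes at more than one position, and uses a hypothetical `relative_rate` function with a transition/transversion ratio `kappa` and a nonsynonymous/synonymous ratio `omega` whose default values are arbitrary.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks the three stop codons.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE_CODONS = [codon for codon, aa in CODE.items() if aa != "*"]
SENSE_SET = set(SENSE_CODONS)
PURINES = {"A", "G"}


def classify(codon1, codon2):
    """Classify a change between two sense codons that differ at exactly
    one position. Returns (kind, effect), or None if the change is not a
    single-nucleotide change between sense codons."""
    if codon1 not in SENSE_SET or codon2 not in SENSE_SET:
        return None  # stop codons are excluded from the state space
    diffs = [(a, b) for a, b in zip(codon1, codon2) if a != b]
    if len(diffs) != 1:
        return None
    a, b = diffs[0]
    kind = "transition" if (a in PURINES) == (b in PURINES) else "transversion"
    effect = "synonymous" if CODE[codon1] == CODE[codon2] else "nonsynonymous"
    return kind, effect


def relative_rate(codon1, codon2, kappa=2.0, omega=0.5):
    """Hypothetical relative rate under equal codon frequencies:
    multiplied by kappa for transitions and by omega for nonsynonymous changes."""
    change = classify(codon1, codon2)
    if change is None:
        return 0.0
    kind, effect = change
    rate = 1.0
    if kind == "transition":
        rate *= kappa
    if effect == "nonsynonymous":
        rate *= omega
    return rate
```

Here `len(SENSE_CODONS)` gives the 61 states referred to above. A 61 x 61 rate matrix has far more entries than a 4 x 4 or 20 x 20 one, which is the main source of the computational expense of codon models.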


56 RICHARD A.FORTEY ET AL.

