and stochastic errors (Hillis et al. 1996; Sanderson and Doyle 2001) and others with an
inability to infer rate changes correctly over the tree (Sanderson 1997, 1998).
Furthermore, until recently they all relied on the assumption that sequences evolve
at a roughly constant rate, which enforces a strict molecular clock on the analysis.
Sanderson (1997) developed a different approach, non-parametric rate smoothing
(NPRS), which allows rates to change over the tree but assumes that such changes are
autocorrelated, effectively meaning that changes in rates are assumed to be inherited from
an ancestral lineage by its immediate descendants. The method smooths local changes in
rates over the tree using an optimization algorithm, and searches for the solution that
minimizes the inferred rate changes (Sanderson 1997). As a case study, we have used
NPRS to estimate divergence times in angiosperms using a three-gene dataset based on
plastid rbcL and atpB exons and nuclear 18S rDNA covering 560 angiosperm taxa (Soltis et
al. 1999, 2000). Our primary aim was to provide an initial hypothesis of angiosperm
divergence times based on sequence divergence data representing a majority of extant
families and, by using an internal calibration point, to obtain an independent evaluation
of angiosperm and eudicot origins. Results are compared with fossil-based
estimates extracted from Magallón and Sanderson (2001) for magnoliids and from
Magallón et al. (1999) for eudicots; possible directions for future analyses are discussed.
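The smoothing criterion described above can be illustrated with a minimal sketch. It assumes the commonly cited form of the NPRS objective, in which the local rate on a branch is its length divided by its duration and the quantity minimized is the sum of squared differences between the rate on each branch and the rates on its immediate descendant branches. The toy tree, branch lengths, node ages, and function names are hypothetical, the special treatment of the root is omitted, and a coarse grid search stands in for a real optimizer; this is not the implementation used in the analyses.

```python
# Toy rooted tree: R is the root (fixed calibration age), X is an internal
# node whose age is unknown, and A, B, C are extant tips (age 0).
# All names and numbers are hypothetical illustration values.
children = {"R": ["X", "C"], "X": ["A", "B"]}
parent = {"X": "R", "C": "R", "A": "X", "B": "X"}
branch_len = {"X": 0.06, "C": 0.10, "A": 0.04, "B": 0.04}  # substitutions/site


def nprs_objective(age, p=2.0):
    """Sum |rate(branch to node) - rate(branch to child)|**p over all
    non-root internal nodes. The root term is left out in this sketch."""

    def rate(node):
        # Local rate = branch length / time elapsed along that branch.
        return branch_len[node] / (age[parent[node]] - age[node])

    total = 0.0
    for node, kids in children.items():
        if node not in parent:  # the root has no incoming branch
            continue
        for kid in kids:
            total += abs(rate(node) - rate(kid)) ** p
    return total


def score_with_x_age(x_age):
    """Objective value when the unknown node X is assigned age x_age."""
    age = {"R": 10.0, "X": float(x_age), "A": 0.0, "B": 0.0, "C": 0.0}
    return nprs_objective(age)


# Coarse grid search over candidate ages for X, strictly between tips and root.
best_age = min(range(1, 10), key=score_with_x_age)
print(best_age, score_with_x_age(best_age))
```

In this toy case the branch lengths are exactly clocklike when X is placed at age 4, so the smoothing criterion selects that age; with real data the rates differ among branches and the minimum balances them against one another.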
Material and methods
Data
Nucleotide sequence data covering a majority of all flowering plant families have been
assembled over the last decade for three loci: rbcL (Chase et al. 1993, 2000;
Savolainen et al. 2000b) and atpB (Savolainen et al. 2000a) from the plastid genome, and
18S rDNA (Soltis et al. 1997) from the nucleus. These efforts recently culminated in a
comprehensive phylogenetic analysis including 560 angiosperms and seven outgroup
gymnosperm taxa (Soltis et al. 1999, 2000). In total, these analyses included
representatives of about 75 per cent of all families recognized in the most up-to-date
classification (APG 1998). We used their complete dataset to calculate branch lengths on
one of the more than 8000 most parsimonious trees obtained by Soltis et al. (1999, 2000);
the tree used corresponds to that reported in their ‘B series’ of figures. The seven
outgroup taxa used by Soltis et al. (1999, 2000) were initially included to obtain branch
length estimates for the first ingroup branching point but were subsequently removed
from the analyses. Branch lengths were estimated using both parsimony (accelerated and
delayed transformations; ACCTRAN and DELTRAN, respectively), and maximum
likelihood methods. The HKY85 model of sequence evolution (Hasegawa et al. 1985) was
used in the likelihood estimates, and transition/transversion ratios as well as nucleotide
frequencies were estimated from the data. Calculations were done using PAUP 4.0b4a
(Swofford 1998).
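For readers unfamiliar with the HKY85 model, the following minimal sketch builds its instantaneous rate matrix under the standard parameterization: the rate of change from base x to base y is proportional to the equilibrium frequency of y, multiplied by a transition/transversion rate parameter (here called kappa) when the change is a transition. The base frequencies and kappa below are hypothetical illustration values, not the estimates obtained from the dataset, and the code is independent of the PAUP implementation.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def is_transition(x, y):
    """True for purine<->purine or pyrimidine<->pyrimidine changes (x != y)."""
    return (x in PURINES) == (y in PURINES)


def hky85_rate_matrix(pi, kappa):
    """Return the unscaled HKY85 rate matrix as a nested dict q[x][y].

    Off-diagonal entries: kappa * pi[y] for transitions, pi[y] for
    transversions. Diagonal entries make each row sum to zero.
    """
    q = {x: {} for x in BASES}
    for x in BASES:
        for y in BASES:
            if x != y:
                q[x][y] = (kappa if is_transition(x, y) else 1.0) * pi[y]
        q[x][x] = -sum(q[x][y] for y in BASES if y != x)
    return q


# Hypothetical parameter values for illustration only.
pi = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
kappa = 2.0
Q = hky85_rate_matrix(pi, kappa)

for x in BASES:
    print(x, [round(Q[x][y], 3) for y in BASES])
```

The matrix is left unscaled here; likelihood programs typically rescale it so that branch lengths are expressed in expected substitutions per site.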