estimated to have emerged only 1.6 ± 0.5 mil-
lion years ago, and its short branch lengths
and low genetic diversity are consistent with
the possibility that BV2 is recovering from a
recent bottleneck (fig. S2). In the time tree,
BV3 diverged independently from other line-
ages of agrobacteria (Fig. 1). However, its rela-
tionships in the time tree are incongruent
with those of the phylogenetic tree (fig. S1).
Plasmid identities
We next addressed the difficult task of re-
solving the relationships of the oncogenic
plasmids. Because plasmid genes do not meet
the assumptions of traditional phylogenetic
methods, we used multiple approaches to com-
pare and cross-validate findings. A total of
143 oncogenic plasmid sequences were iden-
tified. We analyzed different levels of genetic
features, including signatures of subsequences
of length k (k-mers), a 43-core gene phylogeny,
a phylogenetic network derived from 144
single-copy genes present in at least 40% of
all plasmids, and patterns of composition as
well as organization. To our surprise, the re-
sults were consistent and allowed us to cat-
egorize the molecules into just nine distinct
lineages of six Ti (type I–VI) and three Ri
(type I–III) oncogenic plasmids (Figs. 2 and 3,
figs. S3 to S9, and data S6 and S7). Type I and
type IV Ti plasmids were further divided into
two and three subtypes, respectively.
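As an illustration of the k-mer signature idea described above, the following minimal sketch reduces each sequence to its set of k-mers and compares sets with the Jaccard index. It is not the pipeline used in this study; the sequences, the plasmid names, the value of k, and the choice of the Jaccard index are all assumptions made for the example.

```python
from itertools import combinations


def kmer_set(seq: str, k: int) -> set[str]:
    """Return the set of all length-k subsequences (k-mers) in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared k-mers divided by all distinct k-mers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy sequences standing in for plasmids (hypothetical, not real data).
plasmids = {
    "pA": "ATGCGTACGTTAGC",
    "pB": "ATGCGTACGTTAGG",
    "pC": "TTTTCCCCGGGGAAAA",
}

k = 4  # assumed k-mer length, chosen only to suit the short toy sequences
signatures = {name: kmer_set(seq, k) for name, seq in plasmids.items()}

# Pairwise similarity for every unordered pair of plasmids.
similarity = {
    (x, y): jaccard(signatures[x], signatures[y])
    for x, y in combinations(sorted(signatures), 2)
}
```

Because such a comparison does not depend on alignment or on a shared gene order, it can be applied to molecules that have undergone extensive recombination, which is why it is useful as one of several cross-validating approaches.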
In the k-mer network, most types segregated into
distinct graphs (Fig. 2A). However,
type IV.c Ti plasmids are high-degree nodes
that connect type I Ti plasmids to the other
subtypes of type IV Ti plasmids. In the phylo-
genetic network, there were many splits, yet
plasmids clustered via short edges into mono-
phyletic clades (Fig. 2B). This topology is
consistent with extensive and ancient recom-
bination events prior to divergence of plas-
mid lineages. The two networks show that
the type I Ti plasmids are most closely related
to type IV plasmids, which is consistent with
their similarities in structure and gene com-
position (fig. S10 and data S8). Other relation-
ships were also observed in the phylogenetic
network. Type I Ti plasmids are related to the
type VI Ti plasmid (Fig. 2B). Type II plasmids
are nested within the type III Ti plasmids and
are more closely related to each other than
to other types of Ti plasmids. Type V Ti plas-
mids are the most distinct of the sequenced
Ti plasmids. Type II and type III Ri plasmids
are more closely related to each other than to
type I Ri plasmids.
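The network reading above (types segregating into distinct graphs, with some plasmids acting as high-degree connecting nodes) can be sketched as follows. This is a hedged illustration only: the similarity values, the node labels, and the threshold of 0.5 are hypothetical and are not taken from the data of this study.

```python
def build_network(similarity, threshold):
    """Return an adjacency map linking pairs whose similarity >= threshold."""
    graph = {}
    for (a, b), score in similarity.items():
        # Register both nodes so that unlinked plasmids still appear.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def components(graph):
    """Group nodes into connected components with a depth-first search."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical pairwise similarities between representative plasmids.
similarity = {
    ("I.a", "IV.c"): 0.6,
    ("IV.a", "IV.c"): 0.7,
    ("IV.b", "IV.c"): 0.7,
    ("I.a", "IV.a"): 0.2,
    ("I.a", "V"): 0.05,
    ("IV.c", "V"): 0.05,
}

graph = build_network(similarity, threshold=0.5)  # threshold is an assumption
degree = {node: len(neighbors) for node, neighbors in graph.items()}
groups = components(graph)
```

In this toy network the node labeled IV.c has the highest degree and joins I.a to the other type IV subtypes, whereas V forms a component of its own.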
Within this sequenced dataset, there are
relationships among plasmids, bacterial spe-
cies, and plant hosts, some of which had not
been previously noted and could potentially
be of use in biotechnology applications of
agrobacteria (Fig. 2C and fig. S3). BV1 has the
most diverse spectrum of oncogenic plasmids,
with four types of Ti plasmids and type I Ri
plasmids. Type III Ti plasmids are exclusively
in BV1. BV2 tends toward having only one of
the two minor variants of type I.a Ti plasmids
(fig. S15A). However, some have type II and
type IV.c Ti plasmids as well as type III Ri
plasmids. BV3 strains exclusively carry type
IV.a, type IV.b, and type V Ti plasmids. Strains
carrying type I Ti plasmids were isolated predominantly
from woody plants, whereas those
with type III Ti plasmids were exclusively from
herbaceous plants. With the exception of only
two strains, those belonging to BV2 were cul-
tured from woody plants. BV3 strains were
exclusively cultured from grapevine, which is
also host to BV1 and BV2 strains.
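Associations of this kind can be tabulated from strain metadata. The sketch below counts strains per biovar and plasmid type and then lists the plasmid types found in a single biovar only. The records are invented for illustration and do not reproduce the dataset of this study; the field order (strain, biovar, plasmid type, host category) is likewise an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical strain records: (strain, biovar, plasmid type, host category).
records = [
    ("s1", "BV1", "I.a", "woody"),
    ("s2", "BV1", "III", "herbaceous"),
    ("s3", "BV1", "III", "herbaceous"),
    ("s4", "BV2", "I.a", "woody"),
    ("s5", "BV2", "II", "woody"),
    ("s6", "BV1", "II", "herbaceous"),
    ("s7", "BV3", "V", "grapevine"),
]

# Cross-tabulation: number of strains for each (biovar, plasmid type) pair.
counts = Counter((biovar, ptype) for _strain, biovar, ptype, _host in records)

# Collect the set of biovars in which each plasmid type occurs.
carriers = defaultdict(set)
for _strain, biovar, ptype, _host in records:
    carriers[ptype].add(biovar)

# A plasmid type is "exclusive" when it occurs in exactly one biovar.
exclusive = {
    ptype: next(iter(biovars))
    for ptype, biovars in carriers.items()
    if len(biovars) == 1
}
```

The same grouping applied to the host column instead of the biovar column would give the corresponding plasmid-by-host summary.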
There are different degrees of variation with-
in each type of oncogenic plasmid. We com-
bined gene synteny, gene annotation, and
sequence data for a comprehensive compar-
ison of patterns of variation in plasmid types
(Fig. 3A, figs. S5 to S9, and data S9). Type I.a
and type II Ti plasmids are relatively con-
served in gene composition and structure.
Type I.b and type III Ti plasmids are more
diverse, with gene presence/absence variation
present in T-DNAs, opine-associated loci, and
regions flanking the vir loci. Despite the lower
sampling depth, a range in gene presence/
absence was also observed among the Ri plasmids.
Across all but type II and type V Ti plasmids,
higher variation in gene composition is
present within T-DNAs and proximal to the
right border. Variation extends to the region
neighboring the right border of T-DNAs in
type I.b and type III Ti plasmids and all types
Weisberg et al., Science 368, eaba5256 (2020) 5 June 2020 2 of 8
Fig. 1. Time-calibrated phylogenetic tree of the agrobacteria-rhizobia complex. Blue horizontal bars
indicate confidence intervals for each split. Clades I to IV are defined in fig. S1. Key groups are labeled;
three-letter codes are for select species of rhizobia or agrobacteria without biovar classifications: Run
(R. undicola), Ala (A. larrymoorei), Ask (A. skierniewicense), Aru (A. rubi), and Aar (A. arsenijevicii); synonyms
are listed in data S1. Each strain is colored according to its species-level classification. Two groups
(black bars) within BV1 are not associated with established genomospecies.
RESEARCH | RESEARCH ARTICLE