of Ri plasmids. In these plasmids, the re-
gion encodes proteins necessary for trans-
port and catabolism of opines. As previously
noted, noncore genes of most plasmid types
are highly variable. The tra and trb loci of
type I–III Ti plasmids have extensive small-
scale changes.
Plasmid evolution
The range in variation across types of onco-
genic plasmids suggests that each of the types
has been shaped by different evolutionary
processes. We modeled plasmid histories to
understand these processes and infer origins
and relationships. In doing so, a constant theme
emerged: Core genes used to characterize
oncogenic plasmids not only have low phylo-
genetic value but are in fact the major con-
tributors to their variation. Their conservation
among plasmids provides regions for recombi-
nation and reshuffling of genes to occur. We
therefore analyzed plasmid types at different
scales to infer patterns of evolution. Features
of oncogenic plasmids were also accounted
for separately because of their different evolu-
tionary histories.
Different levels of modularity and varia-
tion in gene analogs contribute to oncogenic
plasmids maintaining their function while
adopting diverse gene composition and structure.
The vir locus is central to virulence of
agrobacteria yet is prone to exchange among
plasmid types. Results from k-mer and phylogenetic
analyses were not consistent with
grouping on the basis of locus structure or with
relationships among plasmid types (Fig. 4A
and figs. S11 and S12A). The vir loci of type I.a,
type I.b, type II, type III, and type VI Ti plas-
mids as well as type II and type III Ri plasmids
share a recent common ancestor and follow
one of the paths in the k-mer graph. Those of
type IV and type V Ti plasmids and type I Ri
plasmids share a recent common ancestor
and follow the second path.
The vir locus is often acquired as separate
modules. There are two vir loci in type IV.c Ti
plasmids. The primary vir locus is hypothesized
to have been acquired from a Ri plasmid.
By itself, this locus is likely not functional because
of multiple transposase genes interrupting
virA, which encodes a regulator of the vir
genes (fig. S12B). The second vir locus is a remnant
that encompasses only tzs, virA, virB1,
virB2, and virB3 and has overall identity of
93% to the vir locus of type I Ti plasmids. We
speculate that the virA allele of this fragment
is necessary to complement the disrupted
allele of the primary vir locus. In addition,
virE1-2 and GALLS genes are interchangeable
submodules that are frequently inherited
separately from the rest of the vir locus. This
was observed for virE1-2 genes among type
IV–VI Ti and type II Ri plasmids (fig. S12).
Likewise, the vir locus of type III Ri plasmids,
despite having a common ancestor with that
of type I Ti plasmids, carries GALLS instead
of virE1-2.
T-DNAs are extraordinarily diverse in size
as well as composition and are highly re-
combinogenic (Fig. 4B, fig. S13, and data S6).
T-DNA transfer is largely dependent on a
short right border sequence and a flanking
overdrive enhancer sequence (fig. S14) (23–25).
Border sequences are practically invariant
among 213 T-DNAs, and this strict conser-
vation of short sequences is a low barrier for
generating alternative T-DNA–vir combinations
and multiplexing T-DNAs in plasmids
(fig. S14, A and B, and data S6). Diversifica-
tion can be driven by gene gain and loss from
T-DNAs with little to no consequence to the
transformation process. Chimerization is an-
other frequently permitted mechanism of
diversification (figs. S15 and S16). The orig-
inal T-DNA of the type I.a Ti plasmid (e.g.,
pTiC58) was extended, from left to right, by
invasion of two additional T-DNAs. The last
of the T-DNAs is one of two prominent var-
iants that invaded and swept through the
type I–IV Ti plasmids, potentially because of
a selective advantage over others. This is the
path from acs through the 6b gene to the right
border that cuts prominently across the gene
synteny graph. The type III Ti plasmid has two
left borders in T-DNA-1 because a second
T-DNA displaced all but the most left-flanking
acs gene of the original T-DNA. T-DNA-2 of
the type VI Ti plasmid has three right border
sequences and two sets of homologous, but
nonparalogous, oncogenes. This T-DNA follows
a complex path in the graph.
However, within this dataset, there is little
evidence for exchange of T-DNAs between
classes of oncogenic plasmids. With the ex-
ception of T-DNA-2 of type III Ri plasmids,
T-DNAs of the Ti and Ri classes have distinct
gene compositions and segregate into differ-
ent regions of the gene synteny graph prior
Weisberg et al., Science 368, eaba5256 (2020) 5 June 2020 3 of 8
Fig. 2. Nine distinct lineages of oncogenic plasmid types. (A) Weighted undirected network of oncogenic
plasmids. Nodes represent individual oncogenic plasmids and are colored according to the type and subtype
of Ti (top) and Ri (bottom) plasmids. Darker edges indicate greater Jaccard similarity of k-mer signatures.
(B) A split network of the oncogenic plasmids. Branch thickness indicates relative support for the split.
Key reference strain plasmids are indicated. (C) Maximum likelihood tree constructed on the basis of
concatenated sequences from 43 single-copy core genes (data S7). Tips (circles and triangles) are color-
coded according to plasmid type. Colored panels in the top row below the tree denote the type of plant from
which strains were cultured. Colored panels in the bottom row indicate the classification of the strains.
The same color scheme for plasmid types is used in each of the three panels. The tree is midpoint-rooted.
See fig. S3 for a more detailed and larger version of (C).
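
The edge weights described for panel (A) can be illustrated with a minimal sketch. It assumes exact k-mer sets computed from raw sequence strings. This is a simplification for illustration only: this section does not state the software, the k-mer length, or the signature scheme used for the figure. The function names, the toy sequences, and the `min_similarity` threshold are hypothetical.

```python
from __future__ import annotations

from itertools import combinations


def kmer_set(sequence: str, k: int) -> set[str]:
    """Return the set of all overlapping k-mers in a sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_edges(
    sequences: dict[str, str], k: int, min_similarity: float = 0.0
) -> list[tuple[str, str, float]]:
    """Build weighted undirected edges (name_a, name_b, weight).

    Each node is one sequence (here standing in for a plasmid); an edge is
    kept only if its Jaccard similarity exceeds min_similarity (an assumed,
    illustrative cutoff).
    """
    signatures = {name: kmer_set(seq, k) for name, seq in sequences.items()}
    edges = []
    for name_a, name_b in combinations(sorted(signatures), 2):
        weight = jaccard(signatures[name_a], signatures[name_b])
        if weight > min_similarity:
            edges.append((name_a, name_b, weight))
    return edges


# Toy input, not real plasmid data.
toy = {"p1": "ACGTAC", "p2": "ACGTAC", "p3": "TTTTTT"}
edges = similarity_edges(toy, k=3)
```

In this toy case, only `p1` and `p2` share k-mers, so they are the only pair joined by an edge. For real plasmid sequences, a sketching method over both strands would likely be needed to keep the comparison tractable.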