Nature - USA (2019-07-18)


ARTICLE RESEARCH


(that is, the triple mutant) did not further increase cell-cycle activation.
Thus, multiplexed GoT demonstrates the ability to examine complex
clonal structures, as well as the need to assess the combinatorial tran-
scriptional output of mutations in the context of the high-resolution
mapping of cell identity.


Circularization GoT targets distant loci
GoT amplicon recovery depends not only on the expression level of
the gene but also on the distance of the mutation locus from transcript
ends (Extended Data Fig. 9b). Capture efficiency of a mutation that is
distant from the 3′ end (>1.5 kb) (for example, SF3B1 genotyping of
9% of cells, see Fig. 5a) was lower than for targets closer to the 3′ end.
Although driver mutations are often found within 1.5 kb of one of
the transcript ends (Extended Data Fig. 9c), loci of interest may reside
at larger distances—and thus the dependency on relative proximity
to transcript ends is limiting. We reasoned that the lower genotyping
efficiency resulted at least in part from the inability of larger amplicon
fragments to cluster efficiently on Illumina flow cells during sequenc-
ing. We further integrated our protocol with long-read sequencing
using nanopore GridION X5 (Oxford Nanopore Technology), which
demonstrated that SF3B1 transcripts were captured accurately with our
procedure (Extended Data Fig. 9d, e) and further confirmed low intra-
and inter-transcript PCR recombination rate, even for these relatively
large fragments (Extended Data Fig. 9f, g).
To overcome the limitation of amplicon fragment length, we applied
sequential rounds of circularization and inverse PCR to remove the
intervening sequence between the region of interest and the cell
barcode, resulting in a fragment length compatible with short-read
sequencing (Fig. 5c). Circularization GoT showed a high concordance
with un-circularized GoT (that is, the standard linear GoT technique)
for CALR genotyping (Extended Data Fig. 9h, i). When applied to the
capture of SF3B1 mutations, circularization GoT markedly increased
the yield of genotyped cells from 750 to 2,004 cells (9% to 24% of cells)
(Fig. 5d, e). These results demonstrate the ability of circularization GoT
to extend our reach to targets that are distant from gene ends.
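The effect of one circularization and inverse PCR round can be illustrated with a toy string model. This is only a sketch: the function name, the index arguments and the example sequences are hypothetical, and real amplicons also carry the cloning handles and primer sites shown in Fig. 5c, which are omitted here.

```python
def circularize_and_inverse_pcr(amplicon: str, locus_end: int, barcode_start: int) -> str:
    """Toy model of one circularization + inverse PCR round.

    amplicon      : linear molecule laid out as locus | intervening | barcode
    locus_end     : index at which the region of interest ends
    barcode_start : index at which the barcode-bearing segment begins
    """
    if not 0 < locus_end <= barcode_start < len(amplicon):
        raise ValueError("expected 0 < locus_end <= barcode_start < len(amplicon)")
    # Intramolecular ligation joins the last base to the first, so on the
    # circle the barcode segment sits directly next to the locus.
    # Outward-facing primers then amplify across that junction, and the
    # sequence lying between the two primers is left out of the product.
    return amplicon[barcode_start:] + amplicon[:locus_end]


# Hypothetical sequences, chosen only to make the lengths easy to follow.
locus = "ACGTTGCA"
intervening = "N" * 2000
barcode = "GATTACAGATTACA"

linear = locus + intervening + barcode
product = circularize_and_inverse_pcr(linear, len(locus), len(locus) + len(intervening))
print(len(linear), len(product))
```

In this model the product retains the barcode and the region of interest but none of the intervening sequence, which is the property that makes the fragment short enough for short-read sequencing.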
To further demonstrate the ability of circularization GoT to genotype
efficiently even when mutations are at a considerable distance from a
transcript end, we targeted JAK2V617F (which is located about 2.3 kb
from the closer transcript end). We first validated circularization GoT
for JAK2 via a mixing experiment using barcoded cDNA from the TF1
cell line (wild-type JAK2) and from HEL cells (homozygous JAK2V617F),
which showed accurate genotype assignment (Fig. 5f). Next, we geno-
typed primary CD34+ cells from an individual with JAK2V617F essential
thrombocythaemia, and obtained genotyping information for 7.3% of
cells (Fig. 5g) even for this gene, which is expressed at very low levels.
Mutant-cell frequency was higher in MkPs than in HSPCs, whereas the
mutant-cell frequency remained low in erythroid progenitors (Fig. 5h).
This is concordant with the clinical phenotype of essential thrombo-
cythaemia rather than polycythaemia vera (a disease that is associated
with the same JAK2 mutation, but which is characterized by erythro-
cytosis as the leading abnormality). Consistent with this pattern,
we observed a trend towards increased MkP priming in mutant HSPCs
(P = 0.04, Wilcoxon rank-sum test; Fig. 5i, Supplementary Table 2),
albeit in a small number of genotyped HSPCs. These data suggest a
skewing of differentiation towards megakaryopoiesis in HSPCs and
may provide insights into the isolated megakaryocytic proliferation in
JAK2-mutated essential thrombocythaemia.
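The genotyping yields quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes the cell totals printed in Fig. 5 (8,475 cells for MF05; 3,990 cells for ET09, of which 97 were wild type and 193 mutant); the helper name is hypothetical.

```python
def genotyped_percent(genotyped: int, total: int) -> float:
    """Percentage of cells for which a genotype was obtained."""
    return 100.0 * genotyped / total


# SF3B1 in sample MF05: linear GoT versus circularization GoT.
sf3b1_linear = genotyped_percent(750, 8475)
sf3b1_circ = genotyped_percent(2004, 8475)
fold_increase = 2004 / 750

# JAK2 V617F in sample ET09: wild-type plus mutant cells were genotyped.
jak2_circ = genotyped_percent(97 + 193, 3990)

print(f"SF3B1 linear: {sf3b1_linear:.1f}%")   # 8.8%
print(f"SF3B1 circ:   {sf3b1_circ:.1f}%")     # 23.6%
print(f"fold change:  {fold_increase:.1f}x")  # 2.7x
print(f"JAK2 circ:    {jak2_circ:.1f}%")      # 7.3%
```

These values round to the 9%, 24% and 7.3% given in the text.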

Discussion
Here we present GoT, which captures both somatic genotypes and tran-
scriptomic identities in thousands of single cells from primary cancer
specimens. Building on previous experience with targeted amplifi-
cation in droplet-based scRNA-seq^44,45, GoT overcomes the unique
set of challenges presented by the genotyping of somatic mutations,
including lower expression levels and large distances from the end of
the sequenced transcripts.
GoT allowed us to directly investigate the transcriptional effect of
CALR mutations in primary samples of myeloproliferative neoplasm,
in which wild-type cells in the sample provide an ideal comparison set
that controls for patient-specific and technical confounders. In essential
thrombocythaemia, we observed that mutant CALR provided a greater
fitness advantage through differentiation, which was associated with
higher proliferation in committed myeloid progenitors than in uncom-
mitted HSPCs. The ability of GoT to finely map transcriptional differ-
ences between mutant and wild-type HSPCs revealed an upregulation
of NF-κB pathway genes in the most-undifferentiated mutant HSPCs,
which supports a cell-intrinsic role for CALR mutation in NF-κB acti-
vation^46. We further applied GoT to target the unconventional splice

[Figure 5, panels a–i: image not reproduced. Recoverable labels are listed below.]
a: Sample MF05, no. of cells = 8,475. Variants: SF3B1, CALR, NFE2 (VAFs shown: 47.5%, 33%, 43.5%). Clusters: HSPC, IMP, NP, MEP, MkP, EP, E/B/M, M/D, PreB. Legend: genotyped, not genotyped. Cell counts shown: SF3B1, n = 750 and n = 7,725; CALR, n = 6,263 and n = 2,212; NFE2, n = 5,037 and n = 3,438.
b: log2 (cell-cycle module) in MEP subclones; P = 0.004.
c: Steps of circularization GoT:
1 Non-fragmented 3′ cDNA library
2 Hemi-nested gene-specific PCR (no. 1)
3 Cloning-compatible ends
4 Intramolecular ligation and inverse PCR (no. 2)
5 Cloning-compatible ends
6 Intramolecular ligation and inverse PCR (no. 3)
7 Libraries ready for indexing and short-read sequencing
Legend: TSO handle; left cloning handle; barcode + UMI; region of interest; cDNA; R1 handle; right cloning handle; P5; R2 sequence (with RPI-x handle).
d: t-SNE projections for SF3B1; cell counts shown for circularization GoT: n = 2,004 and n = 6,471.
e: UMI counts per cell (axis ticks 0, 1, 4, 16, 64) for 10x, linear GoT, circularization GoT and ONT; SF3B1 gene versus targeted SF3B1 locus.
f: cDNA mixing study for circularization GoT (JAK2 cDNA, about 3 kb). Reads (%): TF1 barcodes, 98.4% WT and 1.6% V617F; HEL barcodes, 0.1% WT and 99.9% V617F.
g: Sample ET09, JAK2V617F, no. of cells = 3,990. WT (n = 97), MUT (n = 193), NA (n = 3,700).
h: Normalized mutant-cell frequency in HSPC, EP and MkP clusters; P values shown: P < 10^−10, P = 0.22.
i: Density of HSPCs (WT, n = 17; MUT, n = 15) along log2 (EP module) and log2 (MkP module); P values shown: P = 0.15, P = 0.044, P = 0.14.

Fig. 5 | GoT dissects subclonal identity through multiplexing and
targets loci that are distant from transcript ends via circularization.
a, Schematic of clonal evolution of neoplastic cells from sample MF05
(top left). t-SNE projections of CD34+ cells with cluster assignments
(top right) and with GoT data for each variant (bottom). b, Cell-cycle
score in subclonal megakaryocytic–erythroid progenitor populations
(n = 28 single-mutant, 109 double-mutant and 293 triple-mutant cells).
c, Schematic of circularization GoT. d, t-SNE projection of GoT data
for SF3B1 CD34+ cells from sample MF05, from circularization GoT
and linear GoT. e, UMIs per cell for SF3B1 gene (blue shade) or targeted
SF3B1 locus (pink shade) from 10x, linear GoT sequenced on Illumina,
circularization (circ) GoT and linear GoT sequenced with Oxford
Nanopore Technology (ONT) (n = 8,475 cells). f, Mixing study with
human JAK2 wild-type cDNA from the TF-1 cell line and homozygous
JAK2V617F cDNA from the HEL cell line. Frequency of reads (wild type,
V617F or not assignable) assigned to TF-1 or HEL cell barcodes. g, t-
SNE projection of CD34+ cells from a patient with JAK2V617F essential
thrombocythaemia, showing cluster assignment (left) and genotyping
information (right) based on GoT data. h, Normalized frequency of
mutant cells within the progenitor clusters (Methods). Mean ± s.d. of
n = 100 downsampling iterations. i, Density plots of HSPCs along lineage-
priming modules (n = 17 wild-type versus 15 mutant cells). P values for b,
h, i are from a two-sided Wilcoxon rank-sum test.
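For readers unfamiliar with the test named in the legend, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test using the tie-corrected normal approximation without continuity correction. It is not the authors' analysis code, and the function name and example scores are hypothetical; with samples as small as 17 versus 15 cells, statistical packages may instead compute an exact P value, so results can differ slightly.

```python
import math
from typing import Sequence, Tuple


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, two-sided P value) by normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Sort pooled values, remembering each value's original position.
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * n
    tie_term = 0.0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        # Tied values share the mean of the 1-based ranks they span.
        avg_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[pooled[k][1]] = avg_rank
        t = end - start + 1
        tie_term += t**3 - t
        start = end + 1
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a shift
    z = (u - mean_u) / math.sqrt(var_u)
    # Two-sided tail probability of the standard normal distribution.
    return u, math.erfc(abs(z) / math.sqrt(2))


# Made-up module scores for illustration only; these are not data from the paper.
wt_scores = [4.6, 4.9, 5.1, 5.0, 4.7]
mut_scores = [5.4, 5.6, 5.2, 5.8, 5.5]
u_stat, p_value = rank_sum_test(wt_scores, mut_scores)
print(u_stat, p_value)
```

The test compares the ranks of the pooled values rather than the values themselves, which is why it suits module scores whose distribution is not assumed to be normal.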


18 JULY 2019 | VOL 571 | NATURE | 359