RESEARCH ARTICLE
[Figure: Extended Data Fig. 9, panels a to i. Labels recoverable from the figure:
a, single-cell colony genotyping (n = 94) of MF05 for SF3B1, CALR and NFE2;
b, average genotyping efficiency (%) against mean per 10,000 UMI per cell and distance from gene end, for linear GoT and circularization GoT: CALR (85%), XBP1 (80%), NFE2 (60%), SF3B1 (24%), JAK2 (7%);
c, log(number of cases) against distance from closest mRNA end (5′ or 3′) for oncogenes, tumour-suppressor genes and passengers, with the loci targeted by GoT marked;
d, ONT read-analysis schematic: determine read direction (by scanning both ends); trim P5, Read#1 primer and PCR#3 handle; extract sequences for cell barcodes, UMI and region of interest; then (1) identify reads with expected priming, (2) identify cell barcodes within whitelist, (3) replace cell barcodes that are not within whitelist, (4) deduplicate reads, (5) analyse reads with cell barcodes also present in the 10X scRNA-seq data;
e, genomic position around the target (SF3B1 c.1894C>G);
f, 50 bases at 5′ end mapped to (i) SF3B1 (97.8%) or (ii) other transcripts (2.2%);
g, density of pairwise difference of read lengths in duplicate reads;
h, WT and MUT genotype assignment for CALR, linear GoT versus circularization GoT;
i, mutant fraction per cell, circularization GoT against linear GoT (R^2 = 0.92).]

Extended Data Fig. 9 | Deciphering subclonal progenitor identities
using multiplex GoT, and targeting loci that are distant from transcript
ends using circularization GoT. a, Single-cell cloning assay of peripheral
blood cells from patient MF05 (Methods). b, Rate of targeted locus
capture (per cent) as a function of gene expression and the distance of the
targeted locus from the transcript ends. c, Distance of the mutation locus
from transcript ends for pan-cancer drivers, and their frequencies (based
on the number of times they are reported in the COSMIC database).
Mutations are annotated as oncogenes, tumour-suppressor genes or
passengers (as previously defined^60,61). The relative density of each subclass
of mutations by distance from the closer end (that is, 3′ or 5′) is shown in the top
panel. d, Schematic of analysis of ONT sequencing reads. e, Frequency
of SF3B1-mutant and wild-type reads of linear GoT amplicon library
sequenced with ONT. f, Analysis of SF3B1 amplicon reads sequenced by
ONT for inter-transcript PCR recombination by mapping 50 bp at the
opposite end of the targeted locus, showing that only 2.2% of fragments
reflect inter-transcript recombination. g, Pairwise difference of
read lengths for duplicate reads (that is, reads with the same cell barcode
and UMI) of the SF3B1 amplicon library sequenced with ONT, showing
consistent read lengths of duplicate reads that support a low rate of intra-
transcript PCR recombination. h, Comparison of genotype assignment
for CALR in sample MF01 between linear GoT and circularization GoT
after downsampling reads to 300,000 with 10 iterations (n = 320 cells).
i, Comparison of CALR-mutant UMI fraction per cell in sample MF01
between linear GoT and circularization GoT after downsampling reads to
300,000 with 10 iterations (n = 320 cells, Pearson’s correlation, F-test).
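
The downsampled comparison in panels h and i can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input layout (one `(cell_barcode, umi, call)` tuple per read), the majority-vote collapse of duplicate reads, the skipping of tied UMIs, the averaging across iterations and the function names are all assumptions made for illustration. Only the downsampling depth and the 10 iterations come from the caption.

```python
import random
from collections import Counter, defaultdict


def mutant_fraction_per_cell(reads):
    """Return {cell_barcode: fraction of UMIs called mutant}.

    reads: iterable of (cell_barcode, umi, call), call being "MUT" or "WT".
    """
    # Collapse duplicate reads to one call per (cell, UMI).
    # Majority vote is an assumption, not the published rule.
    votes = defaultdict(Counter)
    for cell, umi, call in reads:
        votes[(cell, umi)][call] += 1

    mutant = Counter()
    total = Counter()
    for (cell, _umi), counts in votes.items():
        if counts["MUT"] == counts["WT"]:
            continue  # tied UMI: skipped here (assumption)
        total[cell] += 1
        if counts["MUT"] > counts["WT"]:
            mutant[cell] += 1
    return {cell: mutant[cell] / total[cell] for cell in total}


def downsampled_mutant_fraction(reads, n_reads=300_000, n_iter=10, seed=0):
    """Average the per-cell mutant UMI fraction over repeated downsampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    reads = list(reads)
    sums = defaultdict(float)
    hits = Counter()
    for _ in range(n_iter):
        # Sample without replacement; keep everything if the library is small.
        if len(reads) <= n_reads:
            sample = reads
        else:
            sample = rng.sample(reads, n_reads)
        for cell, frac in mutant_fraction_per_cell(sample).items():
            sums[cell] += frac
            hits[cell] += 1
    # Average only over iterations in which the cell was observed.
    return {cell: sums[cell] / hits[cell] for cell in sums}
```

Running `downsampled_mutant_fraction` once per library (linear and circularization) would give two per-cell vectors that can then be correlated over the shared cell barcodes, as in panel i.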