Nature | Vol 584 | 20 August 2020 | E21

[Figure 1, panels a–c.
Panel a: schematic with two parts. Non-PCR-based hybrid-capture sequencing: read mapping of paired-end sequence reads to the reference genome (GRCh38) at the source APP site, showing the APP coding sequence start and end, clipped vector backbone sequence for the APP insert (vector contaminant), and clipped retrocopy target site sequence for an APP retrocopy; the vector (pGEM-T Easy) backbone sequence flanking the APP insert carries a 3′ T overhang and the restriction sites BstZI, NotI, EcoRI, SpeI and SacII. PCR-based assays (cloning, SMRT-seq) on NeuN+ neuronal nuclei with PCR primers targeting coding exons: no PCR product from source APP (sample DNA) owing to large gene size; the APP retrocopy (sample DNA) and the APP insert (vector contaminant, coding exons only) are indistinguishable.
Panel b: clipped read fraction (%) plotted against exons 1–18. Legend: estimated by APP vector clipped sequences; expected fraction estimated from the Lee study DISH experiment.
Panel c: gel electrophoresis (ladder 2,000, 1,000, 650, 400, 200; lanes 1–18, 1–18N and 2–17; APP-751 and APP-695) and sequence homology between two junctions for R6/17, R2/17, R6/18, R3/14, R3/17, R2/16, R2/14 and R1/14.]

Fig. 1 | APP vector contamination in the Lee study. a, APP vector contamination
and its manifestation in genome sequences. PCR-based assays in the Lee study^2
fail to distinguish between APP retrocopy and vector APP insert. Hybrid-capture
sequences from the Lee study show clipped reads with a vector backbone
sequence (pGEM-T Easy), including restriction sites at the multiple cloning site
and a 3′ T-overhang. b, Estimated fractions of cells with APP gencDNA at the exon
junctions in the Lee hybrid-capture data. All exon junction fractions (black dots)
are comparable to the fraction at the coding sequence ends with vector
backbone sequences (red dots). The dotted line above represents the
conservative estimate of expected fraction based on the Lee DISH experiment
(see Supplementary Methods); shaded area, 95% confidence interval.
c, Electrophoresis and sequencing of PCR products from the vector APP inserts
(APP-751/695) showing new APP variants as artefacts. Eight out of twelve IEJs
found both in our APP vector PCR sequencing and the Lee study RT–PCR results
are shown (Extended Data Fig. 3).