Nature - USA (2019-07-18)


Letter
https://doi.org/10.1038/s41586-019-1369-y


scSLAM-seq reveals core features of transcription dynamics in single cells


Florian Erhard^1,6, Marisa A. P. Baptista^1,6, Tobias Krammer^2,6, Thomas Hennig^1, Marius Lange^3,4, Panagiota Arampatzi^5,
Christopher S. Jürges^1, Fabian J. Theis^3,4, Antoine-Emmanuel Saliba^2
& Lars Dölken^1,2*


Single-cell RNA sequencing (scRNA-seq) has highlighted the
important role of intercellular heterogeneity in phenotype
variability in both health and disease^1. However, current scRNA-seq
approaches provide only a snapshot of gene expression and convey
little information on the true temporal dynamics and stochastic
nature of transcription. A further key limitation of scRNA-seq
analysis is that the RNA profile of each individual cell can be
analysed only once. Here we introduce single-cell, thiol-(SH)-linked
alkylation of RNA for metabolic labelling sequencing (scSLAM-seq),
which integrates metabolic RNA labelling^2, biochemical nucleoside
conversion^3 and scRNA-seq to record transcriptional activity
directly by differentiating between new and old RNA for thousands
of genes per single cell. We use scSLAM-seq to study the onset of
infection with lytic cytomegalovirus in single mouse fibroblasts. The
cell-cycle state and dose of infection deduced from old RNA enable
dose–response analysis based on new RNA. scSLAM-seq thereby
both visualizes and explains differences in transcriptional activity
at the single-cell level. Furthermore, it depicts ‘on–off’ switches and
transcriptional burst kinetics in host gene expression with extensive
gene-specific differences that correlate with promoter-intrinsic
features (TBP–TATA-box interactions and DNA methylation).
Thus, gene-specific, and not cell-specific, features explain the
heterogeneity in transcriptomes between individual cells and the
transcriptional response to perturbations.
SLAM-seq^3 involves briefly exposing cells to the nucleoside analogue
4-thiouridine (4sU). 4sU is incorporated into new RNA during transcription
and converted to a cytosine analogue using iodoacetamide
(IAA) before RNA sequencing. Sequencing reads originating from
new RNA can be identified within the pool of total RNA reads on the
basis of characteristic U-to-C conversions. We applied the SLAM-seq
technique to resolve the onset of lytic mouse cytomegalovirus (CMV)
infection at the single-cell level. After optimization for single-cell
sequencing (scSLAM-seq) (Fig. 1, Supplementary Methods), we performed
scSLAM-seq on 107 single mouse fibroblast cells and in parallel
analysed global transcriptional changes of matched, larger (1 × 10^5)
populations of cells (n = 2) using (bulk) SLAM-seq. After quality
filtering for cells with more than 2,500 detectable genes (Extended Data
Fig. 1a), the remaining samples (49 CMV-infected, 45 uninfected cells)
displayed all the characteristics of high-quality scSLAM-seq libraries
(Extended Data Fig. 1b), including U-to-C conversion rates of between
4% and 6% (Extended Data Fig. 1c, d). Incorporation of 4sU is thus
both efficient and uniform at the single-cell level.
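The read-level signal described above can be illustrated with a minimal sketch. It assumes an ungapped alignment of a read to the reference on the same strand, where a converted 4sU appears as a C opposite a reference T. The function name and the example sequences are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Tuple


def count_t_to_c(reference: str, read: str) -> Tuple[int, int]:
    """Return (conversions, convertible) for an ungapped, same-strand alignment.

    convertible: reference positions that are T (U in the RNA)
    conversions: those positions at which the read shows C instead
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have equal length")
    convertible = 0
    conversions = 0
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        if ref_base == "T":
            convertible += 1
            if read_base == "C":
                conversions += 1
    return conversions, convertible


# Hypothetical 10-nt example: five reference T positions, two read as C.
conversions, convertible = count_t_to_c("ATTGCTTAGT", "ATCGCTTAGC")
print(conversions, convertible)  # 2 5
```

A single T-to-C mismatch can also arise from a sequencing error, so counts of this kind are only the input to a statistical model and not a classification of the read by themselves.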
Owing to rates of 4sU incorporation of about 1 in 50–200 nucleotides,
up to 50% of all SLAM-seq reads that originate from new RNA
may not contain U-to-C conversions. To overcome this problem, we
developed ‘globally refined analysis of newly transcribed RNA and
decay rates using SLAM-seq’ (GRAND-SLAM), a Bayesian method
to compute the ratio of new to total RNA (NTR) in a fully quantitative
manner including credible intervals^4 (Fig. 1). Here we report
GRAND-SLAM 2.0 for the parallel analysis of hundreds of SLAM-seq
libraries derived from single cells. The accuracy of quantification is further
improved by analysing long reads (150 nucleotides) in paired-end
mode (see Supplementary Methods), which allows 4sU conversions to
be reliably distinguished from sequencing errors within the overlapping
sequences (Extended Data Fig. 1c, d). We obtained accurate measurements
(90% credible interval < 0.2) for thousands of genes per cell,
thereby approaching the overall sensitivity of scRNA-seq (Extended
Data Fig. 1e) and achieving high correlation (R > 0.73) with bulk
SLAM-seq (Extended Data Fig. 1f).
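The reasoning behind the "up to 50%" figure, and the idea of estimating the NTR from a mixture of converted and unconverted reads, can be sketched as follows. This is an illustrative simplification and not the GRAND-SLAM implementation: it uses a maximum-likelihood grid search in place of a Bayesian posterior with credible intervals, and the conversion rate `p_new`, the error rate `p_err` and the uridine counts are hypothetical values chosen for the example.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of k conversions among n uridines at per-site rate p."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def prob_no_conversion(n_uridines: int, p_incorporation: float) -> float:
    """Chance that a read from new RNA carries no conversion at all."""
    return (1.0 - p_incorporation) ** n_uridines


def estimate_ntr(reads, p_new, p_err, grid_size=1001):
    """Grid maximum-likelihood NTR for a two-component binomial mixture.

    reads: list of (conversions, uridines) tuples, one per read
    p_new: per-uridine conversion rate in new RNA (assumed known here)
    p_err: apparent conversion rate in old RNA, from sequencing errors
    """
    best_ntr, best_ll = 0.0, -math.inf
    for i in range(grid_size):
        ntr = i / (grid_size - 1)
        ll = 0.0
        for k, n in reads:
            lik = ntr * binom_pmf(k, n, p_new) + (1.0 - ntr) * binom_pmf(k, n, p_err)
            ll += math.log(lik)
        if ll > best_ll:
            best_ntr, best_ll = ntr, ll
    return best_ntr


# Assumed: a read covering 35 uridines, incorporation at 1 in 50.
print(round(prob_no_conversion(35, 1 / 50), 2))  # about 0.49
```

Because a read without conversions still has a non-zero probability under the new-RNA component, the mixture assigns it fractional weight instead of counting it as old, which is why a model of this kind can recover the NTR even when many new reads are unconverted.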
Unbiased principal component analysis (PCA) of highly variable
cellular genes (see Supplementary Methods, Extended Data Fig. 1g)
could not separate CMV-infected from uninfected cells for either total
RNA or old RNA, and only slightly for new RNA (Fig. 2a, Extended
Data Fig. 1h–j). Intercellular heterogeneity thus exceeded the virus-induced
changes, which are hardly detectable in total RNA by
two hours post-infection (h.p.i.), owing to the slow turnover of mammalian
mRNAs (see Extended Data Fig. 2a–d, Supplementary Methods
and Supplementary Table 1). By contrast, PCA on the NTR separated
uninfected from infected cells with high precision (Fig. 2a) and
demonstrated a clear positive correlation with the extent of viral gene
expression (Pearson’s correlation coefficient R = 0.59, P = 7.3 × 10^−6)
(Extended Data Fig. 1j).
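The argument that slow mRNA turnover masks early changes in total RNA can be made concrete with a short calculation. The sketch assumes a simple model that is not stated in the text: constant synthesis and first-order decay at steady state, with a step change in synthesis rate at time zero. The two-hour window, eight-hour half-life and four-fold induction are hypothetical numbers.

```python
def baseline_new_fraction(labelling_hours: float, half_life_hours: float) -> float:
    """Fraction of a transcript pool made within the window at steady state."""
    return 1.0 - 2.0 ** (-labelling_hours / half_life_hours)


def total_rna_fold_change(new_fraction: float, synthesis_fold_change: float) -> float:
    """Fold change in total RNA when only synthesis changes at time zero.

    Old RNA (1 - new_fraction) is unaffected; only the new part scales.
    """
    return (1.0 - new_fraction) + new_fraction * synthesis_fold_change


# Hypothetical numbers: 2 h window, 8 h half-life, 4-fold induction.
fraction = baseline_new_fraction(2.0, 8.0)
print(round(fraction, 3))                              # about 0.159
print(round(total_rna_fold_change(fraction, 4.0), 2))  # about 1.48
```

Under these assumptions a four-fold increase in synthesis appears as less than a 1.5-fold change in total RNA, whereas it is fully visible in new RNA, which is consistent with the better separation obtained from the NTR.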
Recent work showed that intronic reads from scRNA-seq data
can be used to estimate time derivatives of gene expression in individual
cells, termed ‘RNA velocities’^5. These indicate the future trajectory
of individual cells in low-dimensional projections of gene-expression
space. However, infected cells could not be separated from uninfected
cells by an unbiased PCA computed on the respective RNA velocities,
or on the expression profiles projected into the future using the
velocities, or directly on intron/exon ratios (Extended Data Fig. 3a).
To compare scSLAM-seq directly with RNA velocities computed for
a larger population of cells, we performed 10x Genomics Chromium
droplet-based scRNA-seq on hundreds of uninfected (n = 793) and
CMV-infected (n = 353) cells using the same experimental conditions.
Although PCA on mature transcripts (exonic reads only) did not separate
uninfected and infected cells, the distinction was possible using
intron/exon ratios (Extended Data Fig. 3b). However, no meaningful
directionalities were observed in the RNA velocities of either the scSLAM-seq
or the 10x data (Fig. 2b). We used new and total RNA levels
obtained by scSLAM-seq to replace intronic and exonic read levels
and determine ‘NTR velocities’. Notably, these further discriminated
infected from non-infected cells (Fig. 2c).
To compare NTRs and RNA velocity directly, we asked which of
them could best predict whether a gene was upregulated or downregulated
in large cell populations. Although this was possible to some
extent using RNA velocities computed from dozens or hundreds
of cells in scSLAM-seq or 10x data, respectively (area under receiver
operating characteristic curve (AUC) values of 0.68 and 0.74), they were

^1 Institute for Virology and Immunobiology, Julius-Maximilians-University Würzburg, Würzburg, Germany. ^2 Helmholtz Institute for RNA-based Infection Research (HIRI), Helmholtz-Center for
Infection Research (HZI), Würzburg, Germany. ^3 Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany.
^4 Department of Mathematics, Technische Universität München, Munich, Germany. ^5 Core Unit Systems Medicine, University of Würzburg, Würzburg, Germany. ^6 These authors contributed equally:
Florian Erhard, Marisa A. P. Baptista, Tobias Krammer. *e-mail: [email protected]; [email protected]; [email protected]
18 JULY 2019 | VOL 571 | NATURE | 419
