Computational Systems Biology Methods and Protocols.7z

experiments for read alignment, transcript quantification, quality
control, normalization, and downstream modeling. For some of
these steps, pipelines and tools that have been developed for bulk
RNA-seq data can be reused. However, some important single-cell-specific
aspects and pitfalls need to be considered.

2.1.1 Read QC and Alignment


Read QC and alignment is the first computational step in analyzing
RNA-seq data sets, and scRNA-seq is no exception. In general, most
of the methodology developed for bulk RNA-seq, including tools for
mapping the raw sequencing reads such as TopHat [118], can be reused
for scRNA-seq. As with bulk RNA-seq reads, however, it is important
to consider biases such as incomplete knowledge of the target genome
or transcriptome annotation [119]. In scRNA-seq protocols
specifically, spike-in RNAs such as ERCC [120] or unique molecular
identifiers (UMIs) [121] are commonly used to help reduce technical
variation and produce more accurate quantification. If synthetic
spike-in RNAs are used, the reference genome should be augmented
with the DNA sequences of the spike-in molecules before mapping. If
UMIs are used, the barcode attached to each read should be removed
before alignment. If both are used in conjunction, care must be
taken to ensure that the sequences at the ends of the spike-ins are
complete; otherwise, the expression level of the spike-ins will be
underestimated.
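
The two preprocessing steps described above, augmenting the reference
with spike-in sequences and removing the UMI barcode before alignment,
can be illustrated with a minimal Python sketch. It is not taken from
any particular pipeline. The function names (augment_reference,
strip_umi), the toy sequences, and the assumption that the UMI is a
fixed-length prefix of the read are all hypothetical; in many
protocols the UMI sits in a separate read or at a protocol-specific
position, so the slicing would need to be adapted.

```python
from typing import Dict, Tuple


def augment_reference(genome: Dict[str, str],
                      spike_ins: Dict[str, str]) -> Dict[str, str]:
    """Return a new reference holding the genome contigs plus the spike-ins.

    Both arguments map a sequence name to its DNA sequence. Name clashes
    are rejected so that a spike-in never silently replaces a contig.
    """
    clash = set(genome) & set(spike_ins)
    if clash:
        raise ValueError(f"duplicate sequence names: {sorted(clash)}")
    combined = dict(genome)      # copy, so the input is left untouched
    combined.update(spike_ins)   # spike-ins are appended after the contigs
    return combined


def strip_umi(seq: str, qual: str, umi_len: int) -> Tuple[str, str, str]:
    """Split a read into (umi, trimmed sequence, trimmed qualities).

    Assumption: the UMI occupies the first `umi_len` bases of the read.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if umi_len < 0 or len(seq) <= umi_len:
        raise ValueError("read is too short to contain the UMI and an insert")
    return seq[:umi_len], seq[umi_len:], qual[umi_len:]


# Toy data only: these are not real genome or ERCC sequences.
genome = {"chr1": "ACGTACGTAC"}
spikes = {"spike_toy_1": "GGGTTTCCCA"}
reference = augment_reference(genome, spikes)

# The UMI is kept so that it can be used later, at the counting step.
umi, trimmed_seq, trimmed_qual = strip_umi(
    "AACCGGTTACGTACGT", "IIIIIIIIFFFFFFFF", umi_len=8
)
```

Only the trimmed sequence would be passed to the aligner, while the
combined reference would be written out (for example as FASTA) and
indexed in place of the plain genome.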

2.1.2 Read Counting

The mapped reads can be summarized to generate read counts
using the same approaches that are applied in conventional
RNA-seq, such as HTSeq [122]. When UMIs are used, these


Fig. 3 Flowchart of scRNA-seq data analyses. The first steps (orange) are
general for any high-throughput sequencing data. Later steps (blue) require a
mix of existing RNA-seq analysis methods and novel methods to address the
technical differences of scRNA-seq. The biological interpretation (red) should
be performed with methods specifically developed for scRNA-seq.

336 Yungang Xu and Xiaobo Zhou
