Computational Systems Biology Methods and Protocols.7z

from experiments using different experimental parameters (e.g.,
sequencing length and depth) or quality control criteria. This hinders
the integration of these datasets for data mining or knowledge
discovery.
To further control data quality, both sample reduction and
feature reduction (Subheading 4) typically need to be performed
after the normalization process. One commonly used criterion is to
remove samples with library sizes below a threshold. Another
criterion is to remove samples with a high mitochondrial RNA
content (usually 30% or above). Such samples are usually assumed
to be cells undergoing apoptosis [6], but a recent study did not
support this assumption. In that study, Lin Liu et al. analyzed a
colon cancer scRNA-seq dataset (SRA: SRP113436) and found that
several cells with a high mitochondrial RNA content were not likely
to be undergoing apoptosis (Fig. 1). These cells were identified as
intact by microscopic examination, and two of them were
CD133-positive (CD133+). The CD133 protein is often used as a
marker for CSCs.
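
The two sample-reduction criteria described above can be sketched as a simple per-cell filter. This is a minimal illustration in Python with NumPy, not the procedure used in the cited study. The function name `filter_cells`, the default library-size threshold of 100,000 reads, and the toy values are assumptions made for this example; only the 30% mitochondrial cutoff comes from the text.

```python
import numpy as np

def filter_cells(library_size, mito_prop,
                 min_library_size=100_000,  # assumed threshold, not from the text
                 max_mito_prop=0.30):       # "usually 30% or above" are removed
    """Return a boolean mask that is True for cells passing both criteria."""
    library_size = np.asarray(library_size, dtype=float)
    mito_prop = np.asarray(mito_prop, dtype=float)
    if library_size.shape != mito_prop.shape:
        raise ValueError("library_size and mito_prop must have the same shape")
    big_enough = library_size >= min_library_size  # criterion 1: library size
    low_mito = mito_prop < max_mito_prop           # criterion 2: mitochondrial RNA
    return big_enough & low_mito

# Toy data for four hypothetical cells.
library_size = [250_000, 40_000, 180_000, 300_000]
mito_prop = [0.05, 0.10, 0.45, 0.30]
mask = filter_cells(library_size, mito_prop)
print(mask.tolist())  # [True, False, False, False]
```

Given the observation that some cells with high mitochondrial RNA content appear intact, the inverse of the second criterion may be better treated as a flag for manual inspection than as grounds for automatic removal.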

3 Normalization Methods


Raw gene expression data from high-throughput technologies (e.g.,
microarray or RNA-seq) must be normalized to remove technical
variation so that meaningful biological comparisons can be made.
Currently, normalization of both bulk RNA-seq and scRNA-seq data
considers only the technical variation caused by RNA capture
efficiency, cDNA amplification bias, sequencing depth, batch
effects, etc. However, it is also necessary to remove biological
variation that is not of interest, which can be confounded with the
biological variation of interest.
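
As one concrete illustration of removing a single technical factor, the sketch below rescales each cell's counts by its library size (counts per million). This is a commonly used simple scaling chosen here for illustration only; it addresses sequencing depth and none of the other sources of variation listed above, and it is not presented as the method recommended by the chapter. The function name `counts_per_million` and the toy matrix are assumptions.

```python
import numpy as np

def counts_per_million(counts):
    """Scale a genes x cells count matrix so each cell (column) sums to 1e6."""
    counts = np.asarray(counts, dtype=float)
    lib_size = counts.sum(axis=0)  # per-cell library size (column sums)
    if np.any(lib_size == 0):
        raise ValueError("every cell must have at least one read")
    return counts / lib_size * 1e6  # broadcasts lib_size across the gene rows

# Toy matrix: 3 genes x 2 cells. The second cell was sequenced twice as
# deeply but has the same relative expression profile as the first.
counts = np.array([[10, 20],
                   [30, 60],
                   [60, 120]])
cpm = counts_per_million(counts)
print(cpm)
```

After scaling, the two columns of the toy matrix are identical, which shows that the depth difference between the cells has been removed.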

Fig. 1 Total, cellular, nuclear, and mitochondrial RNA. Library size represents total RNA including ERCC RNA
and cellular RNA. The latter includes nuclear RNA and mitochondrial RNA. Cellular RNA represents the total
count of reads aligned to the nuclear genome (nuclear RNA) and mitochondrial genome (mitochondrial RNA).
Cellular RNA proportion is the proportion of cellular RNA to library size. Mitochondrial proportion is the
proportion of mitochondrial RNA to cellular RNA. Red solid circles represent single cells from tumor tissues.
Hollow circles represent single cells from control tissues. Single cells (in the upper right rectangle)
containing high-content cellular RNA and mitochondrial RNA were not likely to undergo apoptosis.
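
The quantities defined in this caption can be computed per cell from three read counts. The following is a minimal sketch that follows the caption's definitions; the function name `rna_proportions`, its argument names, and the toy counts are assumptions made for illustration, and every cell is assumed to have at least one cellular read.

```python
import numpy as np

def rna_proportions(ercc, nuclear, mito):
    """Compute library size and the two proportions defined in the caption.

    ercc, nuclear, mito: per-cell counts of reads aligned to ERCC spike-ins,
    the nuclear genome, and the mitochondrial genome, respectively.
    """
    ercc = np.asarray(ercc, dtype=float)
    nuclear = np.asarray(nuclear, dtype=float)
    mito = np.asarray(mito, dtype=float)
    cellular = nuclear + mito             # cellular RNA = nuclear + mitochondrial
    library_size = ercc + cellular        # total RNA = ERCC + cellular
    cellular_prop = cellular / library_size  # cellular RNA proportion
    mito_prop = mito / cellular              # mitochondrial proportion
    return library_size, cellular_prop, mito_prop

# Toy counts for two hypothetical cells.
library_size, cellular_prop, mito_prop = rna_proportions(
    ercc=[20, 50], nuclear=[60, 30], mito=[20, 20])
print(library_size, cellular_prop, mito_prop)
```

Note that the mitochondrial proportion is taken relative to cellular RNA, not to library size, so it is unaffected by the amount of ERCC spike-in reads.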


314 Shan Gao
