Computational Systems Biology: Methods and Protocols

Preface


With the rapid development of high-throughput technologies, such as next-generation
sequencing and single-cell sequencing, many tough biomedical questions can now be answered
because it is finally possible to get a whole picture of the biological system. Complex
diseases, such as tuberculous meningitis and leukemia, involve dysfunctions on multiple
levels, including DNA variants, differential mRNA expression, and protein fluctuations.
Accurately measuring these molecules is the first step toward understanding the biological
system.
But even when all of these multi-omics data are available, the bioinformatics analysis of
such big data remains very challenging. Two types of analysis are used to decipher the
mechanisms hidden behind biomedical big data. The first is machine learning. It can analyze
various features and build a predictive model that predicts the response of a biological
system to a perturbation or classifies samples into subtypes. In recent years, one family of
machine learning methods, deep learning, has become extremely popular and is now a powerful
tool for big data analysis.
The other effective method is network analysis based on graph theory. Networks are how
we understand the complex world. A network starts from a node, and a connection in real
life is abstracted as an edge. The network can grow fast and become more and more complex.
Eventually, it exhibits unique properties that reflect the complex system it represents.
Networks have inspired the development of many algorithms, such as the neural network in
deep learning. In biomedicine, a network is a wonderful way of integrating diverse big data
and transforming biological questions into mathematical questions, especially graph theory
questions. Visualizing a large-scale network can give us a sense of the network, but the
resulting hairy ball cannot by itself give us the information we are interested in, such as
which genes are the key drivers and which genes are novel disease genes or possible drug
targets. Graph theory empowers network analysis to see the hidden truth underneath the
hairy ball.
In this book, we introduce the latest experimental and bioinformatics methods for DNA
sequencing, RNA sequencing, cell-free tumor DNA sequencing, single-cell sequencing, and
single-cell proteomics and metabolomics. Then, we review advanced analysis methods,
such as genome-wide association studies (GWAS), machine learning, reconstruction and
analysis of gene regulatory networks, and differential coexpression network analysis, and
give a practical guide on how to choose and use the right algorithm or software to handle
specific high-throughput data or multi-omics data. A powerful novel RNA-seq data analysis
and visualization tool, iSeq, is released in this book. The last parts of the book present
applications of these high-throughput technologies and advanced analysis methods to
complex diseases, such as tuberculous meningitis and leukemia.
We hope that after reading this book, readers will understand how biomedical big data
are generated, which tools can be used to process them, which advanced machine
learning and network analysis methods are available for data integration and knowledge
discovery, and what has been achieved to date.


Shanghai, China Tao Huang

