Computational Systems Biology Methods and Protocols.7z

Chapter 7

Integrative Analysis of Omics Big Data

Xiang-Tian Yu and Tao Zeng

Abstract

The diversity and sheer volume of omics data have brought biological and biomedical research and its
applications into a big data era, much like the one that swept through society at large a decade ago.
These data pose a new challenge: the shift from horizontal data ensembles (e.g., similar types of data
collected from different labs or companies) to vertical data ensembles (e.g., different types of data
collected for the same group of people with matched information). This shift requires integrative
analysis in biology and biomedicine and calls for the urgent development of data integration methods to
handle the major change from population-guided to individual-guided investigations.
Data integration is an effective strategy for solving complex problems and understanding complicated
systems. Several benchmark studies have revealed the heterogeneity and trade-offs that exist in the
analysis of omics data. Integrative analysis can combine and investigate many datasets in a
cost-effective and reproducible way. Current integration approaches for biological data fall into two
modes: a “bottom-up integration” mode with follow-up manual integration, and a “top-down integration”
mode with follow-up in silico integration.
This paper first summarizes combinatory analysis approaches, which suggest candidate protocols for
designing biological experiments for effective integrative studies in genomics. It then surveys data
fusion approaches, which offer guidance on developing computational models for detecting biological
significance; these approaches have also provided new data resources and analysis tools to support
precision medicine that depends on big biomedical data. Finally, open problems and future directions
for the integrative analysis of omics big data are highlighted.

Key words: Integration, Omics, High throughput, Big data, Complex diseases, Bayesian, Matrix decomposition, Machine learning, Subtype, Precision medicine

1 Introduction

High-throughput screening is one of the primary technologies for exploring complex intracellular dynamics in modern biology, and the data produced by such approaches are usually called omics data [1]. The most intuitive omics field, genomics, emerged from the Human Genome Project, which sought the blueprint of complete human genetic information; afterward, the transcriptome and proteome also became available for measuring the expression abundance of mRNA and protein, respectively [2]. Lately, the epigenomics was

Tao Huang (ed.), Computational Systems Biology: Methods and Protocols, Methods in Molecular Biology, vol. 1754,
https://doi.org/10.1007/978-1-4939-7717-8_7, © Springer Science+Business Media, LLC, part of Springer Nature 2018


