Computational Systems Biology Methods and Protocols.7z


Recently, the noncoding portion of the genome, conventionally
called its “dark matter”, has also attracted attention as a source
of many previously unknown regulatory factors. The first class is
the miRNAs: the miRBase [47] database publishes the predicted
hairpin portion of each miRNA transcript, with information on the
location and sequence of the mature miRNA. The second is the
lncRNAs: lncRNAdb [48] is a manually curated reference database
that aims to capture a large proportion of the literature
describing functions of individual eukaryotic lncRNAs. The third is
DNA methylation: NGSmethDB [49] is a repository of single-base
whole-genome methylome maps for the best-assembled eukaryotic
genomes and for reliable, high-quality methylomes; meanwhile,
MethylomeDB [50] is an expert database containing genome-wide DNA
methylation profiles of human and mouse brain specimens, both
generated in-house and collected from third-party publications.
More recently, as the understanding of the central dogma has
developed, metabolism, as the outcome of regulation, has been
found to reflect more phenotype-associated genetic information.
For example, the Human Metabolome Database (HMDB) [51] is a free
database of small-molecule metabolites found in humans; it contains
or links chemical data, clinical data, and molecular
biology/biochemistry data, and can be applied in biomarker
discovery. Similarly, EBI metagenomics [52] is a freely available
hub for the storage and analysis of WGS-sequenced metagenomic and
metatranscriptomic data; it also provides a standardized analysis
workflow that produces rich taxonomic diversity and functional
annotations with high consistency across different types of data.
In addition, from a systems viewpoint, the associations or
interactions among all biological elements can be summarized and
abstracted in network form, which has inspired network biology
[53–57]. Integrative resources of such biological network
knowledge can be obtained from several public databases, such as:


The Biological General Repository for Interaction Datasets
(BioGRID) [58], which is an open-access database dedicated to the
annotation and archival of protein, genetic, and chemical
interactions for all major model organism species and humans,
curated by reviewing the biomedical literature.


The STRING database [59], which aims to provide a critical
assessment and integration of protein-protein interactions,
including direct (physical) as well as indirect (functional)
associations, in particular protein-protein associations inferred
from co-expression data.


KEGG [60], which is an encyclopedia of genes and genomes designed
to assign functional meanings to genes and genomes at both the
molecular and network levels, in the form of molecular
interactions, reactions, and relations.
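
To make the network abstraction concrete, the following minimal Python sketch links genes whose expression profiles are strongly correlated and stores the result as an adjacency map. It is only an illustration under stated assumptions: the gene names, expression values, the 0.9 threshold, and the helper names `pearson` and `coexpression_network` are all hypothetical, and the procedure is not the method used by STRING or any other database listed above.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_network(expression, threshold=0.9):
    """Return an adjacency dict linking genes with |r| >= threshold."""
    network = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            # Interactions are treated as undirected edges.
            network[g1].add(g2)
            network[g2].add(g1)
    return network


# Hypothetical toy expression profiles (4 samples per gene); not real data.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with geneA
    "geneD": [1.0, 3.0, 1.0, 3.0],  # only weakly correlated with the rest
}

network = coexpression_network(toy_expression)

# Node degree (number of neighbors) is a basic network-biology statistic.
degrees = {gene: len(neighbors) for gene, neighbors in network.items()}
```

In this toy example, geneA, geneB, and geneC form a fully connected triangle, while geneD remains isolated because its correlation with the other genes falls below the threshold. Real resources integrate many evidence types beyond co-expression, so the sketch shows only how associations can be abstracted into a network.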


Integrative Analysis of Omics Big Data 113