Computational Systems Biology Methods and Protocols.7z

kernel parameters for each data type beforehand [27]. In one biological application, high-throughput screens for mRNA, miRNA, and proteins were jointly analyzed using factor analysis combined with linear discriminant analysis (LDA) to identify the molecular characteristics of cancer [22]. For characterizing biological networks in particular, the JointCluster algorithm finds sets of genes that cluster well in multiple networks of interest, such as co-expression networks summarizing correlations among the expression profiles of genes and physical networks describing protein-protein and protein-DNA interactions among genes or gene products [28]. To produce a comprehensive view of a given disease from diverse types of genome-wide data, similarity network fusion (SNF) draws on the theoretical multi-view learning framework: it constructs a network of samples (e.g., patients) for each data type and fuses these networks into one network that represents the sample patterns underlying the data [102]. Recently, a new framework called "pattern fusion analysis" (PFA) has been proposed to perform automated information alignment and bias correction and to fuse local sample patterns (e.g., from each data type) into a global sample pattern corresponding to phenotypes (e.g., across most data types). In particular, PFA can identify common and complementary sample patterns from different omics profiles by optimally adjusting the effects of each data type based on the local tangent space alignment (LTSA) theory [103].

3. Matrix-based integration model. Previously, the integrative
scheme of the ping-pong algorithm was proposed to integrate
more than one type of data from the same biological samples;
it relies on co-modules that describe coherent patterns
across paired datasets [29]. More generally, these methods
fall into several classes according to the type of matrix
decomposition applied. The first is a joint (nonnegative)
matrix factorization technique that projects multiple types
of genomic data onto a common coordinate system, in which
heterogeneous variables weighted highly in the same projected
direction form a multidimensional module (md-module) [21].
The second is higher-order generalized singular value
decomposition (GSVD), which is designed for efficient,
parameter-free, and reproducible identification of network
modules simultaneously across multiple conditions [104, 105].
The third is rank matrix factorization used as multi-view
bi-clustering to model subtyping and recognize subtype-specific
features simultaneously, e.g., integrating mutational and
expression data while taking into account the clonal
properties of carcinogenesis [30].
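The joint nonnegative factorization idea described above can be illustrated with a minimal sketch. This is not the implementation from [21]: it assumes each data type is a nonnegative samples-by-features matrix over the same samples, uses standard multiplicative updates for the Frobenius objective, and selects module members by a z-score cutoff. The function names (joint_nmf, joint_error, md_modules) and the parameters k, n_iter, and z are hypothetical choices made for illustration.

```python
import numpy as np

def joint_nmf(matrices, k, n_iter=200, eps=1e-9, seed=0):
    """Factor each X_i (samples x features_i) as W @ H_i with a shared W.

    Assumption: all matrices are nonnegative and share the same rows (samples).
    """
    rng = np.random.default_rng(seed)
    n_samples = matrices[0].shape[0]
    W = rng.random((n_samples, k)) + eps
    Hs = [rng.random((k, X.shape[1])) + eps for X in matrices]
    for _ in range(n_iter):
        # Update the shared basis using evidence from every data type.
        numer = sum(X @ H.T for X, H in zip(matrices, Hs))
        denom = W @ sum(H @ H.T for H in Hs) + eps
        W *= numer / denom
        # Update each data-type-specific coefficient matrix.
        Hs = [H * (W.T @ X) / (W.T @ W @ H + eps)
              for X, H in zip(matrices, Hs)]
    return W, Hs

def joint_error(matrices, W, Hs):
    """Sum of squared reconstruction errors over all data types."""
    return sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(matrices, Hs))

def md_modules(Hs, z=2.0):
    """For each of the k directions, list the features of every data type
    whose weight lies more than z standard deviations above the row mean.
    The z-score cutoff is an illustrative selection rule."""
    modules = []
    for j in range(Hs[0].shape[0]):
        members = []
        for H in Hs:
            row = H[j]
            scores = (row - row.mean()) / (row.std() + 1e-12)
            members.append(np.flatnonzero(scores > z))
        modules.append(members)
    return modules
```

Here the shared matrix W plays the role of the common coordinate system, and each entry of the list returned by md_modules groups the highly weighted variables from every data type along one projected direction. In practice the rank k and the cutoff z would need to be tuned to the data.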

Integrative Analysis of Omics Big Data 123
