Personalized_Medicine_A_New_Medical_and_Social_Challenge

(Barré) #1

4 Integration Methods


As seen in Sect. 3.1, advances in experimental techniques have generated large
amounts of data describing biological systems from different aspects. Each data
source contains an important “slice” of information about the system. Analyzing
each of these layers in isolation from the others has already yielded important
biological insights. Fusion of these data is expected to lead to more sophisticated
insights, because each type of molecular network captures only one aspect of
cellular organization, whereas it is the interaction between these networks that
makes the system function (see Fig. 1). For example, proteins physically interacting
in a PPI network are more likely to have correlated expression profiles, i.e., to have
an edge in the coexpression network as well.^99 On the other hand, genetic
interactions do not necessarily coincide with physical interactions between the
corresponding gene products.^100 Because the layers carry complementary
information, integration of the GI network with the PPI network and other
molecular networks has been shown to be beneficial for many biological problems,
from uncovering new associations between diseases^101 to finding new relations
between GO terms.^102
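As a minimal sketch of what it means for network layers to overlap only partially, the following Python snippet compares toy edge sets. The gene names and edges are invented for illustration, and the Jaccard index is our choice of overlap measure, not one prescribed by the studies cited above.

```python
def canonical(edges):
    """Treat edges as undirected: store each one as a frozenset of its endpoints."""
    return {frozenset(e) for e in edges}


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| of two edge sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy layers over genes A..E (not real data).
ppi = canonical([("A", "B"), ("B", "C"), ("C", "D")])
coexpression = canonical([("B", "A"), ("B", "C"), ("D", "E")])
genetic = canonical([("A", "D"), ("D", "E")])

# In this toy example the PPI layer shares edges with the coexpression
# layer but none with the genetic interaction layer.
ppi_coex = jaccard(ppi, coexpression)
ppi_gi = jaccard(ppi, genetic)
```

In this toy setting the PPI and coexpression layers share half of their combined edges, while the PPI and GI layers share none, which mirrors the qualitative pattern described in the text.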
Furthermore, integration improves the quality of the knowledge inferred from the
data and reduces the influence of noise present in individual data sources. As
previously mentioned, many techniques in experimental biology produce noisy
and incomplete data due to sampling and other biases in data acquisition,
management, and interpretation.^103 Data integration is therefore a way to
overcome some of these disadvantages, to increase the coverage of the data, and to
improve the performance of predictions. In the next sections, we describe commonly
used machine learning techniques that have proven successful in extracting new
biological information by integrating different cellular network layers.
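The trade-off between coverage and noise can be illustrated with a deliberately simple voting scheme. This is a sketch under our own assumptions and is not one of the machine learning methods described in the following sections: the function `integrate`, its `min_support` threshold, and the toy layers are all hypothetical.

```python
from collections import Counter


def integrate(layers, min_support=2):
    """Return (union, consensus) for a dict mapping layer name -> edge list.

    union     : every undirected edge seen in any layer (higher coverage)
    consensus : edges seen in at least `min_support` layers (less noise)
    """
    support = Counter()
    for edges in layers.values():
        # Count each undirected edge at most once per layer.
        support.update({frozenset(e) for e in edges})
    union = set(support)
    consensus = {e for e, n in support.items() if n >= min_support}
    return union, consensus


# Hypothetical toy layers (not real data).
layers = {
    "ppi": [("A", "B"), ("B", "C"), ("C", "D")],
    "coexpression": [("B", "A"), ("B", "C"), ("D", "E")],
    "genetic": [("A", "D"), ("D", "E")],
}
union, consensus = integrate(layers)
```

The union contains more edges than any single layer, while the consensus set keeps only edges supported by at least two independent sources. Probabilistic frameworks replace this hard threshold with weighted evidence.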


4.1 Bayesian Networks for Data Integration


Bayesian networks (BNs) belong to the class of statistical learning models. They
are used to model relationships in the data by representing probabilistic dependence
between random variables in a weighted graph with weights corresponding to
probabilities.^104 This representation enables modeling of biological data that
captures their noisy and stochastic nature. BNs are the most commonly used framework


(^99) Ge et al. (2003).
(^100) Mani et al. (2008).
(^101) Žitnik et al. (2013).
(^102) Gligorijević et al. (2014).
(^103) de Silva et al. (2006) and Wodak et al. (2009).
(^104) Yu et al. (2011).
Computational Methods for Integration of Biological Data 157
