Personalized_Medicine_A_New_Medical_and_Social_Challenge

(Barré) #1

Once the matrix factors $G$ and $S$ (i.e., the block matrices $G_i$ and $S_{ij}$) have been computed, two
types of inference are possible. (a) The first type uses a reconstructed relation matrix,
$R_{ij}^{rec} = G_i S_{ij} G_j^T$, to predict new relations between objects of type $i$ and objects of
type $j$. This problem is known as the matrix completion problem, and it is analogous
to constructing a recommender system.^125 For example, in movie rating, the goal is
to predict the preference that a user would give to a movie based on a sample of
observed preferences. Similarly, some entries in the original relation matrix, $R_{ij}$, are
observed while others are missing, and the missing entries can be predicted from the reconstructed
relation matrix, $R_{ij}^{rec}$. (b) The second type uses a cluster indicator matrix, $G_i$, to
construct clusters and infer from them new associations between objects of the
same type.
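The two types of inference can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the procedure of any cited work: the factor values are hand-picked toy numbers rather than the output of a factorization, the function names are hypothetical, and assigning each object to the cluster with the largest entry in its row of $G_i$ is one simple convention for reading clusters off an indicator matrix.

```python
import numpy as np

def reconstruct(G_i, S_ij, G_j):
    """Return the reconstructed relation matrix R_ij^rec = G_i S_ij G_j^T."""
    return G_i @ S_ij @ G_j.T

def predict_missing(R_ij, observed_mask, R_rec):
    """Keep observed entries of R_ij; take unobserved entries from R_rec."""
    return np.where(observed_mask, R_ij, R_rec)

def cluster_assignments(G_i):
    """Assign each object (row of G_i) to the cluster with the largest value.

    Assumption: argmax over each row is used as the assignment rule.
    """
    return np.argmax(G_i, axis=1)

# Toy factors (hand-picked, illustrative only):
# 4 objects of type i and 3 objects of type j, each grouped into 2 clusters.
G_i = np.array([[1.0, 0.0],
                [0.9, 0.1],
                [0.0, 1.0],
                [0.2, 0.8]])
G_j = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [0.1, 0.9]])
S_ij = np.eye(2)  # cluster-to-cluster relation strengths

# (a) Matrix completion: the entry at (1, 0) is treated as unobserved.
R_ij = np.array([[1.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0],
                 [0.0, 1.0, 1.0],
                 [0.0, 1.0, 1.0]])
observed_mask = np.ones(R_ij.shape, dtype=bool)
observed_mask[1, 0] = False

R_rec = reconstruct(G_i, S_ij, G_j)
R_filled = predict_missing(R_ij, observed_mask, R_rec)

# (b) Clustering: objects of type i that share a cluster are candidate
# associations between objects of the same type.
clusters = cluster_assignments(G_i)
same_cluster = clusters[:, None] == clusters[None, :]
```

Here `R_filled[1, 0]` takes its value from the reconstruction, and `same_cluster[a, b]` is true when objects `a` and `b` of type $i$ fall into the same cluster.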
An excellent example of matrix factorization-based data integration, applied to
the problems of disease-disease association prediction and disease classification, is
presented by Žitnik et al. (2013). The authors integrated data on four different
types of biological objects: genes, diseases, GO terms, and drugs. They constructed three
relation matrices: gene-GO term annotations, drug-target relations, and
gene-disease relations. Molecular networks (PPI, metabolic, gene coexpression, and
cell signaling networks) were integrated as constraints into the integration
framework, along with drug-drug interaction data, Disease Ontology (DO) relations between
DO terms (diseases), and Gene Ontology (GO) relations between GO terms. From the
disease cluster indicator matrix, they grouped clusters into disease classes.
Diseases within these classes were shown to exhibit significant comorbidity.
Moreover, the obtained disease-disease associations were evaluated
against DO and recovered 80% of the known disease-disease associations. The authors
also estimated the influence of each data type on the model’s predictive


Fig. 5 The main idea of data integration via matrix factorization is illustrated on three data sets (i.e., molecular networks connected by external relations) whose cluster indicator matrices are $G_1$, $G_2$, and $G_3$. Matrix factor $G_2$, shown in red, is shared in the reconstruction of relation matrices $R_{12}$ and $R_{23}$
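
Written out with the reconstruction formula $R_{ij}^{rec} = G_i S_{ij} G_j^T$, the sharing described in the caption takes the following form. The approximation sign is used here as shorthand for "is reconstructed by" and is an assumption of this sketch, not notation taken from the figure.

```latex
\begin{align}
R_{12} &\approx G_1 S_{12} G_2^{T}, \\
R_{23} &\approx G_2 S_{23} G_3^{T}.
\end{align}
```

Because $G_2$ appears in both products, the clustering of the second object type has to be consistent with both relation matrices at once, which is what couples the data sets.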


(^125) Koren et al. (2009).
Computational Methods for Integration of Biological Data 169
