Personalized_Medicine_A_New_Medical_and_Social_Challenge


FLN for Saccharomyces cerevisiae by computing the likelihood that genes are
functionally linked based on information from heterogeneous data sources.
Specifically, they used the PPI network, functional linkage from the literature,
phylogenetic profiles, microarray data, and four other types of data. The resulting
network revealed many protein associations distinct from their physical
interactions. These associations were confirmed by the proteins' common
participation in KEGG pathways. A similar strategy was applied in the construction
of the human FLN by integrating 16 different genomic features.^112 This network was
further used to prioritize candidate genes for 110 diseases. The top-ranked genes
were postulated as potential candidates to explain the mechanisms of these diseases
and to develop appropriate therapies.
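
One common way to compute such a linkage likelihood from heterogeneous sources is a naive Bayes combination of per-source likelihood ratios. The sketch below illustrates that idea only; it is not necessarily the scheme used in the cited works. The function names, the likelihood ratios, and the prior odds are all hypothetical values chosen for illustration, and the sources are assumed to be conditionally independent.

```python
import math


def linkage_log_odds(prior_odds, likelihood_ratios):
    """Combine evidence as log prior odds plus the sum of log likelihood ratios.

    Assumes the data sources are conditionally independent (naive Bayes).
    """
    log_odds = math.log(prior_odds)
    for lr in likelihood_ratios:
        log_odds += math.log(lr)
    return log_odds


def posterior_probability(log_odds):
    """Convert log odds into a probability of functional linkage."""
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


# Hypothetical likelihood ratios for one gene pair, one per data source:
# P(evidence | linked) / P(evidence | not linked).
evidence = {
    "ppi": 8.0,
    "literature": 3.0,
    "phylogenetic_profile": 1.5,
    "coexpression": 2.0,
}
prior_odds = 0.01  # assumed prior odds that a random gene pair is linked

score = linkage_log_odds(prior_odds, evidence.values())
prob = posterior_probability(score)
print(f"log odds = {score:.3f}, P(linked | evidence) = {prob:.3f}")
```

Working in log space keeps the combination a simple sum, so adding a further data source only adds one more term to the score.
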
One of the pioneering works on the integration of clinical and genomic data via
Bayesian networks (BNs) is presented by Gevaert et al. (2006). The authors integrate
clinical data with microarray data (gene expression levels) to guide the clinical
management of cancer. Specifically, they use publicly available data on breast
cancer patients to classify them into good and poor prognosis groups. They
discuss three different integration strategies. Full (or early) data integration
combines the data sources into one data source on which the BN model is built.
Decision (or late) data integration builds a separate model for each data source,
and the resulting probabilities are linearly combined using weighting coefficients.
This late integration approach allows the relative importance of each data source
to be taken into account; the coefficients are trained using the model building
data set and randomization. Partial (or intermediate) data integration combines
data through inference of a joint model. In this case, the BN structure from the
clinical data and the BN structure from the microarray data are merged via a joint
variable: outcome. Parameter learning is then done on this combined structure.
The three methods are evaluated and compared by computing the area under the ROC
curve (AUC). The authors report that intermediate integration is the most
promising, leading to the best AUC value of 0.845.
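
The late integration step and the AUC comparison can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-patient probabilities and labels are made-up toy values, the function names are hypothetical, and a plain grid search over the weight stands in for the training procedure described in the text.

```python
def late_integration(p_clinical, p_expression, weight):
    """Linearly combine two models' probabilities; weight applies to the clinical model."""
    return [weight * pc + (1.0 - weight) * pe
            for pc, pe in zip(p_clinical, p_expression)]


def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_weight(p_clinical, p_expression, labels, grid=11):
    """Pick the weight on an evenly spaced grid in [0, 1] that maximizes AUC."""
    candidates = [i / (grid - 1) for i in range(grid)]
    return max(candidates,
               key=lambda w: auc(late_integration(p_clinical, p_expression, w), labels))


# Toy data (hypothetical): 1 = poor prognosis, 0 = good prognosis.
labels = [1, 1, 1, 0, 0, 0]
p_clin = [0.9, 0.4, 0.6, 0.5, 0.2, 0.3]   # probabilities from a clinical-data model
p_expr = [0.3, 0.8, 0.7, 0.6, 0.4, 0.1]   # probabilities from a microarray-data model

w = best_weight(p_clin, p_expr, labels)
combined = late_integration(p_clin, p_expr, w)
print("AUC clinical only:  ", auc(p_clin, labels))
print("AUC expression only:", auc(p_expr, labels))
print("AUC combined:       ", auc(combined, labels), "at weight", w)
```

In practice the weight would be chosen on the model building data and the AUC reported on held-out patients; selecting and scoring on the same toy set, as here, overstates performance.
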
A recent study demonstrates the great potential of a Bayesian approach for
predicting single nucleotide polymorphism (SNP) genotypes of individuals solely
from the expression data of genes associated with expression quantitative trait
loci (eQTLs) (specific locations on a chromosome responsible for regulating mRNA
expression).^113 An SNP is a DNA sequence variation occurring commonly within a
population (1 %) in which a single nucleotide (guanine (G), adenine (A), thymine
(T), or cytosine (C)) in the genome differs between individuals. SNP variations
have been shown to be crucial to the development of personalized medicine
because they can affect the development of a disease in an individual and the
response to pathogens, drugs, and chemicals. Therefore, having SNP genotypic
information on a patient, along with other clinical data, could be valuable.
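
The general idea of inferring a genotype from expression data can be illustrated with Bayes' rule: the posterior over genotypes is proportional to a genotype prior multiplied by the likelihood of the observed expression levels. The sketch below is a toy model under stated assumptions, not the method of the cited study: it assumes Gaussian expression levels with a shared standard deviation, independent genes, and invented means and priors; all names are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - (x - mean) ** 2 / (2.0 * sd * sd)


def genotype_posterior(expression, means, sd, prior):
    """Posterior P(genotype | expression) under independent Gaussian likelihoods."""
    log_post = {}
    for genotype, p in prior.items():
        log_lik = sum(gaussian_log_pdf(x, m, sd)
                      for x, m in zip(expression, means[genotype]))
        log_post[genotype] = math.log(p) + log_lik
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post.values())
    unnorm = {g: math.exp(v - peak) for g, v in log_post.items()}
    total = sum(unnorm.values())
    return {g: v / total for g, v in unnorm.items()}


# Hypothetical prior genotype frequencies at one SNP.
prior = {"AA": 0.49, "AG": 0.42, "GG": 0.09}
# Hypothetical mean expression of two eQTL-associated genes for each genotype.
means = {"AA": [1.0, 2.0], "AG": [1.5, 2.5], "GG": [2.0, 3.0]}

observed = [1.9, 3.1]  # one individual's expression of the two genes
posterior = genotype_posterior(observed, means, sd=0.3, prior=prior)
predicted = max(posterior, key=posterior.get)
print(posterior, "->", predicted)
```

Here the expression evidence outweighs the low prior for the rarer genotype, which is the behavior that makes expression data informative about genotype when eQTL effects are strong.
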


(^112) Linghu et al. (2009).
(^113) Schadt et al. (2012).
Computational Methods for Integration of Biological Data 161
