Computational Systems Biology Methods and Protocols.7z


3.1.4 Application in Disease-Focused Analysis


Machine learning techniques are becoming an alternative approach
in medical diagnosis and prognosis. For disease diagnosis, several
applications have been reported. Because interstitial lung diseases
share similar radiological and histopathological characteristics,
a molecular test was built to distinguish usual interstitial
pneumonia from other interstitial lung diseases in surgical lung
biopsy samples and to select the patients who have to undergo
surgery [67]. To overcome a large imbalance of negative versus
positive cases (see Note 1), an ensemble-based approach, rather
than a single classifier, was constructed with bagging; a simple
majority vote achieved a small positive effect on the accuracy
rate, depending on the disease studied [68]. To deal with redundant
information and improve classification, a gene selection method,
Recursive Feature Addition, was proposed to determine the final
optimal gene set for disease prediction and classification [69].
To complement physicians’ subjective experience in anticipating
skeletal-related events (SREs), machine learning models (e.g., LR,
DT, and SVM) were used to assess the associations of clinical
variables for predicting SRE risk groups, and they ranked the
visual analog scale (VAS) as a key factor [70].

For disease prognosis, an
ensemble classifier based on many logistic regression classifiers
was applied to integrate mutation status with whole transcriptomes
for high-performing prediction of NF1 inactivation in glioblastoma
(GBM), with implications for targeted therapies and personalized
medicine [71]. To assess response earlier in the treatment regimen,
and so avoid tumors becoming no longer surgically resectable,
Bayesian logistic regression was trained on the available clinical
and quantitative MRI data to distinguish breast cancer responders
from nonresponders after the first cycle of treatment [72]. To
tailor the prescription of prophylactic inguinal irradiation (PII),
that is, to decide whether or not to deliver PII in the treatment
of anal cancer patients, machine learning-based models (e.g.,
logistic regression, J48, random tree, and random forest) used a
large set of clinical and therapeutic variables to obtain better
performance [73]. To capture the deeper molecular basis of clinical
heterogeneity or specific therapeutic targets in clinical outcome
models, supervised learning prediction methods are required to
delineate patients within specific risk categories who are likely
to be cured or to die of their disease [9].

Computational methods in drug discovery are also
accelerating drug-target prediction. Based on sequence-derived
protein features, the most commonly used machine learning methods
have been applied to predict whether a protein is druggable, and
feature selection procedures were used to find the number of
features at which each classifier performs best [74]. Because
identifying disease genes among the candidates remains
time-consuming and expensive by conventional means, ProDiGe, a new
algorithm for Prioritization of Disease Genes, implements a new
machine learning strategy based on learning from positive and
unlabeled examples [75].
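
The combination of bagging and a simple majority vote described for imbalanced case data [68] can be illustrated with a minimal Python sketch. It is not the implementation used in [68]. The nearest-centroid base learner, the balanced bootstrap (equal draws from each class), the toy data, and all function names are assumptions made to keep the example self-contained.

```python
import random
from collections import Counter


def fit_centroid(X, y):
    """Hypothetical base learner: store one mean vector per class."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_centroid(model, x):
    """Return the class whose centroid is closest to x."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(model[c], x))
    return min(model, key=sqdist)


def balanced_bagging(X, y, n_models=25, seed=0):
    """Train n_models learners, each on a class-balanced bootstrap sample.

    Assumption: every class is resampled (with replacement) to the size
    of the smallest class, so the majority class cannot dominate a bag.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, yi in enumerate(y):
        by_class.setdefault(yi, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    models = []
    for _ in range(n_models):
        sample = []
        for idx in by_class.values():
            sample += [rng.choice(idx) for _ in range(n_min)]
        models.append(fit_centroid([X[i] for i in sample],
                                   [y[i] for i in sample]))
    return models


def majority_vote(models, x):
    """Simple (unweighted) majority vote over the ensemble."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data (made up): 5 positive cases against 50 negative cases.
X_pos = [[3.0 + 0.1 * k, 3.0 - 0.1 * k] for k in range(5)]
X_neg = [[0.1 * (i % 10) - 0.5, 0.1 * (i // 10) - 0.2] for i in range(50)]
X = X_pos + X_neg
y = [1] * len(X_pos) + [0] * len(X_neg)

models = balanced_bagging(X, y)
label_near_pos = majority_vote(models, [3.0, 3.0])
label_near_neg = majority_vote(models, [0.0, 0.0])
```

An odd number of models avoids ties in the two-class vote. In practice the base learner would be replaced by whatever classifier suits the data; only the resampling and voting steps are the point of the sketch.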

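Learning from positive and unlabeled examples, the setting named for ProDiGe [75], can be sketched in Python as follows. This is a generic bagging-style scheme and should not be read as ProDiGe's actual algorithm: the centroid-based scoring rule, the bag size, the toy gene features, and every function name are assumptions made for illustration. In each round a random subset of the unlabeled candidates is treated as provisional negatives, and the remaining candidates are scored by how much closer they lie to the known positives than to that subset.

```python
import random


def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sqdist(a, b):
    """Squared Euclidean distance."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def pu_bagging_scores(positives, unlabeled, n_rounds=50, seed=0):
    """Average out-of-bag score for each unlabeled example.

    A higher score means the example looks more like the known positives.
    Assumption: each bag of provisional negatives has the same size as
    the positive set.
    """
    rng = random.Random(seed)
    k = len(positives)
    pos_c = centroid(positives)
    totals = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    for _ in range(n_rounds):
        bag = set(rng.sample(range(len(unlabeled)), k))
        neg_c = centroid([unlabeled[i] for i in bag])
        for i, x in enumerate(unlabeled):
            if i not in bag:  # score only examples left out of the bag
                totals[i] += sqdist(x, neg_c) - sqdist(x, pos_c)
                counts[i] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Toy data (made up): 4 known disease genes and 12 unlabeled candidates,
# the last two of which resemble the known positives.
known = [[2.0, 2.0], [2.2, 1.9], [1.8, 2.1], [2.1, 2.2]]
candidates = [[0.1 * (i % 5) - 0.2, 0.1 * (i // 5) - 0.05] for i in range(10)]
candidates += [[1.9, 2.0], [2.1, 1.8]]

scores = pu_bagging_scores(known, candidates)
ranking = sorted(range(len(candidates)), key=lambda i: -scores[i])
top_two = set(ranking[:2])
```

Sorting candidates by the averaged score gives a prioritized list, which is the form of output a gene prioritization method is expected to produce.
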
192 Xiang-tian Yu et al.
