Computational Systems Biology Methods and Protocols.7z

gene interactions with disease-lncRNA relationships. Then, they
prioritized disease-related lncRNAs by using the random walk
with restart (RWR) algorithm. Notably, LncPriCNet still performs
well when information on known disease lncRNAs is insufficient.
The reason may be that it considers the global functional
interactions of the multi-level composite network. It is well known
that specific diseases are associated with specific tissues. Based on
this knowledge, Ganegoda et al. [51] presented a computational method,
KRWRH, to predict disease-lincRNA interactions based on phenotype
information and lincRNA tissue expression details. They used the
Gaussian interaction profile kernel to calculate the similarity of
diseases and of lincRNAs, respectively. Then, the random walk with
restart method was used to infer lincRNA-disease interactions.
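
The two building blocks named above, the Gaussian interaction profile kernel and the random walk with restart, can be sketched in a few lines. This is a minimal illustration, not the KRWRH or LncPriCNet implementation: it walks on a single lincRNA similarity network rather than a composite or heterogeneous one, and the toy association matrix, the restart probability of 0.7, and all function names are assumptions made for the example.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a 0/1 matrix.

    K[i][j] = exp(-gamma * ||row_i - row_j||^2), where gamma is gamma_prime
    divided by the mean squared norm of the rows.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    return [
        [math.exp(-gamma * sum((a - b) ** 2
                               for a, b in zip(profiles[i], profiles[j])))
         for j in range(n)]
        for i in range(n)
    ]


def column_normalise(weights):
    """Scale every column to sum to 1 so it acts as a transition matrix."""
    n = len(weights)
    col_sums = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    return [[weights[i][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(n)] for i in range(n)]


def random_walk_with_restart(weights, seeds, restart=0.7,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until p stabilises."""
    n = len(weights)
    w = column_normalise(weights)
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy data: rows are lincRNAs, columns are diseases,
# 1 marks a known association.
profiles = [
    [1, 1, 0, 0],  # lnc_0
    [1, 0, 0, 0],  # lnc_1
    [0, 0, 1, 1],  # lnc_2
]
kernel = gip_kernel(profiles)

# Start the walk from lnc_0, treated here as the known disease lincRNA.
scores = random_walk_with_restart(kernel, seeds={0})
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking, [round(s, 4) for s in scores])
```

Candidates are ranked by their steady-state visiting probability, so a lincRNA whose interaction profile resembles that of the seed scores higher than one with a disjoint profile.
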

3.2 Machine Learning Methods


Machine learning is a useful tool for prioritizing candidate lncRNAs
by training classifiers on features of known disease-related lncRNAs
and unknown lncRNAs. Supervised machine learning prioritizes
candidate lncRNAs based on the differences in biological features
between disease-related lncRNAs and unknown lncRNAs
[52–55].
Zhao et al. [56] proposed a computational model for identifying
cancer-related lncRNAs by integrating genome, regulome,
and transcriptome data. A naive Bayesian classifier was employed
to classify lncRNAs, and the Database for Annotation, Visualization
and Integrated Discovery (DAVID) was used for enrichment analysis.
The results showed that integrating multi-omic data can improve the
performance of cancer-related lncRNA prediction. In addition,
they predicted 707 potential cancer-related lncRNAs and found
that these lncRNAs tend to exhibit significant differential
expression and differential DNA methylation in multiple cancer types,
as well as prognostic effects in prostate cancer.
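
To make the classification step concrete, the following is a minimal sketch of a Gaussian naive Bayes classifier that ranks candidate lncRNAs by their posterior probability of being cancer-related. It is not the model of Zhao et al.: the three features (a conservation score, an expression fold change, and a methylation difference), the toy values, and the function names are all hypothetical, and unknown lncRNAs are simply treated as the negative class.

```python
import math


def fit_gaussian_nb(rows, labels, var_floor=1e-6):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        n_feat = len(members[0])
        means = [sum(r[k] for r in members) / len(members)
                 for k in range(n_feat)]
        variances = [sum((r[k] - means[k]) ** 2 for r in members) / len(members)
                     + var_floor for k in range(n_feat)]
        model[cls] = (math.log(len(members) / len(rows)), means, variances)
    return model


def positive_probability(model, row, positive=1):
    """Posterior P(positive | row), assuming features are independent given the class."""
    log_scores = {}
    for cls, (log_prior, means, variances) in model.items():
        total = log_prior
        for x, m, v in zip(row, means, variances):
            total += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        log_scores[cls] = total
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    return weights[positive] / sum(weights.values())


# Hypothetical training set: label 1 = known cancer-related, 0 = unknown.
train_rows = [
    [0.9, 2.1, 0.8], [0.8, 1.8, 0.7], [0.7, 2.4, 0.9],
    [0.2, 0.3, 0.1], [0.3, 0.1, 0.2], [0.1, 0.4, 0.1],
]
train_labels = [1, 1, 1, 0, 0, 0]
model = fit_gaussian_nb(train_rows, train_labels)

# Hypothetical candidates to prioritize.
candidates = {"lnc_A": [0.85, 2.0, 0.75], "lnc_B": [0.15, 0.2, 0.15]}
probs = {name: positive_probability(model, row)
         for name, row in candidates.items()}
ranking = sorted(probs, key=probs.get, reverse=True)
print(ranking)
```

Each data type contributes one or more features, and the independence assumption lets their evidence combine by adding log-likelihoods, which is one simple way that multi-omic integration can be realised in a classifier.
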
Considering the imbalance between known and unknown
lncRNA-disease interactions, Lan et al. [57] presented a
positive-unlabeled (PU) learning method, LDAP, for discovering
lncRNA-disease associations based on multiple data resources. Two
lncRNA similarity and five disease similarity measures were employed
to calculate similarities between lncRNAs and between diseases,
respectively. They used the geometric mean of the similarity matrices
to fuse the lncRNA similarities and the disease similarities,
respectively. A bagging SVM is employed to identify potential
lncRNA-disease associations. Figure 2 shows the flowchart of LDAP.
Finally, this method is implemented as a web server
(http://bioinformatics.csu.edu.cn/ldap) for new lncRNA-disease
prediction. LDAP takes input lncRNA sequences in FASTA format, either
as a pasted sequence or as a file with multiple sequences
(size limit < 50 kb). The user can paste a sequence in FASTA format
into the textbox (1) and click the “submit” button (2) to submit a
single sequence. Figure 3 shows the usage guideline of LDAP for
pasting a sequence. When the user

212 Wei Lan et al.
