learning methods, random forests with optimized parameter are
one of the best models at classifying cancer samples at gene expres-
sion microarray data [104], also efficient at microbiome data.
3.2.3 Deep Learning in
Precision Medicine Study
Deep learning-based technologies have been successfully applied to
learn the hidden representations of data with multiple levels of
abstraction, which achieve great improvements in the conventional
machine learning application fields, especially in domains such as
drug discovery [105], regulatory genomics [106], computational
biology [107], bioinformatics [108], human healthcare [109], and
so on.
As the traditional data sources of biology, the genetic sequences
can provide a large number of samples to feed the deep learning
models [110]. For DNA sequences, a hybrid architecture combin-
ing a pre-trained, deep neural network and a hidden Markov model
(DNN-HMM) has been built for the de novo identification of
replication domains on DNA [111]; an open-source package Basset
with deep convolutional neural networks is developed to learn the
functional activity of DNA sequences from genomic data, e.g.,
DNase-seq, especially to annotate and interpret the noncoding
genome [112]; and a deep learning-based hybrid architecture,
BiRen, can predict enhancers using the DNA sequence alone
[113]. For RNA sequences, a general and flexible deep learning
framework for modeling structural binding preferences and pre-
dicting binding sites of RBPs takes (predicted) RNA tertiary struc-
tural information into account for the first time [114]; and DanQ, a
hybrid convolutional and bidirectional long short-term memory
recurrent neural network framework, is constructed to predict
noncoding function de novo from sequence by learning regulatory
“grammar” from the long-term dependencies between the
sequence motifs [115]. Besides, for the protein sequences, a new
deep learning method that predicts protein contact maps by inte-
grating both evolutionary coupling and sequence conservation
information is designed as an ultra-deep neural network to model
contact occurrence patterns and complex sequence-structure rela-
tionship and has shown better quality than conventional template-
based models [116]; and a computational program DeepConPred
employed an effective scheme of two novel deep learning-based
methods to identify optimal and important features for long-range
residue contact prediction [117].
Recently, the precision medicine is developing rapidly, and
many biological and biomedical images provide a new opportunity
to introduce deep learning for enhancing the clinical practices
[118]. Based on the high-content screening (HCS) technologies,
large-scale imaging experiments are capable to study cell biology
and for drug screening, and an approach combining deep convolu-
tional neural networks (CNNs) with multiple-instance learning
(MIL) is used to classify and segment such hundreds of thousands
196 Xiang-tian Yu et al.