3 Methods
3.1 Variants of Machine Learning Methods in Biological and Biomedical Study
To apply machine learning methods in biological studies, different
variants have been proposed to address the particular problems that
arise in biological questions. According to the biological background,
there are several application scenarios, e.g., sequence analysis,
image analysis, interaction analysis, disease analysis, and annotation
analysis. The following introduction illustrates the broad strategy of
applying machine learning in each scenario, rather than giving a
conventional survey of machine learning models.
3.1.1 Application in Sequence-Focused Analysis
Based on general-purpose machine learning algorithms and libraries,
many software packages have been designed to learn
genotype-to-phenotype predictive models from sequences with known
phenotypes [46, 47], which can computationally assess the genetic
bases of phenotypes [36]. The first task is to recognize regulatory
elements in biological sequences. To select appropriate features of
promoters that distinguish them from non-promoters, nonlinear time
series descriptors are used along with nonlinear machine learning
algorithms, such as the support vector machine (SVM), to discriminate
between promoter and non-promoter regions [48]. A machine learning
approach, MutPred Splice, has been developed to recognize coding
region substitutions that disrupt pre-mRNA splicing, and it can be
applied to detect splice site loss [49]. A classifier trained on an
enhancer set identifies related enhancers based on the presence or
absence of known and putative TF binding sites, combining machine
learning with evolutionary sequence analysis [50]. An ortholog
prediction meta-tool, WORMHOLE, integrates distinct ortholog
prediction algorithms to identify novel least diverged orthologs
(LDOs) with high confidence [51]. The second task is to predict
important proteins from sequences, because of the difficulty of
wet-lab experiments: the top-performing methods based on machine
learning approaches have been built to tackle both the detection of
transmembrane beta-barrels in sets of
Table 2 (continued)

Methods | Description | URL
Metaml | The software framework can analyze microbiome profiles and metadata for thousands of samples [44] | http://segatalab.cibio.unitn.it/tools/metaml
Hierarchical boosting | A machine learning classification framework can combine the selection tests to detect the features of polymorphism in hard sweeping with controls on population-specific demography [45] | http://hsb.upf.edu/
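
As a rough illustration of the promoter versus non-promoter discrimination described above [48], the following Python sketch trains a classifier on sequence-derived features. It is not the method of [48]: the nonlinear time series descriptors and nonlinear SVM are swapped here for dinucleotide frequencies and a linear SVM fitted by Pegasos-style subgradient descent, chosen only to keep the example self-contained. The sequences are synthetic (AT-rich versus GC-rich strings standing in for the two classes), and all names (kmer_features, train_linear_svm, predict, toy_sequence) are hypothetical.

```python
import random
from itertools import product

ALPHABET = "ACGT"


def kmer_features(seq, k=2):
    """Return normalized k-mer frequencies of seq as a fixed-length list."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0.0] * len(kmers)
    total = len(seq) - k + 1
    for i in range(total):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing ambiguous bases such as N
            counts[j] += 1.0
    return [c / total for c in counts] if total > 0 else counts


def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Fit a linear SVM (hinge loss, L2 penalty) by Pegasos-style updates."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularizer)
            if margin < 1.0:  # sample violates the margin: move toward it
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    """Return +1 (promoter-like) or -1 (non-promoter-like)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1


def toy_sequence(rng, weights, length=60):
    """Draw a synthetic sequence with the given base weights (A, C, G, T)."""
    return "".join(rng.choices(ALPHABET, weights=weights, k=length))


# Synthetic stand-in data (an assumption, not real promoter sequences).
rng = random.Random(42)
positives = [toy_sequence(rng, [0.4, 0.1, 0.1, 0.4]) for _ in range(50)]
negatives = [toy_sequence(rng, [0.1, 0.4, 0.4, 0.1]) for _ in range(50)]

X = [kmer_features(s) for s in positives + negatives]
y = [1] * len(positives) + [-1] * len(negatives)

w = train_linear_svm(X, y)
accuracy = sum(predict(w, x) == label for x, label in zip(X, y)) / len(y)
print("training accuracy:", accuracy)
```

On real data, the feature extractor would be replaced by descriptors suited to the task, and accuracy would be estimated on held-out sequences rather than on the training set.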