Computational Systems Biology Methods and Protocols.7z

of microscopy images [119]. To overcome the pitfalls associated with conventional machine learning classifiers, a deep convolutional neural network (DeepLoc) was developed to analyze yeast cell images for automated classification of protein subcellular localization [120]. A deep neural network has also been applied to prospectively predict lineage choice in differentiating primary hematopoietic progenitors before conventional molecular markers are observable, using image patches from bright-field microscopy and cellular movement [121]. Notably, a deep learning algorithm that primarily used surface area information from magnetic resonance imaging of the brains of younger individuals was able to predict a later diagnosis of autism in high-risk children [122]; and a single CNN, trained end to end directly from images, using only pixels and disease labels as inputs, can classify skin cancer with a level of competence comparable to that of dermatologists [123]. Although deep learning has shown promising potential for analyzing omics data [124], the “small-sample high-dimension” character of biological high-throughput data remains a major challenge (see Note 3), and the “black box” nature of deep learning and other machine learning methods often hides much readable information that would be useful for biological and biomedical research. Thus, it would be important to use multiple data resources to consistently improve collective health [125] in a discriminative and interpretative manner.

4 Notes

In summary, machine learning plays an important role in current biological and biomedical research. In particular, these advanced computational technologies can efficiently analyze big biological data. However, unlike conventional big social data, big omics data are typically “small-sample high-dimension”, which causes serious application problems and also introduces new challenges.
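As a hedged illustration of why the “small-sample high-dimension” setting is difficult, the following minimal Python sketch generates pure-noise features for a handful of samples and counts how many of them separate two classes perfectly by chance. The sample size, the feature count, the labels, and the helper name perfectly_separates are hypothetical choices made for illustration; they are not taken from any cited study.

```python
import random


def perfectly_separates(values, labels):
    """Return True if a single threshold on `values` splits the two classes exactly."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return min(pos) > max(neg) or max(pos) < min(neg)


rng = random.Random(0)            # fixed seed so the sketch is reproducible
labels = [0, 0, 0, 1, 1, 1]       # assumed toy cohort: 3 controls, 3 cases
n_features = 2000                 # assumed number of measured features

spurious = 0
for _ in range(n_features):
    # Each feature is pure Gaussian noise, unrelated to the labels.
    feature = [rng.gauss(0.0, 1.0) for _ in labels]
    if perfectly_separates(feature, labels):
        spurious += 1

fraction_spurious = spurious / n_features
print(f"{spurious} of {n_features} noise features separate the classes perfectly")
```

With three samples per class, each noise feature has a 2 × (3! 3! / 6!) = 10% chance of separating the classes perfectly, so a panel of thousands of features is expected to contain many spurious “markers” even when no real signal exists.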

The sample imbalance problem is commonly discussed in machine
learning modeling; available solutions include resampling,
one-class models, and anomaly detection. In big biological data,
however, an “extreme imbalance” problem exists, for example with
rare mutations or rare diseases, for which it is hard to obtain
enough positive samples. Thus, methods that integrate prior
knowledge are required to provide transfer learning approaches that
borrow (combine) multiple sources of data to assist single-sample
analysis.
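Of the solutions named above, resampling is the simplest to sketch. The following minimal Python example shows random oversampling, in which minority-class samples are duplicated at random until every class is as frequent as the largest one. The function name random_oversample and the toy data are assumptions for illustration only; in the extreme-imbalance case described above, duplicating very few positives adds no new information, which is why prior knowledge and additional data sources are needed.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equally frequent."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())          # size of the largest class
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original samples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw with replacement to fill the gap up to the target size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Hypothetical toy data: 8 negative samples and 2 positive samples.
X = [[0.1 * i, 1.0] for i in range(8)] + [[5.0, 2.0], [5.5, 2.1]]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
print(balanced_counts)
```

Oversampling should be applied to the training split only, so that duplicated samples do not leak into the data used for evaluation.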

A large number of machine learning models are “black boxes,”
which is acceptable for many social applications. However, in
biological fields, the molecular mechanism underlying any

Revisit of Machine Learning Supported Biological and Biomedical Studies 197
