Computational Systems Biology Methods and Protocols.7z

changes in aromatic ring substituents could reduce the potency for hERG binding [105]. This conclusion is consistent with the observations of Braga et al. based on 4980 compounds, which indicated that removing carbons, changing the electronic environment around the basic nitrogen, and adding a hydroxyl group could reduce the potency of a compound inhibiting hERG [106]. A number of STR models have been developed with multiple machine learning methods, such as kNN, ANN, SVM, and RF, for the prediction of hERG blockage [98]. The first STR model was published by Roche et al., in which three classes were defined by the cutoffs IC50 = 1 μM and IC50 = 10 μM [107]. PLS, self-organizing maps, principal component analysis, and supervised neural networks were adopted to build classification models. Among them, the model using supervised neural networks showed the best performance, predicting 93% of nonblockers and 71% of blockers correctly. Li et al. docked 495 compounds in a homology model of hERG based on the KvAP template and calculated pharmacophore-based GRIND descriptors, including hydrophobic interaction, hydrogen bond acceptor and donor, and molecular shape descriptors [108]. The descriptors were then used in an SVM classifier to establish classification models at thresholds of 1, 5, 10, 20, 30, and 40 μM, respectively. The model was tested on an external set of 66 compounds and a large data set containing 1948 compounds and achieved accuracy values of 72% and 73%, respectively. Wang et al. used NB and recursive partitioning (RP) to establish hERG classification models based on 806 compounds [109]. At a threshold of 1 μM, the Bayesian classifier based on 14 molecular properties and the LCFP_8 fingerprint achieved the highest global accuracy of 91.5% for the training set and 88.3% for the test set.
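The threshold-based labeling that these classification studies rely on can be illustrated with a minimal Python sketch. It is not the code of any cited study. The function names, the class names ("strong", "moderate", "weak"), and the treatment of values that fall exactly on a cutoff are assumptions made here for illustration; the cited papers may define them differently. The sketch shows a three-class assignment with cutoffs at 1 μM and 10 μM, a binary blocker/nonblocker labeling at an arbitrary threshold, and the per-class fraction of correct predictions (the kind of figure reported as "93% of nonblockers and 71% of blockers").

```python
from typing import List, Sequence


def three_class_label(ic50_um: float, low: float = 1.0, high: float = 10.0) -> str:
    """Assign a hypothetical three-class label from an IC50 value in μM.

    Assumption: values below `low` are "strong", values from `low` up to and
    including `high` are "moderate", and values above `high` are "weak".
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    if ic50_um < low:
        return "strong"
    if ic50_um <= high:
        return "moderate"
    return "weak"


def binary_labels(ic50_values: Sequence[float], threshold_um: float) -> List[int]:
    """Label each compound 1 (blocker) if IC50 < threshold, else 0 (nonblocker)."""
    return [1 if value < threshold_um else 0 for value in ic50_values]


def class_recall(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """Fraction of compounds whose true class is `label` that were predicted as `label`."""
    predictions_for_class = [p for t, p in zip(y_true, y_pred) if t == label]
    if not predictions_for_class:
        raise ValueError("no compounds with the requested true label")
    return sum(1 for p in predictions_for_class if p == label) / len(predictions_for_class)


if __name__ == "__main__":
    ic50 = [0.5, 5.0, 50.0]  # made-up IC50 values in μM, for illustration only
    print([three_class_label(v) for v in ic50])
    # The same compounds labeled at several thresholds, as in a multi-threshold study
    for threshold in (1, 5, 10, 20, 30, 40):
        print(threshold, binary_labels(ic50, threshold))
```

Because the same compound can switch from nonblocker to blocker as the threshold rises, accuracy values reported at different thresholds are not directly comparable.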

5 Conclusions

Nowadays, a variety of in silico models for acute toxicity have been established with the aim of saving experimental resources in the early stage of drug development. However, a major breakthrough in prediction accuracy has been difficult to achieve because sufficiently large data sets are lacking. Therefore, most previous prediction models improved performance by limiting model coverage. Future efforts will be devoted to enriching the data sets with diverse structures and a broad activity distribution.

Cancer is one of the leading causes of death, and it is necessary to identify chemical carcinogenicity as early as possible. The efficiency of machine learning models for carcinogenicity depends on reliable and sufficient experimental data. In general, in silico models for nongenotoxic carcinogenicity performed worse than those for genotoxic carcinogenicity. Moreover, global models

258 Jing Lu et al.
