fingerprints and five machine learning methods (SVM, DT, RF,
kNN, and NB) [91]. The best binary and ternary models were
both developed by using MACCS keys and kNN, which yielded
prediction accuracies of 83.91% and 80.46%, respectively.
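To make the fingerprint-plus-kNN idea concrete, the following is a minimal sketch of a k-nearest-neighbor classifier over binary fingerprints, using Tanimoto similarity as the neighbor metric. It is not the implementation used in [91]. The fingerprints are toy sets of on-bit indices rather than real MACCS keys, and the class labels, the value of k, and the helper names `tanimoto` and `knn_predict` are assumptions made for illustration.

```python
from collections import Counter


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_predict(query: frozenset, training: list, k: int = 3) -> str:
    """Return the majority label among the k training fingerprints most similar to query."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set: hypothetical on-bit indices and labels, not real MACCS keys.
training = [
    (frozenset({1, 5, 9, 42}), "positive"),
    (frozenset({1, 5, 9, 77}), "positive"),
    (frozenset({2, 5, 9, 42}), "positive"),
    (frozenset({100, 120, 150}), "negative"),
    (frozenset({100, 121, 150}), "negative"),
]

query = frozenset({1, 5, 9, 43})
prediction = knn_predict(query, training, k=3)
print(prediction)
```

In practice the fingerprints would be computed by a cheminformatics toolkit and the value of k tuned by cross-validation; a ternary model uses the same voting scheme with three labels.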
4 hERG
hERG (the human ether-à-go-go-related gene) encodes a voltage-
dependent potassium channel that mediates a delayed rectifier
potassium current (IKr) in cardiomyocytes. Blockade of the hERG
channel is considered the primary cause of drug-induced
prolongation of the QT interval, which can lead to sudden death in
extreme cases. Several non-antiarrhythmic drugs, such as cisapride
[92], caused deaths through blockade of the hERG channel and were
withdrawn from the market. A variety of compounds spanning a broad
range of therapeutic classes have also been confirmed to block
hERG [93]. Strategies for assessing hERG blockade are therefore
needed at an early stage of the drug discovery process, to avoid
investing in risky lead series.
Several methods have been established to assess the potency with
which compounds block the hERG channel. These include in vitro
methods, such as the rubidium-efflux assay, radioligand binding
assay, fluorescence-based assay, and whole-cell patch-clamp assay,
and in vivo methods, such as electrocardiography (ECG).
ECG and the patch-clamp technique are low-throughput and not
suitable for screening lead compounds in the early phase of drug
development. The rubidium-efflux assay, radioligand binding assay,
and fluorescence-based assay offer high throughput and low cost,
but their results correlate poorly with those of the patch-clamp
assay and ECG. It should be noted that all in vitro tests are
cell-based, so the properties of the cells have an important
impact on the experimental results. For example, the IC50 value
of a compound for hERG blockade may deviate by as much as 100-fold
in Xenopus oocytes [94], which makes such data unreliable.
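The practical weight of a 100-fold deviation is easier to see on the logarithmic pIC50 scale, where it corresponds to a shift of 2 log units. The short sketch below works through that arithmetic for a hypothetical compound measured at 1 µM in one assay system and 100 µM in another. The two IC50 values and the 10 µM blocker cutoff are assumptions chosen for illustration and are not taken from [94].

```python
import math


def pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)


def is_blocker(ic50_molar: float, threshold_molar: float = 10e-6) -> bool:
    """Label a compound a blocker if its IC50 is below the threshold.

    The 10 uM default is an assumed cutoff for illustration only.
    """
    return ic50_molar < threshold_molar


# Hypothetical compound measured in two assay systems, 100-fold apart.
ic50_a = 1e-6    # 1 uM
ic50_b = 100e-6  # 100 uM

shift = pic50(ic50_a) - pic50(ic50_b)
print(f"pIC50 shift: {shift:.1f} log units")
print("blocker labels:", is_blocker(ic50_a), is_blocker(ic50_b))
```

Under these assumed values the same compound falls on opposite sides of the cutoff depending on the assay, which is why labels derived from mixed assay sources should be treated with caution when building classification models.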
Compared with in vitro and in vivo experiments, computational
models require less time and expense. Hundreds of in silico
models have been established, which can be divided into three
categories: homology modeling, QSTR models, and STR models [92].
4.1 Homology Modeling of hERG
Homology modeling is a comparative modeling procedure that
constructs a three-dimensional model of a protein from its
sequence, based on the structures of homologous proteins.
Homology-derived models, combined with docking and molecular
dynamics simulation, can be used to calculate the binding
affinities of ligands and to investigate their biochemical
mechanisms.