Computational Systems Biology Methods and Protocols.7z

approximately 80,000 drugs and drug-like compounds [4]. The Accelrys Toxicity database collects six types of toxicological data for more than 0.17 million compounds from the open scientific literature, including acute toxicity, mutagenicity, tumorigenicity, skin and eye irritation, reproductive effects, and multiple dose effects [5]. In contrast, most of the open databases for chemical toxicities, such as TOXNET [6] and ISSCAN [7, 8], are much smaller. The number of possible compounds is estimated to be as large as 10^60 [9]. This poses a large challenge for evaluating compound toxicity by experimental screening, which requires physical samples of each compound and multiple biochemical assays. Therefore, there is particular interest in developing quick and effective in silico models that are based on the information available for recorded compounds and that supplement in vivo and in vitro testing in predicting the toxicities of new compounds. For example, computational chemistry tools can take the chemical structures and experimental toxicities of compounds as input data, calculate molecular descriptors, and build prediction models using machine learning methods. In this review, we discuss some illustrative models that apply machine learning methods to a series of toxicity end points, including acute toxicity, carcinogenicity, and inhibition of the human ether-a-go-go-related gene ion channel (hERG).
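The workflow described above (descriptors plus measured toxicities in, a prediction model out) can be sketched in a few lines. This is only an illustration under stated assumptions: the descriptor values and toxicity values are made up, the descriptors are assumed to be pre-scaled to a common range, and k-nearest-neighbour averaging stands in for whichever machine learning method a real study would use. The names `TRAINING` and `predict_knn` are hypothetical.

```python
from math import dist

# Hypothetical training set: (descriptor vector, measured toxicity).
# The two descriptors are made-up, pre-scaled stand-ins for quantities
# such as lipophilicity or molecular size; the toxicity values are also
# made up and do not come from any database.
TRAINING = [
    ((0.10, 0.20), 1.0),
    ((0.15, 0.25), 1.2),
    ((0.80, 0.90), 3.0),
    ((0.85, 0.80), 3.4),
]


def predict_knn(query, training=TRAINING, k=2):
    """Predict toxicity as the mean over the k nearest training compounds.

    Nearness is plain Euclidean distance in descriptor space.
    """
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the training set size")
    nearest = sorted(training, key=lambda row: dist(row[0], query))[:k]
    return sum(tox for _, tox in nearest) / k


if __name__ == "__main__":
    # A new compound close to the first two training compounds.
    print(predict_knn((0.12, 0.22)))
    # A new compound close to the last two training compounds.
    print(predict_knn((0.82, 0.85)))
```

In practice the descriptors would be computed from the chemical structures by a cheminformatics toolkit, and the model would be validated on compounds held out from training.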

2 Acute Toxicity

Estimation of acute toxicity is one of the most common tasks in safety assessment during drug R&D. Acute toxicity refers to the adverse changes occurring immediately or within a short time after a single dose of a compound or multiple doses given within 24 h [10]. The median lethal dose (LD50) or median lethal concentration (LC50) is a common criterion for evaluating the acute toxicity of compounds in multiple species. The US EPA defined the toxicity categories based on LD50 or LC50 in 2014 (Table 1) [11]. Compounds in Category I are considered highly toxic and fatal if swallowed or inhaled. Category II means moderately toxic, Category III indicates slightly toxic, and Category IV is practically nontoxic [12]. Because of ethical concerns about the use of animals, alternatives such as in silico models are strongly recommended by the FDA, NIH, and EMEA [13–16]. Some published models built by machine learning methods for acute toxicity are discussed in detail below.
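As a small illustration of how such categories are applied, the sketch below maps an oral LD50 value to a category label. Table 1 is not reproduced in this excerpt, so the cutoffs used here (50, 500, and 5000 mg/kg) are an assumption based on the commonly cited EPA oral LD50 bounds and should be checked against Table 1. The function name `epa_oral_category` is hypothetical.

```python
# Assumed upper bounds (mg/kg, oral LD50) for Categories I to III.
# These cutoffs are an assumption and should be verified against Table 1;
# any value above the last bound falls into Category IV.
ORAL_LD50_BOUNDS = [(50.0, "I"), (500.0, "II"), (5000.0, "III")]


def epa_oral_category(ld50_mg_per_kg):
    """Return the toxicity category label for an oral LD50 in mg/kg."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    for upper, category in ORAL_LD50_BOUNDS:
        if ld50_mg_per_kg <= upper:
            return category
    return "IV"


if __name__ == "__main__":
    for value in (10, 300, 2000, 6000):
        print(value, epa_oral_category(value))
```

A lower LD50 means less compound is needed to kill half of the test animals, which is why the smallest values map to Category I (highly toxic).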

2.1 Quantitative Structure-Toxicity Relationship (QSTR) Models for Acute Toxicity

For congeneric compounds that share a scaffold or mechanism, the relationship between toxicity values and structural descriptors is approximately linear in most cases, so simple machine learning models, such as multiple linear regression (MLR) [17–19] and partial least squares regression (PLSR) [20, 21], can generally achieve good performance [22]. Furthermore, such linear regression

248 Jing Lu et al.

