Computational Drug Discovery and Design

The generation of a conformational ensemble involves two steps. The first is to run an MD simulation on the target structure to generate a trajectory. This atomic trajectory can be either a single extended trajectory from one simulation or multiple trajectories concatenated into a single long trajectory [23]. The two approaches can lead to different outcomes, and using multiple simulations appears to provide better conformational sampling than a single MD trajectory [31]. The second step is to cluster the long trajectory obtained from the simulation. This clustering step not only identifies representative conformations of the target protein but also reduces the computational time of the subsequent binding site evaluation steps [38]. The most common way to cluster protein conformations is with RMSD-based methods. Other methods include principal component analysis (PCA) [46], non-negative matrix factorization (NMF) [47], and independent component analysis (ICA) [46, 48].
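The two steps described above can be illustrated with a minimal sketch. It is not the protocol of the cited studies: the coordinates are synthetic stand-ins for MD frames, the function names (`kabsch_rmsd`, `pairwise_rmsd`, `gromos_cluster`) and the cutoff value are hypothetical, and the neighbor-count scheme (in the style of the GROMOS clustering algorithm) is only one of several RMSD-based choices. Two short "trajectories" are concatenated into one long trajectory, a pairwise RMSD matrix is computed after optimal superposition, and each cluster center is taken as a representative conformation.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (n_atoms, 3) coordinate sets after optimal superposition."""
    P = P - P.mean(axis=0)          # remove translation
    Q = Q - Q.mean(axis=0)
    U, S, Vt = np.linalg.svd(P.T @ Q)
    # Flip the smallest singular value if the best fit would be a reflection.
    if np.linalg.det(U @ Vt) < 0:
        S[-1] = -S[-1]
    msd = (np.sum(P**2) + np.sum(Q**2) - 2.0 * S.sum()) / len(P)
    return float(np.sqrt(max(msd, 0.0)))  # clamp tiny negative round-off

def pairwise_rmsd(frames):
    """Symmetric matrix of superposed RMSD values between all frames."""
    n = len(frames)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = kabsch_rmsd(frames[i], frames[j])
    return D

def gromos_cluster(D, cutoff):
    """Neighbor-count clustering: repeatedly take the frame with the most
    unassigned neighbors within `cutoff` as a cluster center."""
    unassigned = np.ones(len(D), dtype=bool)
    clusters = []
    while unassigned.any():
        neighbors = (D <= cutoff) & unassigned[:, None] & unassigned[None, :]
        center = int(np.argmax(neighbors.sum(axis=1)))
        members = np.flatnonzero(neighbors[center])
        clusters.append((center, members.tolist()))
        unassigned[members] = False
    return clusters

# Synthetic stand-in for MD output (assumption): two 20-atom reference
# states, each sampled by a separate 30-frame "simulation" with small noise.
rng = np.random.default_rng(0)
state_a = rng.normal(size=(20, 3))
state_b = rng.normal(size=(20, 3))
traj_1 = state_a + 0.05 * rng.normal(size=(30, 20, 3))
traj_2 = state_b + 0.05 * rng.normal(size=(30, 20, 3))

# Step 1: combine the separate trajectories into one long trajectory.
frames = np.concatenate([traj_1, traj_2])

# Step 2: cluster the combined trajectory; cutoff is in the same
# (arbitrary) length units as the coordinates.
D = pairwise_rmsd(frames)
clusters = gromos_cluster(D, cutoff=0.3)
for center, members in clusters:
    print(f"representative frame {center}: {len(members)} members")
```

In practice the frames would come from a trajectory reader rather than a random generator, and the cutoff would be chosen by inspecting the RMSD distribution of the system at hand.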

2.5 The Sampling Problem

In many cases, depending on the plasticity of the target, a binding site that is not obviously present in a crystal structure may appear after a short simulation time. For example, Eyrisch and Helms identified transient pockets on the protein surface after 10 ns of MD simulation [20]. In most cases, however, conventional MD simulations cannot access these sites, and a significantly longer MD simulation may be required to sample the conformational space of the target (see Notes 3 and 4). This is mainly because the protein structure becomes trapped in a local minimum of the energy surface and cannot cross the high-energy barriers separating these minima (Fig. 5) [49]. In this case, one has two options: running a very long MD simulation (on the scale of hundreds of nanoseconds) or using other MD simulation methods designed to address this sampling problem, such as Replica Exchange MD (REMD) [50], accelerated Molecular Dynamics (aMD) [49, 51], Free Energy Perturbation (FEP), metadynamics-based methods, and Steered Molecular Dynamics (SMD) [21]. These methods can provide significant improvement over conventional MD, although they require substantial computational resources. Good sampling of the protein structure is important for identifying binding sites. If the sampling does not sufficiently explore the possible target conformations, one may miss a very valuable binding site on the target. Many reviews have focused on these different MD methods, and interested readers are directed to these references [21, 22, 52–54].
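To make the idea behind one of these methods concrete, the sketch below shows the standard Metropolis criterion used in temperature REMD to decide whether two replicas swap configurations. It is an illustration only and not taken from the cited works: the function name `exchange_probability`, the energies, and the temperatures are hypothetical, and energies are assumed to be potential energies in kcal/mol. A swap is always accepted when it hands the lower-energy configuration to the colder replica. Otherwise it is accepted with probability exp[(β_i − β_j)(E_i − E_j)], where β = 1/(k_B T). This is how configurations that crossed a barrier at high temperature can reach the low-temperature replica.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis probability of swapping the configurations of replica i
    (potential energy e_i, temperature t_i) and replica j (e_j, t_j)."""
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

# Hypothetical neighboring replicas at 300 K and 310 K.
# Cold replica holds the higher-energy configuration: swap always accepted.
p_favorable = exchange_probability(-100.0, -105.0, 300.0, 310.0)
# Cold replica already holds the lower-energy configuration: swap is
# accepted only with some probability below 1.
p_unfavorable = exchange_probability(-105.0, -100.0, 300.0, 310.0)
print(p_favorable, p_unfavorable)
```

Because the acceptance probability decays with both the temperature gap and the energy gap, replica temperatures are usually spaced closely enough that neighboring replicas exchange at a reasonable rate, which is one reason these methods demand considerable computational resources.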

2.6 Cosolvent Molecular Dynamics Simulations

Flooding a protein structure with different probes during MD simulations has emerged as a new way to detect binding sites while taking their flexibility into account. A promising example of such methods is the cosolvent MD simulation approach. In this method,

94 Tianhua Feng and Khaled Barakat
