136 CATALYZING INQUIRY
5.4.1.3 Molecular Docking
Using a simple, uniform representation of molecular surfaces that requires minimal parameterization,
Jain^39 has constructed functions that are effective for scoring protein-ligand interactions,
quantitatively comparing small molecules, and comparing proteins in a manner that does not depend on
the protein backbone. These methods rely on computational approaches that are rooted in an
understanding of the physics of molecular interactions, but whose functional forms do not resemble those
used in physics-based approaches. That is, the problem can be treated as a pure computer science
problem, solved using combinations of scoring and search or optimization techniques parameterized
with domain knowledge. The approach is as follows:
• Molecules are approximated as collections of spheres with fixed radii: H = 1.2; C = 1.6; N = 1.5; O =
1.4; S = 1.95; P = 1.9; F = 1.35; Cl = 1.8; Br = 1.95; I = 2.15.
• A labeling of the features of polar atoms is superimposed on the molecular representation:
polarity, charge, and directional preference (Figure 5.2, Subfigures A and B).
• A scoring function is derived that, given a protein and a ligand in some relative alignment, yields
a prediction of the energy of interaction.
  - The function is parameterized in terms of the pairwise distances between molecular surfaces.
  - The dominant terms are a hydrophobic term that characterizes interactions between nonpolar
    atoms and a polar term that captures complementary polar contacts with proper directionality.
  - The parameters of the function were derived from empirical binding data and 34 experimentally
    determined protein-ligand complexes.
  - The scoring function is described in Figure 5.2, Subfigure C. The hydrophobic term peaks at
    approximately 0.1 unit with a slight surface interpenetration. The polar term for an ideal
    hydrogen bond peaks at 1.25 units, and a charged interaction (tertiary amine proton (+1.0) to a
    charged carboxylate (–0.5)) peaks at about 2.3 units. Note that this scoring function looks
    nothing like a force field derived from molecular mechanics.
  - Figure 5.2, Subfigure D compares eight docking methods on screening efficiency using thymidine
    kinase as a docking target. For the test, 10 known ligands and 990 random ligands were used.
    Particularly at low false-positive rates (low database coverage), the scoring function approach
    shows substantial improvements over the other methods.
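The sphere representation and the distance-based hydrophobic term described above can be illustrated with a minimal sketch. It is not the published scoring function. The radii are those quoted in the text; everything else is an assumption made for illustration: the surface distance is taken to be the center-to-center distance minus the sum of the two radii, the quoted value of 0.1 is read as the peak height of the hydrophobic term, the term is given a Gaussian shape, and the peak position and width are hypothetical. The names `Atom`, `surface_distance`, `hydrophobic_term`, and `hydrophobic_score` are likewise hypothetical, and the polar term is omitted because it depends on directionality.

```python
import math
from dataclasses import dataclass

# Fixed sphere radii quoted in the text (units as given in the source).
RADII = {"H": 1.2, "C": 1.6, "N": 1.5, "O": 1.4, "S": 1.95,
         "P": 1.9, "F": 1.35, "Cl": 1.8, "Br": 1.95, "I": 2.15}


@dataclass(frozen=True)
class Atom:
    element: str          # key into RADII
    xyz: tuple            # (x, y, z) sphere center
    polar: bool = False   # simplified stand-in for the polar-atom labeling


def surface_distance(a: Atom, b: Atom) -> float:
    """Assumed definition: center distance minus the sum of the radii.

    Negative values mean the two spheres interpenetrate.
    """
    return math.dist(a.xyz, b.xyz) - (RADII[a.element] + RADII[b.element])


# Shape parameters of the toy hydrophobic term.
HYDROPHOBIC_HEIGHT = 0.1    # value quoted in the text, read here as peak height
HYDROPHOBIC_PEAK_AT = -0.1  # hypothetical: slight surface interpenetration
HYDROPHOBIC_WIDTH = 0.5     # hypothetical width of the bump


def hydrophobic_term(d: float) -> float:
    """Gaussian bump in the surface distance d (illustrative form only)."""
    z = (d - HYDROPHOBIC_PEAK_AT) / HYDROPHOBIC_WIDTH
    return HYDROPHOBIC_HEIGHT * math.exp(-z * z)


def hydrophobic_score(protein, ligand) -> float:
    """Sum the toy hydrophobic term over all nonpolar protein-ligand atom pairs."""
    return sum(
        hydrophobic_term(surface_distance(p, l))
        for p in protein
        for l in ligand
        if not p.polar and not l.polar
    )
```

For example, two carbon atoms whose centers are 3.1 units apart have a surface distance of about -0.1 under this definition, which is where the toy term reaches its maximum. A docking procedure would pair such a score with a search over ligand poses, as the text describes.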
5.4.1.4 Computational Analysis and Recognition of Functional and
Structural Sites in Protein Structures^40
Structural genomics initiatives are producing a great increase in protein three-dimensional
structures determined by X-ray and nuclear magnetic resonance technologies as well as those predicted by
computational methods. A critical next step is to study the relationships between protein structures and
functions. Studying structures individually entails the danger of identifying idiosyncratic rather than
conserved features and the risk of missing important relationships that would be revealed by statisti-
(^39) See A.N. Jain, “Scoring Noncovalent Protein Ligand Interactions: A Continuous Differentiable Function Tuned to Compute
Binding Affinities,” Journal of Computer-Aided Molecular Design 10(5):427-440, 1996; W. Welch, J. Ruppert, and A.N. Jain,
“Hammerhead: Fast, Fully Automated Docking of Flexible Ligands to Protein Binding Sites,” Chemistry & Biology 3(6):449-462, 1996;
J. Ruppert, W. Welch, and A.N. Jain, “Automatic Identification and Representation of Protein Binding Sites for Molecular
Docking,” Protein Science 6(3):524-533, 1997; A.N. Jain, “Surflex: Fully Automatic Flexible Molecular Docking Using a Molecular
Similarity-based Search Engine,” Journal of Medicinal Chemistry 46(4):499-511, 2003; A.N. Jain, “Ligand-Based Structural
Hypotheses for Virtual Screening,” Journal of Medicinal Chemistry 47(4):947-961, 2004.
(^40) Section 5.4.1.4 is based on material provided by Liping Wei, Nexus Genomics, Inc., and Russ Altman, Stanford University,
personal communication, December 4, 2003.