104 CATALYZING INQUIRY
Protein structures predicted at high resolution can help characterize the biological functions of
proteins. Biotechnology companies hope to accelerate their efforts to discover new drugs that
interact with proteins by using structure-based drug design technologies. By combining computational
and combinatorial chemistry, researchers expect to find more viable leads. Algorithms build molecular
structures de novo to optimize interactions within a protein’s active sites. The use of so-called
virtual screening in combination with studies of co-crystallized drugs and proteins could be a powerful
tool for drug development.
A number of tools for protein structure prediction have been developed, and their progress has been
evaluated by the Critical Assessment of Protein Structure Prediction (CASP) experiment, held every
two years since 1994.^137 In a CASP experiment, the amino acid sequences of
proteins whose experimentally determined structures have not yet been released are published, and
computational research groups are then invited to predict structures of these target sequences using
their methods and any other publicly available information (e.g., known structures that exist in the
Protein Data Bank (PDB), the data repository for protein structures). The methods used by the groups

TABLE 4.3 Algorithms, Databases, Analytical Systems, and Scientific Research Enabled by the PIR Resource

Resource    Topic                                             Reference
Algorithm   Benchmarking for sequence similarity search       Pearson, J. Mol. Biol. 276:71-84, 1998
            statistics
            PANDORA keyword-based analysis of proteins        Kaplan, Nucleic Acids Research 31:5617-5626, 2003
            Computing motif correlations for structure        Horng et al., J. Comp. Chem. 24(16):2032-2043, 2003
            prediction
Database    NESbase database of nuclear export signals        la Cour et al., Nucleic Acids Research 31(1):393-396, 2003
            TMPDB database of transmembrane topologies        Ikeda et al., Nucleic Acids Research 31:406-409, 2003
            SDAP database and tools for allergenic proteins   Ivanciuc et al., Nucleic Acids Research 31:359-362, 2003
System      SPINE 2 system for collaborative structural       Goh et al., Nucleic Acids Research 31:2833-2838, 2003
            proteomics
            ERGO™ genome analysis and discovery system        Overbeek et al., Nucleic Acids Research 31(1):164-171, 2003
            Automated annotation pipeline and cDNA            Kasukawa et al., Genome Res. 13(6B):1542-1551, 2003
            annotation system
            Systers, GeneNest, SpliceNest from genome to      Krause et al., Nucleic Acids Research 30(1):299-300, 2002
            protein
Research    Intermediate filament proteins during             Prasad et al., Int. J. Oncol. 14(3):563-570, 1999
            carcinogenesis or apoptosis
            Conserved pathway by global protein network       Kelley et al., PNAS 100(20):11394-11399, 2003
            alignment
            Membrane targeting of phospholipase C             Singh and Murray, Protein Sci. 12:1934-1953, 2003
            pleckstrin
            Analysis of human and mouse cDNA sequences        Strausberg et al., PNAS 99(26):16899-16903, 2002
            A novel Schistosoma mansoni G protein-coupled     Hamdan et al., Mol. Biochem. Parasitol. 119(1):75-86, 2002
            receptor
            Proteomics reveals open reading frames (ORFs)     Jungblut et al., Infect. Immunol. 69(9):5905-5907, 2001
            in Mycobacterium tuberculosis

^137 See http://predictioncenter.llnl.gov/.