Catalyzing Inquiry at the Interface of Computing and Biology

104 CATALYZING INQUIRY

Protein structures predicted in high resolution can help characterize the biological functions of
proteins. Biotechnology companies are hoping to accelerate their efforts to discover new drugs that
interact with proteins by using structure-based drug design technologies. By combining computational
and combinatorial chemistry, researchers expect to find more viable leads. Algorithms build molecular
structures de novo to optimize interactions within a protein’s active site. The use of so-called
virtual screening in combination with studies of co-crystallized drugs and proteins could be a powerful
tool for drug development.
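As a rough illustration of the virtual screening idea, the sketch below ranks a small library of candidate compounds by a score against a target site and keeps the best-scoring ones as leads. Everything in it is hypothetical: the `Candidate` type, the `toy_score` function (a stand-in for a real docking or scoring program), and the descriptor values are invented for illustration and do not come from any actual tool or dataset.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate compound described by two toy descriptors."""
    name: str
    hydrophobicity: float
    h_bond_donors: int


def toy_score(c: Candidate, site_hydrophobicity: float, site_acceptors: int) -> float:
    """Stand-in for a docking score; lower is assumed to mean a better fit."""
    mismatch = abs(c.hydrophobicity - site_hydrophobicity)
    unpaired = abs(c.h_bond_donors - site_acceptors)
    return mismatch + 0.5 * unpaired


def virtual_screen(
    library: List[Candidate],
    score: Callable[[Candidate], float],
    top_k: int,
) -> List[Candidate]:
    """Rank the library by ascending score and keep the top_k candidates."""
    return sorted(library, key=score)[:top_k]


# Invented example library and target-site descriptors.
library = [
    Candidate("cmpd_A", 0.9, 1),
    Candidate("cmpd_B", 0.4, 2),
    Candidate("cmpd_C", 0.5, 4),
]
leads = virtual_screen(library, lambda c: toy_score(c, 0.5, 2), top_k=2)
print([c.name for c in leads])
```

In practice the scoring step would be replaced by a physics-based or empirical docking program, and the surviving leads would then be checked experimentally, for example against co-crystallized drug-protein structures.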
A number of tools for protein structure prediction have been developed, and progress in prediction
by these methods has been evaluated by the Critical Assessment of Protein Structure Prediction (CASP)
experiment held every two years since 1994.^137 In a CASP experiment, the amino acid sequences of
proteins whose experimentally determined structures have not yet been released are published, and
computational research groups are then invited to predict structures of these target sequences using
their methods and any other publicly available information (e.g., known structures that exist in the
Protein Data Bank (PDB), the data repository for protein structures). The methods used by the groups

TABLE 4.3 Algorithms, Databases, Analytical Systems, and Scientific Research Enabled by the PIR Resource

Resource | Topic | Reference

Algorithm | Benchmarking for sequence similarity search statistics | Pearson, J. Mol. Biol. 276:71-84, 1998
Algorithm | PANDORA keyword-based analysis of proteins | Kaplan, Nucleic Acids Research 31:5617-5626, 2003
Algorithm | Computing motif correlations for structure prediction | Horng et al., J. Comp. Chem. 24(16):2032-2043, 2003
Database | NESbase database of nuclear export signals | la Cour et al., Nucleic Acids Research 31(1):393-396, 2003
Database | TMPDB database of transmembrane topologies | Ikeda et al., Nucleic Acids Research 31:406-409, 2003
Database | SDAP database and tools for allergenic proteins | Ivanciuc et al., Nucleic Acids Research 31:359-362, 2003
System | SPINE 2 system for collaborative structural proteomics | Goh et al., Nucleic Acids Research 31:2833-2838, 2003
System | ERGO™ genome analysis and discovery system | Overbeek et al., Nucleic Acids Research 31(1):164-171, 2003
System | Automated annotation pipeline and cDNA annotation system | Kasukawa et al., Genome Res. 13(6B):1542-1551, 2003
System | Systers, GeneNest, SpliceNest from genome to protein | Krause et al., Nucleic Acids Research 30(1):299-300, 2002
Research | Intermediate filament proteins during carcinogenesis or apoptosis | Prasad et al., Int. J. Oncol. 14(3):563-570, 1999
Research | Conserved pathway by global protein network alignment | Kelley et al., PNAS 100(20):11394-11399, 2003
Research | Membrane targeting of phospholipase C pleckstrin | Singh and Murray, Protein Sci. 12:1934-1953, 2003
Research | Analysis of human and mouse cDNA sequences | Strausberg et al., PNAS 99(26):16899-16903, 2002
Research | A novel Schistosoma mansoni G protein-coupled receptor | Hamdan et al., Mol. Biochem. Parasitol. 119(1):75-86, 2002
Research | Proteomics reveals open reading frames (ORFs) in Mycobacterium tuberculosis | Jungblut et al., Infect. Immunol. 69(9):5905-5907, 2001

(^137) See http://predictioncenter.llnl.gov/.
