4.4.9 Determination of Three-dimensional Protein Structure
One central problem of proteomics is protein folding. Folding is one of the most important
cellular processes because it produces the final conformation a protein requires to attain
biological activity. Diseases such as Alzheimer’s disease and bovine spongiform encephalopathy (BSE, or
“Mad Cow” disease) are associated with the improper folding of proteins. For example, in BSE the
protein (called the scrapie prion), which is soluble when it folds properly, becomes insoluble when one
of the intermediates along its folding pathway misfolds and forms aggregates that damage nerve
cells.^136
Because the functional conformation of a protein is so important, many attempts have been made
to predict computationally the three-dimensional structure of a protein from its amino acid
sequence. Although experimental determination of protein structure by X-ray crystallography
and nuclear magnetic resonance yields high-resolution structures, it is slow, labor-intensive,
and expensive, and thus not appropriate for large-scale determination. It also applies only to
already-synthesized or isolated proteins, whereas an algorithm could be used to predict the structure
of a great number of potential proteins.
TABLE 4.2 Protein Family Classification and Integrative Associative Analysis for Functional Annotation

Superfamily Classification Description

A. Functional inference of uncharacterized hypothetical proteins
SF034452  TIM-barrel signal transduction protein
SF004961  Metal-dependent hydrolase
SF005928  Nucleotidyltransferase
SF005933  ATPase with chaperone activity and inactive LON protease domain
SF005211  alpha/beta hydrolase
SF014673  Lipid carrier protein
SF005019  [Ni,Fe]-Hydrogenase-3-type complex, membrane protein EhaA

B. Correction or improvement of genome annotations
SF025624  Ligand-binding protein with an ACT domain
SF005003  Inactive homologue of metal-dependent protease
SF000378  Glycyl radical cofactor protein YfiD
SF000876  Chemotaxis response regulator methylesterase CheB
SF000881  Thioesterase, type II
SF002845  Bifunctional tetrapyrrole methylase and MazG NTPase

C. Enhanced understanding of structure, function, evolutionary relationships
SF005965  Chorismate mutase, AroH class
SF001501  Chorismate mutase, AroQ class, prokaryotic type

NOTE: PIRSF protein family reports detail supporting evidence for both experimentally validated and computationally
predicted annotations.
(^136) See, for example, C.M. Dobson, “Protein Misfolding, Evolution and Disease,” Trends in Biochemical Sciences 24(9):329-332,
1999; C.M. Dobson, “Protein Folding and Its Links with Human Disease,” Biochemical Society Symposia 68:1-26, 2001; C.M. Dobson,
“Protein Folding and Misfolding,” Nature 426(6968):884-890, 2003.