5.4 Structural Databases 113


ficient sequence search protocols and reliable thresholds. The database is
available as a collection of flat files using the fixed-width format, as in sec-
tion 1.1.
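
As an illustration of what reading a fixed-width flat file involves, the following is a minimal sketch in Python. The column layout, the field names, and the sample record are hypothetical and chosen only for this example; they are not the actual layout of any database described here.

```python
from typing import NamedTuple

# Hypothetical layout: (field name, start column, end column),
# 0-based with the end column exclusive. Not a real database layout.
LAYOUT = [
    ("entry_id", 0, 6),
    ("chain", 6, 8),
    ("length", 8, 13),
    ("description", 13, 43),
]


class Record(NamedTuple):
    entry_id: str
    chain: str
    length: int
    description: str


def parse_line(line: str) -> Record:
    """Slice one fixed-width line into fields and strip the padding."""
    fields = {name: line[start:end].strip() for name, start, end in LAYOUT}
    return Record(
        fields["entry_id"],
        fields["chain"],
        int(fields["length"]),
        fields["description"],
    )


def format_line(rec: Record) -> str:
    """Write a record back out using the same hypothetical column widths."""
    return f"{rec.entry_id:<6}{rec.chain:<2}{rec.length:>5}{rec.description:<30}"


if __name__ == "__main__":
    sample = format_line(Record("1abc", "A", 154, "example protein"))
    print(repr(sample))
    print(parse_line(sample))
```

Because every field is located by column position rather than by a delimiter, the layout table is the only place that has to change if the column widths differ.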


FSSP http://www.embl-ebi.ac.uk/dali
The FSSP database and its new supplement, the Dali Domain Dictionary,
present a continuously updated classification of all known 3D protein
structures (Holm et al. 1992; Holm and Sander 1998). FSSP stands for the fold
classification based on structure-structure alignment of proteins. The
classification is derived using an automatic structure alignment program, called
Dali, for the all-against-all comparison of structures in the PDB. From the re-
sulting enumeration of structural neighbors (which form a surprisingly con-
tinuous distribution in fold space) a discrete fold classification is derived in
three steps: (1) sequence-related families are covered by a representative set
of protein chains; (2) protein chains are decomposed into structural domains
based on the recurrence of structural motifs; and (3) folds are defined as tight
clusters of domains in fold space. The database is available as an SQL dump,
using a fixed-width format.
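
The third step, grouping domains into clusters, can be sketched with a toy example. The code below is not the Dali procedure: it assumes a hypothetical table of pairwise similarity scores and a hypothetical threshold, and it simply merges any two domains whose score reaches the threshold (single-linkage grouping). Real fold definitions require tighter clusters than this simplification produces.

```python
from typing import Dict, List, Tuple


def cluster_domains(
    domains: List[str],
    similarity: Dict[Tuple[str, str], float],
    threshold: float,
) -> List[List[str]]:
    """Group domains into connected components of the 'similar enough' graph.

    All inputs are illustrative; the scores and threshold are assumptions.
    """
    parent = {d: d for d in domains}

    def find(d: str) -> str:
        # Follow parent links to the representative, compressing the path.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    # Merge the groups of every pair whose score reaches the threshold.
    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for d in domains:
        groups.setdefault(find(d), []).append(d)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    doms = ["d1", "d2", "d3", "d4"]
    scores = {("d1", "d2"): 9.0, ("d2", "d3"): 1.5, ("d3", "d4"): 8.0}
    print(cluster_domains(doms, scores, threshold=5.0))
```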


SCOP, CATH, and FSSP are structure classification databases that define,
classify, and annotate each domain in the PDB. A systematic comparison of
SCOP, CATH, and FSSP found that approximately two thirds of the protein
chains are common to all three databases (Hadley and Jones 1999).


REBASE rebase.neb.com/rebase/rebase.html
REBASE contains information about restriction enzymes, including their re-
cognition specificities and their sensitivity to DNA methylation (Roberts et al.
2003). There are three major categories of restriction enzymes: type I, type
II, and type III. The type II restriction enzymes are among the most valu-
able tools available to researchers in molecular biology. These enzymes rec-
ognize short DNA sequences (four to eight nucleotides) and cleave at, or
close to, their recognition sites (Pingoud and Jeltsch 2001). Type II enzymes
are widely used not only for molecular cloning and genotyping but also
for molecular diagnostics. REBASE contains comprehensive information on
all types of restriction enzymes, as well as related proteins such as
methyltransferases, homing endonucleases, nicking enzymes, specificity
subunits of the type I enzymes, control proteins, and methyl-directed
restriction enzymes.
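
To make the idea of a recognition site concrete, here is a minimal Python sketch that locates the sites of a type II enzyme in a DNA sequence and cuts the sequence there. It uses EcoRI, which recognizes GAATTC and cleaves after the first G. The function names and the sample sequence are made up for this example, and the sketch makes simplifying assumptions: a linear molecule, only the given strand, and no ambiguity codes in the recognition sequence.

```python
from typing import List


def find_sites(dna: str, site: str) -> List[int]:
    """Return the 0-based start of every occurrence of site (overlaps allowed)."""
    dna, site = dna.upper(), site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def digest(dna: str, site: str, cut_offset: int) -> List[str]:
    """Cut the given strand cut_offset bases into each site; return the fragments."""
    fragments = []
    previous = 0
    for position in find_sites(dna, site):
        cut = position + cut_offset
        fragments.append(dna[previous:cut])
        previous = cut
    fragments.append(dna[previous:])
    return fragments


if __name__ == "__main__":
    sequence = "TTGAATTCAAGGAATTCGG"  # made-up sequence with two EcoRI sites
    print(find_sites(sequence, "GAATTC"))
    print(digest(sequence, "GAATTC", cut_offset=1))
```

Since GAATTC is its own reverse complement, searching one strand is enough for EcoRI; an enzyme with a non-palindromic site would also need a search of the reverse complement.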
The REBASE database is currently available in 39 formats! This extreme
heterogeneity is due to the large number of tools, each of which requires its
own format. Standard formats would help control this diversity.