ON THE NATURE OF BIOLOGICAL DATA
The size of biological objects is often not constant. More importantly, relational databases presume the
existence of well-defined and known relationships between data records, whereas the reality of biological
research is that relationships are imprecisely known, and this imprecision cannot be reduced to the
probabilistic measures of relationship that relational databases can handle.
Jagadish and Olken argue that without specialized life sciences enhancements, commercial relational
database technology is cumbersome for constructing and managing biological databases. They note that
most approximate sequence matching, graph queries on biopathways, and three-dimensional shape
similarity queries have been performed outside of relational data management systems. Moreover, the
relational data model is an inadequate abstraction for representing many kinds of biological data (e.g.,
pedigrees, taxonomies, maps, metabolic networks, food chains). Box 3.1 provides an illustration of how
business database technology can be inadequate.
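To make the point about taxonomies concrete, the following is a minimal sketch of a hierarchy stored in a relational table and queried for a lineage. It is an illustration only and is not drawn from Jagadish and Olken: the `taxon` table, its columns, and the `lineage` helper are hypothetical names, and SQLite (through Python's standard `sqlite3` module) stands in for a commercial relational system. The hierarchy has to be flattened into parent pointers, and a question as simple as "what is the lineage of this taxon?" requires a recursive query rather than a plain join.

```python
import sqlite3

# Hypothetical schema: each taxon row stores a pointer to its parent taxon.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxon (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)"
)
conn.executemany(
    "INSERT INTO taxon VALUES (?, ?, ?)",
    [
        (1, "Mammalia", None),
        (2, "Primates", 1),
        (3, "Hominidae", 2),
        (4, "Homo", 3),
        (5, "Homo sapiens", 4),
    ],
)


def lineage(conn, name):
    """Return the chain of taxon names from `name` up to the root."""
    rows = conn.execute(
        """
        WITH RECURSIVE ancestors(id, name, parent_id, depth) AS (
            -- Start from the requested taxon.
            SELECT id, name, parent_id, 0 FROM taxon WHERE name = ?
            UNION ALL
            -- Repeatedly step from each row to its parent.
            SELECT t.id, t.name, t.parent_id, a.depth + 1
            FROM taxon AS t JOIN ancestors AS a ON t.id = a.parent_id
        )
        SELECT name FROM ancestors ORDER BY depth
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


print(lineage(conn, "Homo sapiens"))
# → ['Homo sapiens', 'Homo', 'Hominidae', 'Primates', 'Mammalia']
```

Even in this small case the tree structure is implicit in the data rather than expressed by the data model, which is the kind of mismatch the paragraph above describes.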
Object-oriented databases have some advantages over relational databases since the natural foci of
study are in fact biological objects. Yet Jagadish and Olken note that object-oriented databases have also
had limited success in providing efficient or extensible declarative query languages as required for
specialized biological applications.
Because commercial database technology is of limited help, research and development of database
technology that serves biological needs will be necessary. Jagadish and Olken provide a view of
requirements that will necessitate further advances in data management technology, requirements that include
TABLE 3.1 Continued

Category                       Databases and URLs
Pharmacogenomics,              PharmGKB (Pharmacogenetics Knowledge Base): http://pharmgkb.org
pharmacogenetics, single       SNP Consortium: http://snp.cshl.org
nucleotide polymorphism        dbSNP (Single Nucleotide Polymorphism Database): http://www.ncbi.nlm.nih.gov/SNP/
(SNP), genotyping              LocusLink: http://www.ncbi.nlm.nih.gov/LocusLink
                               ALFRED (Allele Frequency Database): http://alfred.med.yale.edu/alfred/index.asp
                               CEPH Genotype Database: http://www.cephb.fr/cephdb/

Tissues, organs, and           Visible Human Project Database: http://www.nlm.nih.gov/research/visible/visible-human.html
organisms                      BRAID (Brain Image Database): http://braid.rad.jhu.edu/interface.html
                               NeuroDB (Neuroscience Federated Database): http://www.npaci.edu/DICE/Neuro/
                               The Whole Brain Atlas: http://www.med.harvard.edu/AANLIB/home.html

Literature reference           PubMed MEDLINE: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi
                               USPTO (U.S. Patent and Trademark Office): http://www.uspto.gov/