Catalyzing Inquiry at the Interface of Computing and Biology


calls for the use of object-oriented concepts to develop data definitions, encapsulating the internal
details of the data associated with the heterogeneity of the underlying data sources.^8 A change in the
representation or definition of the data then has minimal impact on the applications that access those
data.
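The encapsulation idea can be illustrated with a minimal sketch. Everything named here (SequenceSource, FlatFileSource, RelationalSource, gc_content, and the two storage layouts) is hypothetical and chosen only for illustration; it is not taken from Cattell or from any particular federation system. The point is that the application function depends only on the shared data definition, so a change in how one source represents its records stays inside that source's wrapper.

```python
from abc import ABC, abstractmethod


class SequenceSource(ABC):
    """Hypothetical uniform data definition that applications code against."""

    @abstractmethod
    def get_sequence(self, gene_id: str) -> str:
        """Return the nucleotide sequence for gene_id as an uppercase string."""


class FlatFileSource(SequenceSource):
    """Wraps a source that stores records as 'id<TAB>sequence' text lines."""

    def __init__(self, lines):
        # Internal representation: a dict built from tab-separated lines.
        self._records = dict(line.split("\t") for line in lines)

    def get_sequence(self, gene_id):
        return self._records[gene_id].upper()


class RelationalSource(SequenceSource):
    """Wraps a source that stores records as (id, organism, sequence) rows."""

    def __init__(self, rows):
        # Internal representation: a list of tuples, as a query might return.
        self._rows = rows

    def get_sequence(self, gene_id):
        for row_id, _organism, seq in self._rows:
            if row_id == gene_id:
                return seq.upper()
        raise KeyError(gene_id)


def gc_content(source: SequenceSource, gene_id: str) -> float:
    """Application code: uses only the interface, never the storage format."""
    seq = source.get_sequence(gene_id)
    return (seq.count("G") + seq.count("C")) / len(seq)
```

Under these assumptions, gc_content gives the same answer whether it is handed a FlatFileSource or a RelationalSource holding the same sequence, and neither wrapper's internal layout is visible to the caller.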
An example of a data federation environment is BioMOBY, which is based on two ideas.^9 The first
is the notion that databases provide bioinformatics services that can be defined by their inputs and
outputs. (For example, BLAST is a service provided by GenBank that can be defined by its input—that
is, an uncharacterized sequence—and by its output, namely, described gene sequences deposited in
GenBank.) The second idea is that all database services would be linked to a central registry (MOBY
Central) of services that users (or their applications) would query. From MOBY Central, a user could
move from one input-output service to the next, picking up information along the way: for example,
from a database that, given a sequence (the input), postulates the identity of a gene (the output), to a
database that, given a gene (the input), finds the same gene in multiple organisms (the output), and
so on. The BioMOBY system is limited in its ability to discriminate among database services based
on the descriptions of their inputs and outputs, and MOBY Central must be up and running 24 hours
a day.^10
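The two ideas above (services described by their inputs and outputs, and a central registry that users query to move from one service to the next) can be sketched as a toy program. This is an illustrative sketch only, not the BioMOBY or MOBY Central interface: the Service and Registry classes, the type labels such as "sequence" and "gene", and the planning method are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Service:
    """A service described only by a name and its input and output types."""
    name: str
    input_type: str
    output_type: str
    run: Callable[[str], str]


class Registry:
    """Toy central registry (hypothetical; not the MOBY Central API)."""

    def __init__(self):
        self._services = []

    def register(self, service):
        self._services.append(service)

    def find(self, input_type):
        """Return every registered service that accepts input_type."""
        return [s for s in self._services if s.input_type == input_type]

    def plan(self, start_type, goal_type):
        """Breadth-first search for a shortest chain of services that
        turns start_type into goal_type; returns None if no chain exists."""
        queue = deque([(start_type, [])])
        seen = {start_type}
        while queue:
            current, chain = queue.popleft()
            if current == goal_type:
                return chain
            for s in self.find(current):
                if s.output_type not in seen:
                    seen.add(s.output_type)
                    queue.append((s.output_type, chain + [s]))
        return None


def run_chain(chain, value):
    """Feed the output of each service into the next one."""
    for service in chain:
        value = service.run(value)
    return value
```

A caller might register a hypothetical "identify_gene" service (sequence in, gene out) and a "find_orthologs" service (gene in, ortholog list out), ask the registry to plan a path from "sequence" to "ortholog_list", and run the resulting chain. The sketch also makes the limitations noted above concrete: two services with the same input and output labels are indistinguishable to find, and nothing works if the registry object is unavailable.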


Box 4.2 Continued

The development of similar atlases to evaluate patients with well-defined disease states makes it possible
to compare the normal brain with brains of patients having cerebral pathological conditions, thereby potentially
leading to enhanced clinical trials, automated diagnoses, and other clinical applications. Such examples
have already emerged in patients with multiple sclerosis and epilepsy. An example in Alzheimer’s disease
relates to a current hotly contested research question. Individuals with Alzheimer’s disease have a greater
likelihood of having the genotype ApoE 4 (as opposed to ApoE 2 or 3). Having this genotype, however, is
neither sufficient nor required for the development of Alzheimer’s disease. Individuals with Alzheimer’s disease
also have small hippocampi, presumably because of atrophy of this structure as the disease progresses.
The question of interest is whether individuals with the high-risk genotype (ApoE 4) have small hippocampi to
begin with. This would be a very difficult hypothesis to test without the dataset described above. With the
ICBM database, it is possible to study individuals from, for example, ages 20 to 40 and identify those with the
smallest (lowest 5 percent) and largest (highest 5 percent) hippocampal volumes. This relatively small number
of subjects could then be genotyped for ApoE alleles. If individuals with small hippocampi all had the genotype
ApoE 4 and those with large hippocampi all had the genotype ApoE 2 or 3, this would be strong support
for the hypothesis that individuals with the high-risk genotype for the development of Alzheimer’s disease
have small hippocampi on a genetic basis, preceding the development of Alzheimer’s disease.
Similar genotype-imaging phenotype evaluations could be undertaken across a wide range of human conditions,
genotypes, and brain structures.

SOURCE: Modified from John C. Mazziotta and Arthur W. Toga, Department of Neurology, David Geffen School of Medicine, University
of California, Los Angeles, personal communication to John Wooley, February 22, 2004.

(^8) R.G.G. Cattell, Object Data Management: Object-Oriented and Extended Relational Database Systems, revised edition,
Addison-Wesley, Reading, MA, 1994. (Cited in Chung and Wooley, 2003.)
(^9) M.D. Wilkinson and M. Links, “BioMOBY: An Open-Source Biological Web Services Proposal,” Briefings in Bioinformatics
3(4):331-341, 2002.
(^10) L.D. Stein, “Integrating Biological Databases,” Nature Reviews Genetics 4(5):337-345, 2003.
