72 CATALYZING INQUIRY
When working hypotheses, derived results, and the evidence that supports or refutes them are captured
in machine-readable form, researchers can uncover correlations in, and make inferences about,
independently conducted investigations of complex biological systems that would otherwise remain
undiscovered if they relied only on serendipity or on their own reasoning and memory capacities.^32
In principle, software can read and operate on these representations, determining properties in a
way similar to human reasoning but considering hundreds or thousands of elements simultaneously.
Although automated reasoning can potentially predict the response of a biological system to a
particular stimulus, it is especially useful for discovering inconsistencies or missing relations in the
data, establishing global properties of networks, discovering predictive relationships between elements,
and inferring or calculating the consequences of given causal relationships.^33 As the number of
discovered pathways and molecular networks increases and researchers' questions turn increasingly to
the global properties of organisms, automated reasoning will become more useful.
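As a rough illustration of two of the tasks named above (discovering inconsistencies and inferring the consequences of given causal relationships), the following minimal Python sketch operates on a handful of declarative facts. Everything in it is an assumption made for illustration: the entity names (stimulus_S, kinase_A, factor_B, gene_C), the two relation types, and the function names are hypothetical and do not come from any real pathway database or reasoning system.

```python
from collections import deque

# Hypothetical declarative facts: (source, relation, target).
# The entity names are illustrative placeholders, not real pathway data.
FACTS = [
    ("stimulus_S", "activates", "kinase_A"),
    ("kinase_A", "activates", "factor_B"),
    ("factor_B", "inhibits", "gene_C"),
    ("kinase_A", "inhibits", "factor_B"),  # deliberately conflicts with the second fact
]

SIGN = {"activates": +1, "inhibits": -1}


def find_conflicts(facts):
    """Return sorted (source, target) pairs asserted to both activate and inhibit."""
    seen = {}
    for source, relation, target in facts:
        seen.setdefault((source, target), set()).add(relation)
    return sorted(pair for pair, relations in seen.items() if len(relations) > 1)


def infer_effects(facts, start):
    """Propagate signed effects from `start` along chains of causal facts.

    Returns {element: set of net signs}. A set holding both +1 and -1 means
    the stored facts imply competing effects on that element.
    """
    edges = {}
    for source, relation, target in facts:
        edges.setdefault(source, []).append((target, SIGN[relation]))

    effects = {}
    queue = deque([(start, +1)])
    while queue:
        node, sign = queue.popleft()
        for target, edge_sign in edges.get(node, []):
            net = sign * edge_sign
            # Each (element, sign) pair is visited once, so cycles terminate.
            if net not in effects.setdefault(target, set()):
                effects[target].add(net)
                queue.append((target, net))
    return effects


if __name__ == "__main__":
    print("Conflicting assertions:", find_conflicts(FACTS))
    for element, signs in infer_effects(FACTS, "stimulus_S").items():
        print(element, sorted(signs))
```

With the conflicting fact removed, the same propagation yields a single net sign for each downstream element. This toy handles only a few facts, but the same declarative style is what lets software scan far more relations than a person could hold in mind at once.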
Symbolic representations of biological knowledge—ontologies—are a foundation for such efforts.
Ontologies contain names and relationships of the many objects considered by a theory, such as genes,
enzymes, proteins, transcription, and so forth. By storing such an ontology in a symbolic machine-
TABLE 4.1 Continued

Organization / Description

mmCIF (Macromolecular Crystallographic Information File):
    http://pdb.rutgers.edu/mmcif/
    http://www.iucr.ac.uk/iucr-top/cif/index.html
    Description: The information file mmCIF is sponsored by the IUCr (International Union of
    Crystallography) to provide a dictionary for data items relevant to macromolecular
    crystallographic experiments.

LocusLink:
    http://www.ncbi.nlm.nih.gov/LocusLink/
    Description: LocusLink contains gene-centered resources, including nomenclature and aliases
    for genes.

Protégé-2000:
    http://protege.stanford.edu
    Description: Protégé-2000 is a tool that allows the user to construct a domain ontology that
    can be extended to access embedded applications in other knowledge-based systems. A number of
    biomedical ontologies have been constructed with this system, but it can be applied to other
    domains as well.

TAMBIS:
    http://imgproj.cs.man.ac.uk/tambis/
    Description: TAMBIS aims to aid researchers in the biological sciences by providing a single
    access point for biological information sources around the world. The access point will be a
    single Web-based interface that acts as a single information source. It will find appropriate
    sources of information for user queries and phrase the user questions for each source,
    returning the results in a consistent manner that includes details of the information source.
(^32) L. Hunter, “Ontologies for Programs, Not People,” Genome Biology 3(6):1002.1-1002.2, 2002.
(^33) As shown in Chapter 5, simulations are also useful for predicting the response of a biological system to various stimuli. But
simulations instantiate procedural knowledge (i.e., how to do something), whereas the automated reasoning systems discussed
here operate on declarative knowledge (i.e., knowledge about something). Simulations are optimized to answer a set of questions
that is narrower than those that can be answered by automated reasoning systems—namely, predictions about the subsequent
response of a system to a given stimulus. Automated reasoning systems can also answer such questions (though more slowly),
but in addition they can answer questions such as, What part of a network is responsible for this particular response?, presuming
that such (declarative) knowledge is available in the database on which the systems operate.