schemes).^56 Finally, because individual experiments can study only a few aspects of a brain region at
one time, a standard coordinate system allows the same brain region to be sampled repeatedly to allow
data to be accumulated over time.
4.2.11 A Case Study: Ecological and Evolutionary Databases
Although genomic databases such as GenBank receive the majority of attention, databases and
algorithms that operate on databases are key tools in research into ecology and biodiversity as well.
These tools can provide researchers with access to information regarding all identified species of a given
type, such as AlgaeBase^57 or FishBase;^58 they also serve as repositories for the submission of new
information and research. Other databases go beyond species listings to record individuals: for example,
the ORNIS database of birds seeks to provide access to nearly 5 million individual specimens held in
natural history collections, including data such as recordings of vocalizations and egg and nest
holdings.^59
The data associated with ecological research are gathered from a wide variety of sources: physical
observations in the wild by both amateurs and professionals; fossils; natural history collections; zoos,
botanical gardens, and other living collections; laboratories; and so forth. In addition, these data must
be placed into contexts of time, geographic location, environment, current and historical weather and
climate, and local, regional, and global human activity. These data sources are scattered
across many hundreds or thousands of different locations and formats, even when they are
digitally accessible. However, the need for integrated ecological databases is great: only by being
able to integrate the totality of observations of population and environment can certain key questions be
answered. Such a facility is central to endangered species preservation, invasive species monitoring,
wildlife disease monitoring and intervention, agricultural planning, and fisheries management, in addi-
tion to fundamental questions of ecological science.
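The integration problem described above can be made concrete with a small sketch. The Python fragment below assumes two hypothetical source formats (a museum export and a field report); every field name, species label, and coordinate in it is invented for illustration and does not correspond to any real database. It maps both formats onto one shared record type that carries time and geographic location, and then answers a single question across the combined sources.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Observation:
    """One sighting or specimen, reduced to a shared set of fields."""
    species: str
    observed_on: date
    latitude: float
    longitude: float
    source: str


def from_museum_row(row: dict) -> Observation:
    # Hypothetical museum export: day/month/year dates, separate coordinate columns.
    return Observation(
        species=row["taxon"].strip(),
        observed_on=datetime.strptime(row["collected"], "%d/%m/%Y").date(),
        latitude=float(row["lat"]),
        longitude=float(row["lon"]),
        source="museum",
    )


def from_field_report(report: dict) -> Observation:
    # Hypothetical field report: ISO dates and a single "lat,lon" string.
    lat, lon = (float(part) for part in report["position"].split(","))
    return Observation(
        species=report["species"].strip(),
        observed_on=date.fromisoformat(report["date"]),
        latitude=lat,
        longitude=lon,
        source="field",
    )


def in_region(obs, species, south, north, west, east):
    """Return observations of one species inside a latitude/longitude box."""
    return [
        o for o in obs
        if o.species == species
        and south <= o.latitude <= north
        and west <= o.longitude <= east
    ]


# Illustrative placeholder data only.
museum_rows = [
    {"taxon": "Species A", "collected": "03/06/1998", "lat": "10.5", "lon": "20.25"},
    {"taxon": "Species B", "collected": "17/09/2001", "lat": "11.0", "lon": "21.0"},
]
field_reports = [
    {"species": "Species A", "date": "2003-04-12", "position": "10.75,20.5"},
    {"species": "Species A", "date": "2003-05-01", "position": "45.0,60.0"},
]

observations = (
    [from_museum_row(r) for r in museum_rows]
    + [from_field_report(r) for r in field_reports]
)
matches = in_region(observations, "Species A", south=10, north=11, west=20, east=21)
```

In this sketch the per-source converters carry all of the format-specific knowledge, so the query itself never needs to know where a record came from. A real integrated facility would also have to reconcile taxonomic names, coordinate systems, and the environmental context mentioned above, none of which is attempted here.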
The first challenge in building such a facility is to make the individual datasets accessible by
networked query. Over the years, hundreds of millions of specimens have been recorded in museum
records. In many cases, however, the data are not even entered into a computer; they may be stored as
a set of index cards dating from the 1800s. Natural history collections, such as a museum’s collection of
fossils, may not even be indexed, so researchers can consult them only by physically inspecting the
drawers. Very few specimens have been geocoded.
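As a rough illustration of what geocoding transcribed records might involve, the Python sketch below matches free-text locality descriptions against a small lookup table and sets aside the records it cannot resolve. The gazetteer, catalog numbers, locality names, and coordinates are all hypothetical placeholders; actual collections would need far larger gazetteers and human review of ambiguous localities.

```python
import csv
import io

# Hypothetical gazetteer: normalized locality text -> (latitude, longitude).
GAZETTEER = {
    "example creek": (10.5, 20.25),
    "sample ridge": (11.0, 21.0),
}


def normalize(locality: str) -> str:
    """Lowercase and collapse whitespace so small transcription variants match."""
    return " ".join(locality.lower().split())


def geocode(records, gazetteer):
    """Split records into geocoded rows and rows needing manual review."""
    geocoded, unresolved = [], []
    for record in records:
        coords = gazetteer.get(normalize(record["locality"]))
        if coords is None:
            unresolved.append(record)
        else:
            lat, lon = coords
            geocoded.append({**record, "latitude": lat, "longitude": lon})
    return geocoded, unresolved


# Illustrative stand-in for index cards that have been typed into a CSV file.
CARD_CATALOG = """catalog_no,taxon,locality
1001,Species A,Example  Creek
1002,Species B,SAMPLE RIDGE
1003,Species C,near the old mill
"""

records = list(csv.DictReader(io.StringIO(CARD_CATALOG)))
geocoded, unresolved = geocode(records, GAZETTEER)
```

Keeping the unresolved records in a separate list, rather than dropping them, reflects the situation described above: most specimens lack coordinates, so a digitization effort needs a clear queue of records that still require attention.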
Museum records carry a wealth of image and text data, and digitizing these records in a meaningful
and useful way remains a serious challenge. For this reason, funding agencies such as the
National Science Foundation (NSF) are emphasizing the integration of database creation, curation, and
sharing into the process of ecological science. Examples include the NSF Biological Databases and
Informatics program^60 (which includes research into database algorithms and structures, as well as
developing particular databases) and the Biological Research Collections program, which provides
around $6 million per year for computerizing existing biological data. Similarly, the NSF Partner-
ships for Enhancing Expertise in Taxonomy (PEET) program,^61 which emphasizes training in tax-
onomy, requires that recipients of funding incorporate collected data into databases or other shared
electronic formats.
(^56) D.C. Van Essen, “Windows on the Brain: The Emerging Role of Atlases and Databases in Neuroscience,” Current Opinion in
Neurobiology 12:574-579, 2002.
(^57) See http://www.algaebase.org.
(^58) See http://www.fishbase.org.
(^59) See http://www.ornisnet.org.
(^60) NSF Program Announcement NSF 02-058; see http://www.nsf.gov/pubsys/ods/getpub.cfm?nsf02058.
(^61) See http://web.nhm.ku.edu/peet/.