90 5 Survey of Ontologies in Bioinformatics


ogy. The first one was originally focused on medical terminology but now
also includes many other biomedical vocabularies. It has grown impressively
large but is sometimes incoherent as a result. The second ontology
focuses exclusively on terminology for genomics. As a result of its narrow
focus, it is very coherent, and a wide variety of tools have been developed
that make use of it. Finally, we consider ontologies that organize other on-
tologies. The number of biomedical ontologies and databases has grown so
large that it is necessary to have a framework for organizing them.

5.1.1 Unified Medical Language System


Terminology is the most common denominator of all biomedical literature
resources, including the names of organisms, tissues, cell types, genes,
proteins, and diseases. Various controlled vocabularies, such as the Medical
Subject Headings (MeSH), are associated with these resources. MeSH was
developed by the U.S. National Library of Medicine (NLM). However,
identifying terminology as a key integrating factor does not mean that
biomedical resources actually use the standard vocabularies that would make
them interoperable. In 1986, NLM began a long-term research and
development project to build the Unified Medical Language System (UMLS),
located at www.nlm.nih.gov/research/umls. The UMLS is a repository
of biomedical vocabularies and is the NLM’s biological ontology (Lindberg
et al. 1993; Baclawski et al. 2000; Yandell and Majoros 2002).
The purpose of the UMLS is to improve the ability of computer programs
to “understand” the biomedical meaning in user inquiries and to use this un-
derstanding to retrieve and integrate relevant machine-readable information
for users (Lindberg et al. 1993). The UMLS integrates over 4.5 million names
for over 1 million concepts from more than 100 biomedical vocabularies, as
well as more than 12 million relations among these concepts. Vocabularies
integrated in the UMLS include the taxonomy of the National Center for
Biotechnology Information (NCBI), the Gene Ontology (GO), MeSH, and the
Digital Anatomist Symbolic Knowledge Base. UMLS concepts are not only
interrelated, but may also be linked to external resources such as GenBank
(Bodenreider 2004).
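The organization described above (many names from many source vocabularies attached to one concept, relations between concepts, and links to external resources) can be illustrated with a small sketch. This is only a toy model written for illustration and is not the UMLS data format or any NLM interface: the class names, the identifiers such as TOY:0001, the vocabulary labels VOCAB_A and VOCAB_B, the relation label, and the external link value are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Concept:
    """One concept that may carry several names from several vocabularies."""
    concept_id: str  # hypothetical identifier, not a real UMLS identifier
    names_by_vocabulary: Dict[str, Set[str]] = field(default_factory=dict)
    relations: List[Tuple[str, str]] = field(default_factory=list)  # (label, target id)
    external_links: Dict[str, str] = field(default_factory=dict)  # resource -> reference


class ConceptRepository:
    """Toy repository that groups synonymous names under shared concepts."""

    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}
        self._name_index: Dict[str, Set[str]] = {}  # case-folded name -> concept ids

    def add_name(self, concept_id: str, vocabulary: str, name: str) -> None:
        # Create the concept on first use, then record the name under its vocabulary.
        concept = self.concepts.setdefault(concept_id, Concept(concept_id))
        concept.names_by_vocabulary.setdefault(vocabulary, set()).add(name)
        self._name_index.setdefault(name.casefold(), set()).add(concept_id)

    def relate(self, source_id: str, label: str, target_id: str) -> None:
        # Store a labeled, directed relation between two existing concepts.
        self.concepts[source_id].relations.append((label, target_id))

    def lookup(self, name: str) -> Set[str]:
        # Case-insensitive lookup; returns a copy so callers cannot alter the index.
        return set(self._name_index.get(name.casefold(), set()))


# Illustrative data only: two vocabularies use different names for one concept.
repo = ConceptRepository()
repo.add_name("TOY:0001", "VOCAB_A", "Heart attack")
repo.add_name("TOY:0001", "VOCAB_B", "Myocardial infarction")
repo.add_name("TOY:0002", "VOCAB_A", "Heart")
repo.relate("TOY:0001", "has_location", "TOY:0002")
repo.concepts["TOY:0001"].external_links["EXTERNAL_DB"] = "placeholder-accession"
```

In this sketch, looking up either "heart attack" or "Myocardial infarction" reaches the same concept. This is the sense in which a repository of vocabularies can make resources that use different terminologies interoperable.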
The UMLS is composed of three main components: the Metathesaurus
(META), the SPECIALIST lexicon and associated lexical programs, and the
Semantic Network (SN) (Denny et al. 2003). We now discuss each of these
components in more detail.
META is the main component of the UMLS. This component is a repository