12 Building Bioinformatics Ontologies
Unstructured data, such as natural language text, and semistructured data,
such as tables and graphs, are adequate mechanisms for individuals to
communicate with one another when they use traditional print media and the
amount of published material is relatively small. However, the amount of
biomedical knowledge is becoming much too large for traditional approaches.
While printed research publications are still very important, other forms of
biomedical information are now being published electronically. Formal
ontologies can increase the likelihood that such published information will be
found and used, by making the data easier to query and transform. Given
this situation, it is not surprising to learn that ontologies for biology and
medicine are proliferating. Unfortunately, as we have seen in chapters 2 and
4, there are many web-based ontology languages. Furthermore, even if one
has selected an ontology language, there are many ways to build an ontology.
This chapter discusses how to deal with the diversity of ontology languages
and how to build high-quality ontologies.
However, before beginning to develop an ontology, one should examine
the purpose and motivation for embarking on this activity. The first section
is concerned with the questions that should be answered in this regard. Once
one has a clear understanding of the purpose of the ontology, there are four
major activities that must be undertaken: choosing an ontology language,
obtaining a development tool, acquiring domain knowledge, and reusing
existing ontologies. These activities are explained in a series of sections,
each devoted to one of the topics. Although the topics are presented in a particular
order, they do not have to be undertaken in that order, and may even be
performed in parallel.
Having explained the major activities required for ontology development,
the chapter turns to the issue of how to ensure that the ontology being de-