6.6 Retrieval of Knowledge Representations 149


times the retrieval will be useless. These engines use a variety of mechanisms
for overcoming this limitation, but they can never completely eliminate it.
This is in striking contrast with relational database queries, which always
return all of the items specified and no others. Using the terminology of
information retrieval, relational queries always have 100% coverage and 100%
selectivity. It is natural to imagine that one could try to achieve the same
coverage and selectivity with information retrieval. To do so, one must
overcome several difficult problems:



  1. The meaning of natural language text is complex and difficult to represent
    in the form needed for retrieval using database query languages.

  2. Even if one could develop a representation language for natural language,
    it is very difficult to translate from natural language into the
    representation language.
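To make the coverage and selectivity terminology concrete, here is a minimal Python sketch. It assumes that coverage means the fraction of relevant items that were retrieved and that selectivity means the fraction of retrieved items that are relevant (what are more commonly called recall and precision). The function name and the document identifiers are hypothetical and serve only as illustration.

```python
def coverage_and_selectivity(retrieved, relevant):
    """Return (coverage, selectivity) for one query.

    Assumed definitions (not taken from the text above):
      coverage    = relevant items retrieved / all relevant items
      selectivity = relevant items retrieved / all retrieved items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # An empty denominator is treated as a perfect score by convention.
    coverage = len(hits) / len(relevant) if relevant else 1.0
    selectivity = len(hits) / len(retrieved) if retrieved else 1.0
    return coverage, selectivity


# Hypothetical set of documents that actually answer the query.
relevant = {"doc1", "doc2", "doc3", "doc4"}

# A relational query returns exactly the items specified and no others.
exact = coverage_and_selectivity(relevant, relevant)

# A keyword search misses two relevant documents and returns one irrelevant one.
keyword = coverage_and_selectivity({"doc1", "doc2", "doc7"}, relevant)

print(exact)
print(keyword)
```

Under these assumptions the exact query scores 1.0 on both measures, while the keyword search falls short on both.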


The first problem above is addressed by ontologies. While we do not yet
have ontologies that are sufficiently deep and have enough coverage for the
biomedical domain, the ontologies are improving steadily. Even the
incomplete and relatively shallow ontologies that are available today (such as
the UMLS) are recognized as important and useful resources for biomedical
research.

The second problem is in many ways the more difficult one. Natural
language text is not easily understood by computers. The process whereby
natural language is translated from text to the representation language is
called natural language processing (NLP). The result of applying NLP to a
document is called the knowledge representation of the document. In a
knowledge representation, all terms are expressed unambiguously as instances of
classes in the ontology; that is, each term refers to the corresponding concept
in the ontology. Relationships between terms are also expressed unambiguously
using the relationships in the ontology. To see what a knowledge
representation looks like, see any of the XML documents shown in chapter 1,
especially the ones in section 1.6.
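The idea that every term is an instance of an ontology class, and every relationship is drawn from the ontology, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy ontology, the class names, the `hasDisease` relationship, and the identifiers are all hypothetical, and real knowledge representations would normally be expressed in XML as in chapter 1.

```python
from dataclasses import dataclass

# Hypothetical toy ontology: the known classes, and each relationship
# with the classes allowed as its subject (domain) and object (range).
CLASSES = {"Patient", "Leukemia"}
RELATIONSHIPS = {"hasDisease": ("Patient", "Leukemia")}


@dataclass(frozen=True)
class Term:
    identifier: str       # unambiguous name for this individual
    ontology_class: str   # the ontology class it is an instance of


@dataclass(frozen=True)
class Relation:
    subject: Term
    name: str
    obj: Term


def is_well_formed(relation):
    """Check that a statement uses only classes and relationships
    defined in the (toy) ontology, with matching domain and range."""
    if relation.name not in RELATIONSHIPS:
        return False
    domain, range_ = RELATIONSHIPS[relation.name]
    return (
        relation.subject.ontology_class in CLASSES
        and relation.obj.ontology_class in CLASSES
        and relation.subject.ontology_class == domain
        and relation.obj.ontology_class == range_
    )


patient = Term("patient-17", "Patient")
disease = Term("diagnosis-3", "Leukemia")
statement = Relation(patient, "hasDisease", disease)

print(is_well_formed(statement))
```

A statement that reverses subject and object, or that uses a relationship the ontology does not define, is rejected by `is_well_formed`, which is the sense in which the representation is unambiguous.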

Once natural language text has been converted to a knowledge representation,
one can infer additional facts that were not explicitly stated. These
inferences take advantage of the ontology, which can have many rules that
allow such inference. For example, if one speaks of a patient who has
leukemia, one can infer that this individual is a human who has cancer.
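The leukemia example can be sketched as a very small inference procedure in Python. It assumes the ontology contains two subclass rules (every patient is a human, and leukemia is a kind of cancer) and applies them repeatedly until no new facts appear. The rule table, fact format, and identifiers are hypothetical, and a real ontology reasoner supports far richer rules than this.

```python
# Hypothetical subclass rules from a toy ontology: child class -> parent class.
SUBCLASS_OF = {"Patient": "Human", "Leukemia": "Cancer"}


def infer_types(facts):
    """Return the facts plus every type fact implied by SUBCLASS_OF.

    Facts are (subject, predicate, object) triples. The loop repeats
    until a full pass adds nothing new (a simple fixed-point iteration).
    """
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in list(inferred):
            if predicate == "type" and obj in SUBCLASS_OF:
                new_fact = (subject, "type", SUBCLASS_OF[obj])
                if new_fact not in inferred:
                    inferred.add(new_fact)
                    changed = True
    return inferred


# Explicitly stated: a patient who has leukemia.
facts = {
    ("patient-17", "type", "Patient"),
    ("diagnosis-3", "type", "Leukemia"),
    ("patient-17", "has", "diagnosis-3"),
}

closure = infer_types(facts)

# Not stated, but inferred: a human who has cancer.
print(("patient-17", "type", "Human") in closure)
print(("diagnosis-3", "type", "Cancer") in closure)
```

The original facts are kept unchanged, so a query over `closure` can match either the stated classes or the inferred ones.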

NLP techniques are not yet sufficiently well developed to be able to produce
knowledge representations that always exactly represent the meaning
