
Acronym Resolution


A related form of resolution is acronym resolution. Acronyms appear throughout raw text
and are a standard part of communication. Furthermore, acronyms tend to cluster around a
subject area: there are IBM acronyms, military acronyms, IMS acronyms, chemical acronyms,
Microsoft acronyms, and so forth.


To understand a communication clearly, it is advisable to resolve its acronyms.


Textual ETL is equipped to resolve acronyms. When textual ETL reads raw text and
spots an acronym, it replaces the acronym with its literal value, that is, the words
the acronym stands for.
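
The replacement step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any textual ETL product is implemented: the glossary contents, the names GLOSSARIES and resolve_acronyms, and the sample sentence are all assumptions made for the example. The sketch keeps one glossary per subject area, since acronyms tend to cluster that way, and matches whole words only.

```python
import re

# Hypothetical glossaries, one per subject area; the entries are illustrative only.
GLOSSARIES = {
    "military": {
        "AWOL": "absent without official leave",
        "MIA": "missing in action",
    },
    "computing": {
        "ETL": "extract, transform, load",
    },
}


def resolve_acronyms(text, glossary):
    """Replace each whole-word acronym listed in glossary with its literal value."""
    if not glossary:
        return text
    # Try longer acronyms first so a short acronym does not match
    # the beginning of a longer one.
    alternatives = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(a) for a in alternatives) + r")\b"
    )
    # Matching is case-sensitive: "AWOL" is resolved, "awol" is left alone.
    return pattern.sub(lambda m: glossary[m.group(0)], text)


raw = "Two soldiers were reported AWOL."
print(resolve_acronyms(raw, GLOSSARIES["military"]))
# prints: Two soldiers were reported absent without official leave.
```

Choosing the glossary by subject area matters because the same letters can stand for different things in different fields; a real system would need some way to decide which glossary applies to a given document.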


Fig. 10.1.10 shows the dynamics of how textual ETL reads raw text and resolves an
acronym when it is found.


Fig. 10.1.10 Processing an acronym.

As an example of how acronym resolution works, suppose the raw text contains the following:


Sgt Mullaney was AWOL as of 10:30 p.m. on Dec 25...


Chapter 10.1: Nonrepetitive Data