“parsing” implies a straightforward process, and the logic that occurs here is anything but
straightforward. The remainder of this chapter discusses that logic.
After the nonrepetitive data have been “parsed,” the attributes, keys, and records of the
data are identified.
Once the keys, attributes, and records are identified, it is a straightforward process to turn
the data into a standard database record.
That, then, is what takes place in textual disambiguation.
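The last step described above, turning identified keys, attributes, and records into a standard database record, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names ParsedRecord, to_row, and load, the table name parsed, and the use of an in-memory SQLite database are all hypothetical and are not taken from any particular textual disambiguation product.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class ParsedRecord:
    """One record identified in nonrepetitive data (hypothetical shape)."""
    key: str                                        # identifies the record
    attributes: dict = field(default_factory=dict)  # attribute name -> value


def to_row(record, columns):
    """Flatten a record into a tuple: the key, then one value per column.

    Attributes missing from the record become None (NULL in the database).
    """
    return (record.key,) + tuple(record.attributes.get(c) for c in columns)


def load(records, columns):
    """Create an in-memory table and insert one row per parsed record."""
    conn = sqlite3.connect(":memory:")
    defs = ["record_key TEXT PRIMARY KEY"] + [f'"{c}" TEXT' for c in columns]
    conn.execute(f"CREATE TABLE parsed ({', '.join(defs)})")
    placeholders = ", ".join("?" for _ in defs)
    conn.executemany(
        f"INSERT INTO parsed VALUES ({placeholders})",
        [to_row(r, columns) for r in records],
    )
    return conn
```

Once the hard work of identifying keys and attributes is done, the load itself is mechanical, which is the point the text makes: the difficulty lies in the logic that precedes this step, not in the step itself.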
The heart of textual disambiguation is the logic of processing that occurs when
nonrepetitive data are analyzed and turned into keys, attributes, and records.
The logic that occurs here can be roughly classified into several categories.
Fig. 10.1.5 shows those categories.
Fig. 10.1.5 The different types of textual disambiguation.
The basic activities of logic applied by textual disambiguation include the following:
Contextualization, where the context of data is identified and captured
Standardization, where certain types of text are standardized
Basic editing, where basic editing of text occurs
There are other functions of textual disambiguation, but these three categories of activity
encompass most of the important processing that occurs.
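To make the three categories concrete, the following Python sketch shows one very small instance of each. Everything in it is an assumption made for illustration: the function names, the list of accepted date formats, and the tiny CONTEXT_TERMS lookup table are hypothetical, English month names are assumed, and real textual disambiguation involves far more logic than this.

```python
import re
from datetime import datetime

# Hypothetical lookup table: word -> the context it belongs to.
CONTEXT_TERMS = {"sedan": "car", "porsche": "car", "honda": "car"}

# Hypothetical set of date layouts that standardization will accept.
DATE_FORMATS = ("%B %d, %Y", "%m/%d/%Y", "%Y-%m-%d")


def basic_edit(text):
    """Basic editing: collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def standardize_date(text):
    """Standardization: rewrite a recognized date as YYYY-MM-DD.

    Returns None when the text matches none of the accepted formats.
    """
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def contextualize(text):
    """Contextualization: pair each recognized word with its context."""
    words = re.findall(r"[a-z]+", text.lower())
    return [(w, CONTEXT_TERMS[w]) for w in words if w in CONTEXT_TERMS]
```

For example, under these assumptions standardize_date would map both "July 20, 2014" and "07/20/2014" to the same value, and contextualize would tag "sedan" as belonging to the context "car". The word and its captured context are the raw material from which keys and attributes are later formed.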
Chapter 10.1: Nonrepetitive Data