Fig. 10.3.3 Before text can be processed it must be mapped.

Once the mapping is done, textual disambiguation is ready to process the transcriptions. The
inputs to textual disambiguation are the raw text, the mapping, and the taxonomies. The output
is an analytic database, in the form of any standard database used for analytic processing.
By the time the analyst works with the database, it looks just like any other database the
analyst has ever processed. The only difference is that the source of data for this database
is nonrepetitive text.
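
As a rough illustration of that input/output contract, the following Python sketch takes raw text, a mapping, and taxonomies, and emits rows ready to load into a database. It is a minimal sketch under stated assumptions, not the actual textual disambiguation process: the function name `disambiguate`, the `Row` layout, the shape of the `mapping` dictionary, and the sample taxonomies are all hypothetical and chosen only for this example.

```python
import re
from typing import NamedTuple


class Row(NamedTuple):
    """One output row: a taxonomy term found in a document (hypothetical layout)."""
    doc_id: str
    char_offset: int
    word: str
    category: str


def disambiguate(doc_id, raw_text, mapping, taxonomies):
    """Return one Row per taxonomy term found in raw_text.

    mapping["taxonomies"] names which taxonomies apply to this kind of
    document (an assumed, simplified form of a mapping).
    """
    # Merge the taxonomies selected by the mapping into one word -> category lookup.
    active = {}
    for name in mapping["taxonomies"]:
        active.update(taxonomies[name])

    rows = []
    for match in re.finditer(r"[A-Za-z']+", raw_text):
        word = match.group(0).lower()
        category = active.get(word)
        if category is not None:
            rows.append(Row(doc_id, match.start(), word, category))
    return rows


# Hypothetical sample inputs.
taxonomies = {
    "billing_terms": {"refund": "billing", "invoice": "billing"},
    "service_terms": {"outage": "service"},
}
mapping = {"taxonomies": ["billing_terms", "service_terms"]}

rows = disambiguate(
    "call-001",
    "Customer asked for a refund after the outage.",
    mapping,
    taxonomies,
)
for row in rows:
    print(row)
```

Each row carries the document identifier, the position of the word, the word itself, and the category the taxonomy assigns to it, which is the kind of flat, regular structure an ordinary analytic database can hold.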


Fig. 10.3.4 shows the processing that occurs inside textual disambiguation.


Fig. 10.3.4 Transforming text into a database.

The output of textual disambiguation is a standard database, often thought of as being in
the form of relational data. In many ways, the text in the resulting database has been
“normalized.” Business relationships are buried in the database. These relationships are a
result of the mapping and of the text that the mapping has interpreted.
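
To make the idea of relational output concrete, the sketch below loads a few rows into an in-memory SQLite table and queries them with ordinary SQL. The table name `term_hit`, its columns, and the sample rows are assumptions made for illustration; the text does not specify the actual schema.

```python
import sqlite3

# In-memory database standing in for the analytic database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE term_hit ("
    "doc_id TEXT, char_offset INTEGER, word TEXT, category TEXT)"
)

# Hypothetical rows of the kind textual disambiguation might produce.
conn.executemany(
    "INSERT INTO term_hit VALUES (?, ?, ?, ?)",
    [
        ("call-001", 21, "refund", "billing"),
        ("call-001", 38, "outage", "service"),
        ("call-002", 10, "invoice", "billing"),
    ],
)

# How many documents mention each category?
counts = conn.execute(
    "SELECT category, COUNT(DISTINCT doc_id) "
    "FROM term_hit GROUP BY category ORDER BY category"
).fetchall()

# A simple relationship surfaced by a self-join: documents that mention
# both a billing term and a service term.
both = conn.execute(
    "SELECT DISTINCT a.doc_id "
    "FROM term_hit a JOIN term_hit b ON a.doc_id = b.doc_id "
    "WHERE a.category = 'billing' AND b.category = 'service'"
).fetchall()

print(counts)
print(both)
```

Once the text is in this form, the analyst needs nothing text-specific: grouping, joining, and filtering work exactly as they would on any other relational table.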


Fig. 10.3.5 shows the database that has been produced.


Chapter 10.3: Analytics From Nonrepetitive Data