Fig. 4.6.1 Transformation of text into a standard database.

Once raw text is transformed, it arrives in the analytic database in a normalized form. The
analytic database looks like any other analytic database. Typically, the analytic data are
“normalized”: each record has a unique key with dependent elements of data. Because of this,
the analytic database can be joined with other analytic databases, so that structured data
and unstructured data can be analyzed in the same query.
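
As a minimal sketch of such a combined query, the Python example below uses the standard sqlite3 module with an in-memory database. The table names (customer, text_fact), their columns, and the sample rows are hypothetical and chosen only for illustration; they are not taken from the text. The customer table stands in for ordinary structured data, and text_fact stands in for the normalized output of textual disambiguation.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical structured table from an operational system.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        region      TEXT NOT NULL
    );
    -- Hypothetical output of textual disambiguation:
    -- a unique key (fact_id) with dependent elements of data.
    CREATE TABLE text_fact (
        fact_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        source_doc  TEXT NOT NULL,
        term        TEXT NOT NULL,
        context     TEXT NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Acme", "West"), (2, "Globex", "East")],
)
conn.executemany(
    "INSERT INTO text_fact VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "email_0017.txt", "late delivery", "complaint"),
        (2, 2, "email_0042.txt", "renewal", "contract"),
        (3, 1, "call_0003.txt", "refund", "complaint"),
    ],
)


def complaints_by_region(connection):
    """Count text-derived complaints per region from the structured table."""
    return connection.execute("""
        SELECT c.region, COUNT(*) AS complaints
        FROM customer AS c
        JOIN text_fact AS t ON t.customer_id = c.customer_id
        WHERE t.context = 'complaint'
        GROUP BY c.region
        ORDER BY c.region
    """).fetchall()


print(complaints_by_region(conn))
```

The join key here is an assumed customer identifier carried on each text-derived row. In practice, whatever key the disambiguation step attaches to the text would play this role.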


Each element in the analytic database can be tied back directly to the originating source
document. This feature is needed if there is ever any question as to the accuracy of the
processing that has occurred in textual disambiguation. In addition, if there is ever any
question as to the context of the data found in the analytic database, that context can be
verified easily and quickly against the source document.
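
One way such a link could be kept is sketched below, under the assumption that each element stores the name of its source document and the character offset at which the term was found. The TextFact class, the trace_back function, and the sample document are hypothetical; the text does not say how the link is actually implemented. The lookup only reads the source document and never modifies it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextFact:
    """Hypothetical analytic-database element with a pointer to its source."""
    term: str
    source_doc: str
    offset: int  # character offset of the term within the source document


def trace_back(fact, documents, window=20):
    """Return the text surrounding a fact, read from the unaltered source."""
    text = documents[fact.source_doc]
    end = fact.offset + len(fact.term)
    # Verify that the stored pointer really lands on the recorded term.
    if text[fact.offset:end] != fact.term:
        raise ValueError("stored offset does not match the source document")
    start = max(0, fact.offset - window)
    return text[start:end + window]


# Hypothetical source document, keyed by file name.
documents = {
    "email_0017.txt": "Order 881 arrived two weeks late; "
                      "the customer asked for a refund.",
}
original = documents["email_0017.txt"]
fact = TextFact("refund", "email_0017.txt", original.index("refund"))

print(trace_back(fact, documents))
```

Checking the stored offset against the document before returning the snippet is a design choice in this sketch: it surfaces a stale or wrong pointer as an error rather than returning unrelated context.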


Note that the originating source document is not touched or altered in any way.


Fig. 4.6.2 shows that each element of data in the analytic database can be tied back to its
originating source document.


Chapter 4.6: Textual Disambiguation