Fig. 10.1.7 Processing a taxonomy against raw text.

As a simple example of applying a taxonomy to raw text, consider the following.


Raw text: “...she drove her Honda into the garage....”

The simple taxonomy used looks like the following:


Car
    Porsche
    Honda
    Toyota
    Ford
    Kia
    Volkswagen

When the taxonomy is passed against the raw text, the results look like the following:


Document name, byte, context—car, value—Honda
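
The matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the processing engine the chapter describes: the document name `doc1.txt`, the `Entry` record, and the `apply_taxonomy` function are hypothetical, matching is a plain case-sensitive whole-word search, and "byte" is taken to mean the UTF-8 byte offset at which the matched word starts.

```python
import re
from typing import NamedTuple


class Entry(NamedTuple):
    """One row for the analytic database (hypothetical layout)."""
    document: str
    byte: int
    context: str
    value: str


# The taxonomy from the text: one context ("car") and its member terms.
TAXONOMY = {"car": ["Porsche", "Honda", "Toyota", "Ford", "Kia", "Volkswagen"]}


def apply_taxonomy(document, text, taxonomy):
    """Return one Entry for every taxonomy term found in the raw text."""
    entries = []
    for context, terms in taxonomy.items():
        # Whole-word, case-sensitive match on any term of this context.
        pattern = re.compile(r"\b(" + "|".join(map(re.escape, terms)) + r")\b")
        for match in pattern.finditer(text):
            # Assumption: "byte" is the UTF-8 offset where the word begins.
            byte = len(text[:match.start()].encode("utf-8"))
            entries.append(Entry(document, byte, context, match.group(1)))
    return entries


raw = "...she drove her Honda into the garage...."
entries = apply_taxonomy("doc1.txt", raw, TAXONOMY)
for entry in entries:
    print(entry)
```

A real system would also have to handle plurals, misspellings, and words that belong to more than one taxonomy; the sketch ignores all of these.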

To accommodate other processing, it is sometimes useful to create a
second entry:


Document name, byte, context—car, value—car

The second entry is useful when you want to process all the values and
want the context itself to be processed as a value. That is why the system
sometimes produces two entries in the analytic database.
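
A small Python sketch shows the effect of the second entry on a values-only scan. The row layout (document, byte, context, value), the document name `doc1.txt`, the byte offset 17, and the helper `with_context_entry` are assumptions made for illustration; the source does not specify how the analytic database stores its rows.

```python
from collections import Counter

# Hypothetical analytic-database row: (document, byte, context, value).
first = ("doc1.txt", 17, "car", "Honda")


def with_context_entry(entry):
    """Return the original entry plus a second one whose value is the context."""
    document, byte, context, _value = entry
    return [entry, (document, byte, context, context)]


rows = with_context_entry(first)

# A scan that looks only at the value column now also sees "car".
value_counts = Counter(value for _doc, _byte, _context, value in rows)
print(value_counts)
```

Without the second row, a query that reads only the value column would find "Honda" but never "car"; with it, the same query covers both.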


Chapter 10.1: Nonrepetitive Data