when the text is matched against a taxonomy.


Following taxonomic analysis came natural language processing (NLP). NLP built on
all the previous techniques to produce an effective way to examine and analyze
text.


The final phase of the evolution is textual ETL (also called textual disambiguation).
Textual ETL does everything that NLP does and adds further functionality. The
emphasis of textual ETL is on identifying the context of text, not on the text itself.
In addition, textual ETL specifically builds databases, and it performs in-line
contextualization.
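
To make the idea of identifying context concrete, here is a minimal sketch in Python. It is not how any textual ETL product is implemented. It only assumes that context can be approximated by looking words up in a small taxonomy. The `TAXONOMY` entries and the `contextualize` function are hypothetical names invented for this illustration.

```python
import re

# Hypothetical taxonomy (an assumption for illustration): each term maps to
# the broader category that supplies its context.
TAXONOMY = {
    "honda": "car",
    "porsche": "car",
    "oak": "tree",
    "elm": "tree",
}

def contextualize(text):
    """Return (character_offset, word, context) for each taxonomy term found."""
    records = []
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group().lower()
        context = TAXONOMY.get(word)
        if context is not None:
            # Keep the offset so the record can be traced back to the source text.
            records.append((match.start(), word, context))
    return records

records = contextualize("She parked the Honda under an old oak.")
for offset, word, context in records:
    print(offset, word, context)
```

Each record pairs a word with its context, which is the shape of data a database table can hold. A plain dictionary lookup cannot resolve words whose meaning depends on the surrounding text, so this covers only the simplest case.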


Today, with textual ETL, you can read text and turn it into useful databases. Once you
have constructed the databases, you can then use standard visualization tools to analyze
the data.
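
As a rough illustration of turning contextualized text into a database, the sketch below loads a few word and context pairs into an in-memory SQLite table and runs the kind of aggregate query a visualization tool might issue. The table name `text_context`, its columns, and the sample rows are all assumptions made for this example and do not come from any particular product.

```python
import sqlite3

def build_database(records):
    """Load (doc_id, word, context) rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE text_context (doc_id TEXT, word TEXT, context TEXT)"
    )
    conn.executemany("INSERT INTO text_context VALUES (?, ?, ?)", records)
    conn.commit()
    return conn

def count_by_context(conn):
    """Count how many words were found for each context."""
    cursor = conn.execute(
        "SELECT context, COUNT(*) FROM text_context "
        "GROUP BY context ORDER BY context"
    )
    return cursor.fetchall()

# Hypothetical sample rows, as if produced by an earlier contextualization step.
sample = [
    ("note-1", "honda", "car"),
    ("note-1", "oak", "tree"),
    ("note-2", "porsche", "car"),
]

conn = build_database(sample)
summary = count_by_context(conn)
print(summary)
```

Once the text sits in ordinary rows and columns like this, any tool that can run SQL can analyze it.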


The Challenge of Context


The first and biggest problem with trying to incorporate text into a database environment
is that text does not fit comfortably inside a database. But that is not the only problem.
The second major problem is that in order to deal with text, you have to deal with context
as well. Stated differently, dealing with text is one problem, and dealing with the context
of text is an entirely different problem. To put text meaningfully into an environment
where it can be analyzed, you must deal with both text and context.


Fig. 17.1.3 shows that text and context must be considered.


Chapter 17.1: Managing Text