Fig. 1.3.6 An infinite amount of data.

The emphasis, then, is on performing the normal tasks of data management in the Hadoop
environment, where each process must be able to handle very large amounts of data.


Nonrepetitive Unstructured Data


The emphasis in the nonrepetitive unstructured environment is quite different from the
emphasis on managing Hadoop big data technology. In the nonrepetitive unstructured
environment, the emphasis is on “textual disambiguation” (also called “textual ETL”).
This emphasis is shown in Fig. 1.3.7.


Chapter 1.3: The “Great Divide”