
created. And suppose you are looking for phone calls relating to terrorism. Out of the
millions and millions of phone calls made, only a handful will relate to terrorist
activity.


The same phenomenon holds for click stream data, analog data, metering data, and so
forth. There are, however, records that are not directly business-relevant but are
potentially business-relevant. These records are not immediately useful to the business
but could become useful under other circumstances.


Now, let's consider the business relevancy of nonrepetitive unstructured data.
Nonrepetitive unstructured data are made up of records such as e-mail, call center data,
conversations, and insurance claims. Fig. 1.4.8 depicts nonrepetitive unstructured data.


Fig. 1.4.8 Business relevancy.

In nonrepetitive unstructured data, there are data such as spam, blather, and stop words.
These types of data are not business-relevant. But much of the data found in the
nonrepetitive unstructured category are business-relevant (or are at least potentially
business-relevant).
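
As a small illustration of separating non-business-relevant content from the rest, the sketch below strips stop words from a single piece of nonrepetitive text. The stop-word list, the function name strip_stop_words, and the sample insurance-claim sentence are all hypothetical, chosen only for this example; a real list would be far longer, and spam and blather need other techniques entirely.

```python
# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"a", "an", "the", "and", "of", "to", "is", "in"}


def strip_stop_words(text: str) -> list[str]:
    """Return the lowercase words of `text` with stop words removed."""
    # Split on whitespace, trim surrounding punctuation, and lowercase each word.
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    # Keep only non-empty words that are not in the stop-word list.
    return [w for w in words if w and w not in STOP_WORDS]


# Hypothetical sentence from an insurance claim.
claim = "The driver of the car is claiming damage to the bumper."
print(strip_stop_words(claim))
```

The words that remain (driver, car, claiming, damage, bumper) are the ones likely to carry business meaning, which is the distinction the text draws between stop words and business-relevant data.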


Now, let's stop and take a look at the demographics of business relevancy as they relate
to unstructured data (big data). Fig. 1.4.9 shows where business relevancy lies.


Chapter 1.4: Demographics of Corporate Data