processing.

By looking at the histogram, management gets a very good idea of what subjects are on
the minds of its customers.


The dashboard tells management at a glance what it needs to know about what is going
on in the call center.


As impressive as the dashboard is, it would not be possible unless the data had first
been placed in a standard database.


A progression of processing steps and data makes the creation of the dashboard
possible. That progression looks like the following:


Nonrepetitive data → mapping → textual ETL → standard database → analytic tool → dashboard
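
The progression above can be sketched in a few lines of Python. This is only an illustrative sketch, not the textual ETL product the chapter describes: the keyword-to-subject `MAPPING`, the `call_subject` table, the function names, and the three sample call notes are all hypothetical, and an in-memory SQLite database stands in for the standard database.

```python
import sqlite3

# Hypothetical mapping: keyword -> subject (assumption, not from the text).
MAPPING = {
    "refund": "billing",
    "invoice": "billing",
    "password": "account access",
    "login": "account access",
    "late": "delivery",
}

def textual_etl(call_id, text, mapping):
    """Turn one free-text call record into (call_id, word, subject) rows."""
    rows = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?")
        if word in mapping:
            rows.append((call_id, word, mapping[word]))
    return rows

def load(conn, rows):
    """Place the extracted rows in a standard relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS call_subject "
        "(call_id INTEGER, word TEXT, subject TEXT)"
    )
    conn.executemany("INSERT INTO call_subject VALUES (?, ?, ?)", rows)
    conn.commit()

def subject_histogram(conn):
    """Analytic step: count the distinct calls that mention each subject."""
    cur = conn.execute(
        "SELECT subject, COUNT(DISTINCT call_id) FROM call_subject "
        "GROUP BY subject ORDER BY subject"
    )
    return dict(cur.fetchall())

# Invented call notes, used only to exercise the sketch.
calls = {
    1: "Customer wants a refund for a duplicate invoice.",
    2: "Cannot login after password reset.",
    3: "Package arrived late, asked for refund.",
}

conn = sqlite3.connect(":memory:")
for call_id, text in calls.items():
    load(conn, textual_etl(call_id, text, MAPPING))

histogram = subject_histogram(conn)
print(histogram)
```

Once the rows are in a table, the histogram is an ordinary `GROUP BY` query, which is the point of the progression: the analytic tool only ever sees the standard database, never the raw text.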

Medical Records


Call center records are important and are at the center of business value. But call center
records are hardly the only valuable form of nonrepetitive data. Another is medical
records. Medical records are usually written as a patient goes through a procedure or
some other episode of medical care. Once written, the records are valuable to many
people and organizations: to the physician, to the patient, to the hospital or provider,
to research organizations, and more.


The challenge with medical records is that they contain narrative information. Narrative
information is necessary and useful to the physician, but a computer cannot analyze it
directly. In order to be used in analytic processing, the narrative information must be
placed in a database in a standard database format.


This is a classic case of nonrepetitive data being placed in the form of a database. What
is needed is textual ETL.
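
To make the idea of placing narrative in database form concrete, here is a minimal sketch that turns a narrative note into rows. It assumes a simple taxonomy lookup. The `TAXONOMY` entries, the `extract_rows` function, the row layout (document id, character offset, term, category), and the sample note are all hypothetical illustrations. The note is invented and is not the medical record referred to in the text.

```python
import re

# Hypothetical taxonomy: term -> category (assumption, not from the text).
TAXONOMY = {
    "hypertension": "diagnosis",
    "aspirin": "medication",
    "chest pain": "symptom",
}

def extract_rows(doc_id, narrative, taxonomy):
    """Return (doc_id, offset, term, category) rows in order of appearance."""
    rows = []
    lowered = narrative.lower()
    for term, category in taxonomy.items():
        # Whole-word match so that a term inside a longer word is skipped.
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            rows.append((doc_id, match.start(), term, category))
    return sorted(rows, key=lambda row: row[1])

# Invented narrative, used only to exercise the sketch.
note = "Patient reports chest pain. History of hypertension. Started aspirin."
rows = extract_rows("note-001", note, TAXONOMY)

for row in rows:
    print(row)
```

Each row has a fixed set of columns, so it can be loaded into a relational table and queried, while the offset keeps a pointer back to the place in the narrative where the term was found.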


In order to see how textual ETL is used, consider a medical record. (NOTE: the medical
record being shown is a real record. However, it is from a country other than the United
States and is not subject to the regulations of HIPAA.)


When looking at medical records, the records start to take on a recognizable pattern. The
first part of the medical record is the identification part. In this part of the record, one or


Chapter 10.3: Analytics From Nonrepetitive Data