Chapter 10.3
Analytics From Nonrepetitive Data
Abstract
Nonrepetitive analytics begins with the contextualization of the nonrepetitive data.
Unlike repetitive data, the context of nonrepetitive data is difficult to determine. The
context of nonrepetitive big data is determined by textual disambiguation. Textual
disambiguation applies algorithms for stop word resolution, stemming,
homographic resolution, in-line contextualization, taxonomy/ontology resolution, custom
variable resolution, acronym resolution, and so forth. Nonrepetitive analytics bears directly
on business value. Typical forms of nonrepetitive analytics include the
analysis of medical records, warranty analysis, insurance claim analysis, and call center
analysis.
Keywords
Nonrepetitive data; Textual disambiguation; Stemming; Stop word processing;
Homographic resolution; Taxonomic resolution; Custom variable resolution; Acronym
resolution; In-line contextualization
A wealth of information lies hidden in nonrepetitive data, and it cannot be analyzed by
traditional means. Only after the nonrepetitive data have been unlocked by textual
disambiguation can analysis be done.
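As a rough illustration of what "unlocking" text can involve, the sketch below chains three of the steps named in the abstract: stop word resolution, stemming, and acronym resolution. It is a toy example under stated assumptions, not the method of any particular textual disambiguation product. The lookup tables, the suffix list, and the function names `stem` and `disambiguate` are all hypothetical, and the suffix stripping is far cruder than a real stemmer.

```python
import re

# Hypothetical lookup tables; a real system would load far larger,
# domain-specific ones (for example, a medical acronym dictionary).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "was", "is", "and", "with"}
ACRONYMS = {"er": "emergency room", "mi": "myocardial infarction"}
SUFFIXES = ("ing", "ed", "s")  # crude suffix stripping, not a real stemmer


def stem(word: str) -> str:
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def disambiguate(text: str) -> list[str]:
    """Tokenize, expand acronyms, drop stop words, and stem what remains."""
    tokens = re.findall(r"[a-z]+", text.lower())
    result = []
    for token in tokens:
        if token in ACRONYMS:
            # Acronym resolution: the expansion is kept as-is, unstemmed.
            result.extend(ACRONYMS[token].split())
            continue
        if token in STOP_WORDS:
            # Stop word resolution: discard words that carry little meaning.
            continue
        result.append(stem(token))
    return result


print(disambiguate("The patient reported chest pains in the ER"))
# prints ['patient', 'report', 'chest', 'pain', 'emergency', 'room']
```

Even this toy version shows why order matters: acronyms are expanded before stop word removal and stemming so that a short token such as "ER" is not mistaken for noise or mangled by suffix stripping.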
Many environments hold a wealth of information in nonrepetitive data, such as the
following:
E-mail
Call center
Corporate contracts
Warranty claims
Insurance claims
Medical records
But talking about the value of analysis of nonrepetitive data and actually showing the
value are two different things. The world is not convinced until it sees concrete examples.