1.3 TURNING QUALITATIVE TEXT INTO QUANTIFIED
METRICS AND TIME-SERIES
A salient aspect of news analysis is to discover the informational content of news.
Converting qualitative text into a machine-readable form is a challenging task. We
may wish to distinguish whether a story’s informational content is positive or negative;
that is, determine its sentiment. We may go further and try to identify “by how much”
the story is positive or negative. In doing this we may try to assign a quantified sentiment
score or index to each story. A major difficulty in this process is identifying the context in
which a story’s language is to be judged. Sentiment may be defined in terms of how
positively or negatively a human (or group of humans) interprets a story; that is, the
emotive content of the story for that human. In particular, standards can be defined
using experts to classify stories. Some of RavenPack’s classifiers are calibrated using
language training sets developed by finance experts. Further, dictionary-based
algorithms that use psychology-based interpretations of words may be used. Since different
groups of people are affected by events differently and have different interpretations of
the same events, conflicts may arise. Moniz, Brar, and Davis (2009) give the example of
the term “dividend cuts”. This may be classified as a negative term by a dictionary-based
algorithm. In contrast, it may be interpreted positively by market analysts who may
believe this indicates the company is saving money and is better positioned to repay its
debts. Loughran and McDonald (forthcoming) also consider how context affects the
interpretation of the tone of text. They note that a psychological dictionary such as the
Harvard-IV-4 may classify words as negative when they do not have a negative financial meaning.
They develop an alternative negative word list that better reflects the tone of financial
text.
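The effect of the choice of word list on a quantified sentiment score can be
illustrated with a minimal sketch. The word lists below are toy lists invented for
illustration; they are not the Harvard-IV-4 or the Loughran and McDonald lists, and
the scoring rule (positive count minus negative count, divided by the number of
tokens) is one simple assumption among many possible choices, not the method of any
particular vendor.

```python
import re

# Toy word lists, invented for illustration only. They are NOT the
# Harvard-IV-4 or Loughran and McDonald lists.
GENERAL_NEGATIVE = {"loss", "cut", "cuts", "liability", "cost", "tax"}
FINANCE_NEGATIVE = {"loss", "default", "bankruptcy", "restated"}
POSITIVE = {"gain", "growth", "profit", "improved"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_score(text, negative_words, positive_words=POSITIVE):
    """Return (positive count - negative count) / token count.

    The score lies in [-1, 1]; an empty story scores 0.0.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    pos = sum(t in positive_words for t in tokens)
    neg = sum(t in negative_words for t in tokens)
    return (pos - neg) / len(tokens)


story = "The company reported a tax liability and announced dividend cuts"

# Same story, two contexts: a general-purpose list and a finance-specific list.
general_score = sentiment_score(story, GENERAL_NEGATIVE)
finance_score = sentiment_score(story, FINANCE_NEGATIVE)

print(f"general-purpose list: {general_score:+.2f}")
print(f"finance-specific list: {finance_score:+.2f}")
```

Under these assumed lists the same story receives a negative score from the
general-purpose list and a neutral score from the finance-specific list, which is the
kind of conflict described above.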
10 The Handbook of News Analytics in Finance
Figure 1.5. Seasonality: intraweek pattern.