The Wiley Finance Series : Handbook of News Analytics in Finance


considered.^2 Moreover, given the schedule of CRSP updates, the latest date of any data
point in our universe (related to events or performance) is December 31, 2008.
The atomic measure of sentiment to be used is the DJNA MCQ ranking. If a news
story mentions a company in a negative light, then the company receives a negative
MCQ ranking; if a news story mentions a company in a positive light, then the company
receives a positive MCQ ranking. We make this notion slightly more precise below.
Let $\mathcal{N}$ denote the universe of all news stories from the DJNA archive. Fix a company
$C$ that is mentioned within some news story from $\mathcal{N}$ and has an MCQ ranking.


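Definition 1. Call the partial binary function $S_C : \mathcal{N} \to \{-1, 1\}$ the sentiment
indicator relative to $C$, where

\[
S_C(N) =
\begin{cases}
-1 & \text{if } C \text{ receives a negative MCQ ranking in } N, \\
\phantom{-}1 & \text{if } C \text{ receives a positive MCQ ranking in } N.
\end{cases}
\]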
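
As an illustration of Definition 1, the indicator can be sketched as a small Python function. This is a minimal sketch, not the DJNA data format: the function name `sentiment_indicator` and the string labels "negative" and "positive" are assumptions made for the example. Returning `None` for anything else mirrors the fact that the indicator is only a partial function.

```python
from typing import Optional


def sentiment_indicator(mcq_ranking: Optional[str]) -> Optional[int]:
    """Map a (hypothetically encoded) MCQ ranking for company C to -1 or +1.

    Returns None when the story carries no signed ranking for C, since the
    sentiment indicator is undefined for such stories.
    """
    if mcq_ranking == "negative":
        return -1
    if mcq_ranking == "positive":
        return 1
    return None


# Three stories: negative, positive, and one without a ranking for C.
signals = [sentiment_indicator(r) for r in ("negative", "positive", None)]
```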



For the sake of brevity, when we say that a news story $N \in \mathcal{N}$ is about company $C$,
it is assumed that $N$ provides an MCQ ranking for $C$. Let $\langle N_1, \ldots, N_m \rangle$ be a sequence
of news stories about $C$ from $\mathcal{N}$, chronologically ordered. This sequence induces the
binary sequence $\langle S_C(N_1), \ldots, S_C(N_m) \rangle$. The problem with defining reversal events from
such sequences is that there is a considerable amount of noise within the flow of news
about a company. If a company is experiencing a period of negative press, it does not
follow that every news story about the company will be negative. An accurate sentiment
metric must smooth out the raw data in some way. To that end, we make the following
definition.


Definition 2. Let $N \in \mathcal{N}$ be a news story about $C$. Let $P$ denote a certain number of
months. Assume that within the time period of $P$ months before the publication date of
$N$ there are $m$-many stories about $C$ (including $N$), for some $m \in \mathbb{N}$. Denote these
stories $N_1, \ldots, N_m = N$. The trailing net news sentiment for $C$ at $N$ over $P$ is the quantity

\[
(C, N, P) \mapsto \sum_{i=1}^{m} S_C(N_i).
\]

The net news sentiment of a company measures a kind of preponderance of sentiment.
For example, if there are 30 news stories about $C$ in a given month, 29 of which center
around poor earnings, accounting scandals, and suchlike, but one story mentions the
environmentally friendly policies of $C$, there will still be a significantly negative net news
sentiment measurement for $C$ of $-28$, thus reflecting the overwhelming amount of news
pessimism about $C$.
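
The trailing net news sentiment and the 30-story example can be reproduced with a short Python sketch. The names `Story` and `trailing_net_sentiment` are hypothetical, and the window is given in days rather than in $P$ months, a simplifying assumption made only to keep the example short. Each story is assumed to already carry its indicator value of -1 or +1.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable


@dataclass(frozen=True)
class Story:
    published: date   # publication date of the story
    sentiment: int    # indicator value for C: -1 (negative) or +1 (positive)


def trailing_net_sentiment(stories: Iterable[Story], as_of: date, window_days: int) -> int:
    """Sum the indicator over stories published in the window ending at as_of.

    The window is (as_of - window_days, as_of], so a story published on
    as_of itself is included, matching "including N" in the definition.
    """
    start = as_of - timedelta(days=window_days)
    return sum(s.sentiment for s in stories if start < s.published <= as_of)


# One story per day in June 2008: 29 negative, one positive (on the 15th).
stories = [Story(date(2008, 6, day), +1 if day == 15 else -1) for day in range(1, 31)]
net = trailing_net_sentiment(stories, as_of=date(2008, 6, 30), window_days=30)
print(net)  # -28
```

The single positive story offsets only one of the 29 negative ones, which is why the sum stays strongly negative.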
DJNA provides many data attributes within each news story besides MCQ rankings.
In an attempt to focus on the most meaningful data, we restrict our attention to only one
additional attribute, namely the relevance score. If $N \in \mathcal{N}$ is a news story about $C$, a
relevance score is assigned to $C$ on a scale of 0–100 depending on the significance of the
role $C$ plays in $N$. We require in the central definition below that each news story is as
relevant as possible and receives a full relevance score of 100. Preliminary work has
shown that conditioning on high relevance is a better window into future performance.
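
The relevance requirement amounts to a filter applied before any sentiment is summed. The following Python sketch shows one way to express it. The record type `ScoredStory`, its field names, and the function `keep_fully_relevant` are assumptions made for illustration, not part of the DJNA schema.

```python
from typing import List, NamedTuple


class ScoredStory(NamedTuple):
    sentiment: int   # indicator value for C: -1 or +1
    relevance: int   # relevance score assigned to C, on the 0-100 scale


def keep_fully_relevant(stories: List[ScoredStory], required: int = 100) -> List[ScoredStory]:
    """Keep only stories in which C receives the full relevance score."""
    return [s for s in stories if s.relevance == required]


stories = [
    ScoredStory(-1, 100),
    ScoredStory(+1, 40),    # C plays only a minor role here, so it is dropped
    ScoredStory(-1, 100),
    ScoredStory(+1, 100),
]
kept = keep_fully_relevant(stories)
net_all = sum(s.sentiment for s in stories)   # 0 without the filter
net_kept = sum(s.sentiment for s in kept)     # -1 with the filter
```

In this toy input the low-relevance positive story masks a net negative signal, which is the kind of noise the relevance condition is meant to remove.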


234 News and abnormal returns


(^2) The current format of DJNA data precludes the inclusion of ADRs, at least without considerable additional work.
Each company is tagged with its country ticker and a country code. It would be necessary to obtain a non-survivor-biased list
of ADRs that matched country tickers with tickers used on US exchanges. A similar approach might be taken using ISIN data,
which is also tagged in each DJNA story. These considerations, however, are beyond our present scope.
