To address quantcentration, RavenPack, a pioneer in the field of news analytics,
extracts relevant, actionable content from high-volume, real-time newsfeeds and
comprehensive news archives. News analytics are made possible by the practical application
of leading-edge technologies in the field of computational linguistics. In several cases,
RavenPack has pushed the boundaries of this field to help meet the demanding
requirements of its primary market, the financial industry.
To recognize when any of over 27,000 global companies are mentioned in the news,
RavenPack employs a technology known as named entity recognition. There are many
applications of named entity recognition, but the financial industry imposes a
requirement that most do not: the names of the entities can change over time, and the same
name might refer to different entities at different times. RavenPack has addressed this by
building a point-in-time-aware named entity recognition system. Point-in-time
awareness of this kind lets users address problems such as survivorship bias when using
RavenPack News Analytics data.
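The idea of point-in-time name resolution can be illustrated with a minimal sketch. It is not RavenPack's implementation: the record layout, the `resolve` function, the entity identifiers, and the sample company names are all invented for illustration. Each name carries a validity interval, and a lookup only succeeds for the entity that held the name on the story's date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class NameRecord:
    """One name-to-entity mapping, valid over the half-open interval [start, end)."""
    name: str                # stored in lower case
    entity_id: str           # hypothetical identifier, not a real RavenPack ID
    start: date
    end: Optional[date]      # None means the mapping is still valid


# Invented sample data: the same name refers to two entities at different times,
# and one entity changes its name.
RECORDS = [
    NameRecord("acme corp", "ENT-001", date(1990, 1, 1), date(2005, 6, 30)),
    NameRecord("acme holdings", "ENT-001", date(2005, 6, 30), None),
    NameRecord("acme corp", "ENT-002", date(2008, 3, 1), None),
]


def resolve(name: str, as_of: date, records=RECORDS) -> Optional[str]:
    """Return the entity that `name` referred to on `as_of`, or None."""
    key = name.strip().lower()
    for rec in records:
        if rec.name != key:
            continue
        if rec.start <= as_of and (rec.end is None or as_of < rec.end):
            return rec.entity_id
    return None


print(resolve("Acme Corp", date(2000, 1, 1)))   # ENT-001
print(resolve("Acme Corp", date(2010, 1, 1)))   # ENT-002
print(resolve("Acme Corp", date(2006, 1, 1)))   # None
```

Because lookups are keyed on the story date rather than on today's name list, a back-test over old news still sees companies that were later renamed, acquired, or delisted, which is how a design like this helps with survivorship bias.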
Identifying entities, such as companies, is just the first step. The real value starts to
appear once you know what role each entity plays in a story. RavenPack has built a
system that looks at financial news stories related to companies and can classify each story
into any of hundreds of categories. For each category, the company roles are extracted.
RavenPack News Analytics can therefore tell you, for example, which entity is the analyst
firm and which company is being upgraded. If you are trading on news, knowing this just
might come in handy.
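To make the notion of roles concrete, here is a small sketch of role extraction for a single category. It is an assumption-laden illustration, not the production system: the regular expression, the category label "analyst-upgrade", the field names, and the sample headlines are all hypothetical, and a real system would use entity recognition rather than capitalization to find company names.

```python
import re
from typing import Optional

# A run of capitalized words, used here as a crude stand-in for a recognized entity.
NAME = r"[A-Z][\w&.]*(?: [A-Z][\w&.]*)*"

# Hypothetical pattern for one category; each named group is a role in the story.
UPGRADE = re.compile(
    rf"(?P<analyst>{NAME}) (?:upgrades|raises) (?P<rated>{NAME})"
    r"(?: to (?P<rating>\w+))?"
)


def extract_roles(headline: str) -> Optional[dict]:
    """Return the category and entity roles for an upgrade headline, or None."""
    m = UPGRADE.search(headline)
    if m is None:
        return None
    return {
        "category": "analyst-upgrade",
        "analyst_firm": m.group("analyst"),
        "rated_company": m.group("rated"),
        "new_rating": m.group("rating"),   # None when the headline gives no rating
    }


print(extract_roles("Example Securities upgrades Acme Corp to buy"))
print(extract_roles("Acme Corp reports quarterly earnings"))   # None
```

The point of the sketch is the shape of the output: the same two company names appear in the story, but each is attached to a different role, which is the information a trading strategy would need.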
Detecting news events automatically
When dealing with the tens of thousands of stories published about companies every
day, it makes sense to try to classify them into a set of pre-defined categories. RavenPack
has approached this problem by applying technology developed over many years of
experience, and categorizes stories into a simple set of themes that are fundamental to
today’s investment environment.
The technique of producing this kind of analysis came about by performing a careful
study of the types of stories available on companies and by extracting the primary
categories that would allow meaningful interpretation of a story. Once the categories
had been determined, the goal was to implement technology that could perform the
classification automatically. Some categories are more straightforward than others, so
different techniques are applied.
Events are defined using thousands of proprietary templates and part-of-speech
tagging. These specialized templates are compositions of language tokens or values
taken in a specific context. Tokens may be a type of language marker, such as a number
or date. They may be words or phrases, perhaps broken down to their root form or
taken only for a given tense.
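A template of this kind can be sketched as an ordered list of token specifications that is slid across a tokenized sentence. The sketch below is illustrative only: the template syntax (`WORD:`, `LEMMA:`, `NUMBER`, `DATE`), the tiny lemma table, and the `PROFIT_MOVE` template are assumptions made for the example and say nothing about the format of the proprietary templates.

```python
import re

# Dates are tried before plain numbers so that "2009-03-31" stays one token.
TOKEN_RE = re.compile(r"\d{4}-\d{2}-\d{2}|\d+(?:\.\d+)?|[A-Za-z]+|[^\w\s]")

# Toy lemma table standing in for a real morphological analyser (assumption).
LEMMAS = {"rose": "rise", "rises": "rise", "rising": "rise",
          "fell": "fall", "falls": "fall", "falling": "fall"}


def tokenize(text):
    """Split text into date, number, word, and punctuation tokens."""
    return TOKEN_RE.findall(text)


def matches(spec, token):
    """Check one token against one template element."""
    kind, _, value = spec.partition(":")
    if kind == "NUMBER":
        return re.fullmatch(r"\d+(?:\.\d+)?", token) is not None
    if kind == "DATE":
        return re.fullmatch(r"\d{4}-\d{2}-\d{2}", token) is not None
    if kind == "LEMMA":          # compare root forms, so "rose" matches "rise"
        low = token.lower()
        return LEMMAS.get(low, low) == value
    if kind == "WORD":           # literal, case-insensitive match
        return token.lower() == value
    raise ValueError(f"unknown template element: {spec}")


def find_template(template, text):
    """Return the first run of tokens matching the template, or None."""
    tokens = tokenize(text)
    n = len(template)
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(matches(s, t) for s, t in zip(template, window)):
            return window
    return None


# Hypothetical template: the word "profit", any form of "rise", a number, a percent sign.
PROFIT_MOVE = ["WORD:profit", "LEMMA:rise", "NUMBER", "WORD:%"]

print(find_template(PROFIT_MOVE, "Acme profit rose 12.5% in the quarter ended 2009-03-31"))
print(find_template(PROFIT_MOVE, "Acme profit fell 4%"))   # None
```

Matching on root forms and on token types such as numbers keeps one template general enough to cover many surface variants of the same event.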
Part-of-speech tagging involves marking up each word in a text as corresponding to a
particular part of speech, based on both its definition and its context (i.e., its relationship
with adjacent and related words in a phrase, sentence, or paragraph). This makes
templates more scalable, modular, and effective.
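The interplay of definition and context can be shown with a deliberately naive tagger. This is a toy sketch, not a description of the tagger RavenPack uses: the lexicon, the tag names, and the two context rules are invented for the example. The lexicon lists the tags a word's definition allows, and the tag of the preceding word decides between them.

```python
# Toy lexicon: each word lists the tags its definition allows (assumption).
LEXICON = {
    "the": ["DET"], "a": ["DET"],
    "bank": ["NOUN"], "analyst": ["NOUN"], "stock": ["NOUN"],
    "announces": ["VERB"], "two": ["NUM"],
    "upgrades": ["VERB", "NOUN"],      # ambiguous without context
}


def tag(words):
    """Tag each word, using the previous tag to resolve ambiguous words."""
    tagged = []
    prev = None
    for word in words:
        options = LEXICON.get(word.lower(), ["NOUN"])   # unknown words default to NOUN
        if len(options) == 1:
            choice = options[0]
        elif prev in ("DET", "NUM", "ADJ", "VERB") and "NOUN" in options:
            choice = "NOUN"            # e.g. "two upgrades"
        elif prev == "NOUN" and "VERB" in options:
            choice = "VERB"            # e.g. "analyst upgrades"
        else:
            choice = options[0]
        tagged.append((word, choice))
        prev = choice
    return tagged


print(tag("The analyst upgrades the stock".split()))
print(tag("The bank announces two upgrades".split()))
```

In the first sentence "upgrades" is tagged as a verb and in the second as a noun. A template that asks for a verb form of "upgrade" can then ignore the second sentence without needing a separate rule for it.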
As anyone who follows the news knows, stories are often repeated. The impact of a
breaking news event is likely to be more valuable than re-hashes of the same story in the
following hours. So RavenPack News Analytics take event detection another step