I went along to the Forum on News Analytics over in Canary Wharf on Monday evening, organised by Professor Gautam Mitra from OptiRisk / Carisma at Brunel University. We seem to be in the early days of transforming news articles into quantifiable/machine-readable data so that they can be processed automatically/systematically in trading and risk management. It was a good event with both vendors and practitioners attending, so it was reasonably balanced between vendor hype and the current state of market practice.
As background on what is meant by news analytics data: you might, for example, count the number of news articles about a particular company and look at whether the quantity of news articles is a predictor of some change in the company's stock price or volatility. Moving on from this simple approach (assuming that you are clever enough to be certain about which news is about which company), you can then move towards assessing whether the news is negative, neutral or positive in sentiment about a company/stock.
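To make the counting idea concrete, here is a minimal sketch in Python of turning tagged articles into per-company, per-day features (article count and average sentiment). It is purely illustrative: the `Article` record, the tickers and the -1/0/+1 sentiment scores are my own assumptions, not any vendor's data format, and it assumes the hard parts (working out which company a story is about, and scoring its sentiment) have already been done.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Article:
    """Hypothetical record for one already-tagged news story."""
    day: date
    ticker: str     # assumes the story has already been mapped to a company
    sentiment: int  # assumed scale: -1 negative, 0 neutral, +1 positive


def daily_news_features(articles):
    """Return {(day, ticker): (article_count, mean_sentiment)}."""
    buckets = defaultdict(list)
    for a in articles:
        buckets[(a.day, a.ticker)].append(a.sentiment)
    return {key: (len(s), sum(s) / len(s)) for key, s in buckets.items()}


# Made-up sample data, for illustration only.
sample = [
    Article(date(2010, 1, 4), "AAA", 1),
    Article(date(2010, 1, 4), "AAA", -1),
    Article(date(2010, 1, 4), "BBB", 1),
    Article(date(2010, 1, 5), "AAA", -1),
]
features = daily_news_features(sample)
```

Features like these could then be lined up against the next day's return or volatility to test whether news volume or tone has any predictive value.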
The context here is about having the capability to automatically process/analyse any kind of text-based news story, not just those from research analysts that might be nicely tagged with such quantifiers of sentiment (see http://www.rixml.org/ on XML standards for analyst data). The meaning of the text is "quantified" using some form of Natural Language Processing.
The event started with a brief talk by Dan diBartolomeo of Northfield Information Services. I hadn't heard of him or his company before (maybe I should pay more attention!) but he seemed a very solid speaker with a strong academic and practical background in investment management and modelling. He referenced a few academic papers (available via their web site) on news analytics, and on how news analytics combined with implied volatility could provide better estimates of future volatility than implied volatility alone. He also made some good points about how investment "models" are calibrated to history and how such models need to adapt to "today" – he put it as "how are things different now from the past?" and put forward the idea of a framework for assessing and potentially modifying a model to respond to the "now" situation. He also suggested that the market can react very differently to "expected news" (having a range of investment "what ifs" planned for a known earnings announcement) as opposed to unexpected information (we are back into the realms of the Black Swan and the ultimate in uncertainty wisdom from Donald Rumsfeld).
Armando Gonzalez of RavenPack then began by explaining how RavenPack had become involved in applying text analysis to finance (it seems the subject has its origins, like a lot of things, in the military). RavenPack seem to be the highest-profile quantified news vendor at the moment, and whilst Armando is obviously biased towards pushing the concept that money can be made by adding quantified news data to trading models, he said that not many firms are as yet systematically processing news and most people are relying upon manual interpretation of the news they buy/use. Some of the studies RavenPack have on market news and prices are very interesting, showing how it can take up to 20 minutes after a news event before the market settles on a new "fair" price level for a stock. Additionally, and maybe an interesting reflection on human behaviour, in bull markets there are usually twice as many positive stories about companies as negative ones, but strikingly in a bear market there were still almost equal amounts of positive and negative news – so humans are basically optimists! (or delusional, or just plain greedy…take your pick!)
Mark Vreijling of Semlab followed Armando and suggested that a lot of their sales prospects understandably desire "proof" of the benefits of adding quantified news to trading, but that this was a little ironic since most financial institutions have been paying to receive "raw" news for years, presumably because they perceive benefit from it. Mark also mentioned that the application of quantified news to risk management was a new but growing area for him and his colleagues.
Gurvinder Brar of Macquarie then went into some of the practicalities of quantifying and using news in automated trading. He suggested that you need to understand what is really "news" (containing information on something that has just happened) and what is merely a news "article" (like a "feature" in a magazine etc). Assessing the relevance of news was also difficult, and he added that setting a hierarchy of what kinds of events are important to your trading was a key step in dealing with news data. Fundamentally, he asked why you would wait five days for analysts to publish their assessment of a market or company-specific event when you could react to the event in near real-time.
The event then went into "panel" mode where the following points came out:
- Dan thought that a real challenge was integrating quantified news with all of the other relevant datasets (market data, but also reference data etc)
- Armando picked up on Dan's point by giving the example of news about Gillette, which at one point was about Gillette the company but then on acquisition became news about the Gillette "brand", which became a part of Procter & Gamble.
- Dan said that a key problem with processing news was also understanding what news was simply ignored by the news wires i.e. we know what is being talked about, but what could have been talked about, why was it ignored and is it (even so) relevant to trading?
- Mark and Armando said that the "context" for the news story was vital and that market expectations can turn many "negative" news stories into positive outcomes for trading e.g. the market likes bad news when it is not as "bad" as everyone thought.
- Dan made a very interesting point about trading in terms of categorising trades as "want to" trades and "have to" trades. He gave the example of a trade being observed that seemingly has no news associated with/prompting it – so does this mean the trade is occurring because somebody "has to" make the trade (a fund facing an unwelcome client redemption for example?) or because there has been some information leak to a market participant and such a participant "wants to" make a trade before the news becomes available to the market as a whole?
- I think all of the panel members then collectively hesitated before answering the next question from the audience, with Microsoft having one of their "text search" R&D team (think Bing…) asking about news categorisation and quantification.
- Dan also mentioned something that I have only recently become more aware of, which is that apart from major markets in the US, most exchanges world-wide do not publish whether a trade was a "buy" or "sell" trade (they just publish the price and transaction size). Obviously knowing the direction of the trade would be useful to any trading model, and Dan referred to this as wanting to know the "signed volume".
- A member of the audience then asked whether most quantified news had been based on just the English language, and the consensus was that most was based on English, but Natural Language Processing can be trained in other languages relatively easily. A few members of the panel pointed out that all languages change, even English, requiring constant retraining, and also that certain languages, countries and cultures added further complication to the recognition process.
- The next question asked was whether the panel could outline the major areas that quantified news is applied in – the answer included intraday (but not quite real-time) trading, algorithmic execution, lower frequency portfolio rebalancing and compliance/risk/market abuse detection.
- A good debate ensued about whether "news" was provided by the official newswires or by the web itself. The panel (and audience) consensus seemed to favour the premise that the news wires are the source of news and the web is a reflection/regurgitation of this news. That said, Gurvinder of Macquarie gave the nice counter-example of the analysts/news wires not making much of the new Apple iPod, whereas looking at the web it was possible to see that the public were in contrast very enthusiastic about it.
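As an aside on Dan's "signed volume" point from the panel: where an exchange only publishes price and size, one common textbook heuristic for guessing trade direction is the "tick rule", where a trade at a higher price than the previous trade is treated as a buy, a trade at a lower price as a sell, and a trade at an unchanged price inherits the previous direction. To be clear, this is my own illustration and not something Dan presented; the function name and the sample trades are made up, and the rule is only a rough approximation of true trade direction.

```python
def signed_volume(trades):
    """Estimate signed volume with the tick rule.

    trades: iterable of (price, size) tuples in time order.
    Returns a list of signed sizes: positive for inferred buys,
    negative for inferred sells, 0 while the direction is unknown.
    """
    signed = []
    prev_price = None
    sign = 0  # direction is unknown until the first price change
    for price, size in trades:
        if prev_price is not None:
            if price > prev_price:
                sign = 1   # uptick: treat as buyer-initiated
            elif price < prev_price:
                sign = -1  # downtick: treat as seller-initiated
            # unchanged price: carry the previous sign forward
        signed.append(sign * size)
        prev_price = price
    return signed


# Made-up trade prints, for illustration only.
trades = [(10.00, 100), (10.01, 200), (10.01, 50), (10.00, 300)]
result = signed_volume(trades)
```

Summing the signed sizes over a time window gives a crude measure of net buying or selling pressure that could sit alongside quantified news in a model.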
Overall an interesting event. I think the application of "quantified news" to risk management is interesting - maths and financial theory are very interesting, but markets are driven by people's behaviour, and if "quantified news" can help us understand this better it has to help in avoiding (some of!) the future problems to be faced in the market.