Financial Markets Industry
Posts categorized "Data"
A great afternoon event put on by TabbFORUM in New York yesterday with a number of panels and one on one interviews (see agenda). You can see some of what went on at the event via the hashtag #TabbTech or via the @XenomorphNews feed.
"Death of Legacy" Panel Discussion
Posted by Brian Sentance | 16 October 2014 | 9:52 pm
Good day at the A-Team's DMS London event last Wednesday. The day started with Tom Dalglish doing a pretty passable impression of a stand-up comedian in the morning keynote to open the day - not exactly an easy thing to do if 1) you are asked to do it very much at the last minute and 2) this is data management, not the subject that most comedians would immediately reach out for. So due kudos to Tom, and some of the comments he made about technology architects and technology builders were funny and resonated with the audience, such as this quote coming from a technologist: "How can I give you the requirements, I haven't finished the code yet?" (I think we have all been there on that one a few times in our careers...).
You can find some of the main points from the various panels via @XenomorphNews or more generally by #dmslondon (you could also find out a bit via my twitter account @TheLongSentance so long as you don't mind the odd photograph and a few bits of personal baggage now and again).
BCBS239 Panel - I took part in the panel on BCBS239 on risk data aggregation and reporting, something which I have written about before, and obviously a prime example of how regulation is influencing (dictating?) financial markets institutions to take data management seriously. Dennis Slattery of EDMWorks moderated the panel, and on the panel with me was Sally Hinds of DCMS, and Mikael Soboen, head of risk systems at BNP Paribas.
BCBS239 Panel at DMS London
Dennis started by outlining the four pillars of BCBS239:
- Pillar 1 “Overarching Governance and Infrastructure.”
- Pillar 2 “Risk aggregation” capabilities.
- Pillar 3 “Risk reporting” capabilities.
- Pillar 4 “Supervisory review, tools and cooperation.”
Regulatory Chicken - Dennis started by asking the panel whether BCBS was another game of regulatory "chicken" where the approach of "principles" means 1) the banks do the minimum and wait for the regulators to inspect and tell them what they specifically have to do 2) the regulators don't really want to be more specific beyond principles because they themselves are unsure of what is needed and want to learn from what different banks have done. General consensus from the panel debate was that firms were not doing as much as they could, but that banks needed to show at least that they had a program in place and running by the January 2016 deadline or face big issues with the regulators (so the game of regulatory chicken is "on" seems to be the conclusion). Mikael Soboen added that he was unsure whether his regulator would have the time to conduct the BCBS239 inspections given the workload that the regulators currently faced.
The End of Spreadsheets? - Dennis asked whether BCBS239 and the requirements for having a clear data lineage sounded the bell for the end of spreadsheet usage at banks. I said not - I personally feel that a lot of folks in technology underestimate how difficult using software is for many business users, and tools that make manipulating data easy, like spreadsheets, will have a role for the foreseeable future. I suggested that spreadsheets are a great ad hoc reporting and analysis tool, and things mainly go wrong when they are used as a personal, "siloed" desktop database.
BCBS239 does not itself preclude the usage of spreadsheets and end user computing, but rather like a lot of regulation says that their usage must be taken seriously - in my view there is a tendency for some in IT to regard spreadsheets as someone else's problem, which is understandable but problematic for any CDO. Also there are approaches to spreadsheet usage that can help maintain data lineage, such as what Microsoft offers with web provision of spreadsheet dashboards using PowerView and PowerBI (used in our TimeScape MarketPlace offering), folks such as Cluster7 with their "closed circuit TV" for spreadsheet monitoring, and indeed Xenomorph with our SpreadSheet Inside approach of including centralised spreadsheet-like calculations as a supported data type within the audited data management process.
Data Dictionary - Mikael said that one responsibility he had was to represent the investment bank within the wider data dictionary initiatives due to BCBS239 at the retail bank, and said that this was challenging given the different terminology sometimes used.
Is BCBS239 a Project or Data Governance? - The panel thought that the best approach was to use BCBS239 as a framework for compliance with current regulation and regulation to come, but that this needs to obviously be subject to having the budget to do so. There were some general comments on how the data management needs of the front office and risk were converging. Standards such as FIBO were also discussed, with feedback being that they are desirable but that it is early days where their immaturity means they are often used for specific areas such as modeling counterparty data.
Overall a good panel (I hope!) with a good amount of audience questions and participation. Again you can find some of the main points from the various panels via @XenomorphNews or more generally by #dmslondon (you could also find out a bit via my twitter account @TheLongSentance so long as you don't mind the odd photograph and a few bits of personal baggage now and again).
A bit of fun - Brian looking up to Ron Wilbraham at DMS London
Posted by Brian Sentance | 13 October 2014 | 5:50 pm
A-Team’s DMS Data Management Awards close on the 26th of September so if you haven't already, please vote for Xenomorph!
Xenomorph on the Cloud - First of a few lookbacks at what we have been doing over the past year - firstly with a short animation about one of our major initiatives this year, cloud provision of data management and a new venture into cloud-based data publishing with the TimeScape MarketPlace.
So it would be fantastic if you could support Xenomorph by voting here.
Posted by Brian Sentance | 11 September 2014 | 7:21 pm
One day to go until our TimeScape MarketPlace breakfast briefing "Financial Markets Data and Analytics. Everywhere You Need Them" at Merchant Taylor's Hall tomorrow, Wednesday June 25th. With over ninety people registered so far it should be a great event, but if you can make it please register and come along, it would be great to see you there.
Posted by Brian Sentance | 24 June 2014 | 11:25 am
Less than one week to go until our TimeScape MarketPlace breakfast briefing "Financial Markets Data and Analytics. Everywhere You Need Them" at Merchant Taylor's Hall on Wednesday June 25th.
Come and join Xenomorph, Aite Group and Microsoft for breakfast and hear Virginie O'Shea of the analyst firm Aite Group offering some great insights from financial institutions into their adoption of cloud technology, applying it to address risk management, data management and regulatory reporting challenges.
Microsoft will be showing how their new Power BI can radically change and accelerate the integration of data for business and IT staff alike, regardless of what kind of data it is, what format it is stored in or where it is located.
And Xenomorph will be demonstrating the TimeScape MarketPlace, our new cloud-based data mashup service for publishing and consuming financial markets data and analytics.
In the meantime, please take a look at the event and register if you can come along, it would be great to see you there.
Posted by Brian Sentance | 19 June 2014 | 10:55 am
Very pleased to announce that Mizuho Securities USA has completed a successful implementation of TimeScape, you can see the press release here and more detail is available in this article on Inside Reference Data. Big thank you to all those involved in making this happen, both at Mizuho and on the Xenomorph team.
Posted by Brian Sentance | 18 June 2014 | 11:12 am
Quick thank you to the clients and partners who took some time out of their working day to attend our breakfast briefing, "Financial Markets Data and Analytics. Everywhere You Need Them." at Microsoft's Times Square offices last Friday morning. Not particularly great weather here in Manhattan so it was great to see around 60 folks turn up...
Posted by Brian Sentance | 14 May 2014 | 9:49 pm
Quick reminder that there are just 7 days left to register for Xenomorph's breakfast briefing event at Microsoft's Times Square offices on Friday May 9th, "Financial Markets Data and Analytics. Everywhere You Need Them."
With 90 registrants so far it looks to be a great event with presentations from Sang Lee of Aite Group on the adoption of cloud technology in financial markets, Microsoft showing the self-service (aka easy!) data integration capabilities of Microsoft Power BI for Excel, and introducing the TimeScape MarketPlace, Xenomorph's new cloud-based data mashup service for publishing and consuming financial markets data and analytics.
Hope to see you there and have a great weekend!
Posted by Brian Sentance | 2 May 2014 | 7:34 pm
Very pleased to announce general availability of TimeScape Data Validation Dashboard which we announced this morning. You can find out more here. Big thank you to all the staff and the clients involved, who have helped us to put this together over the past year.
Posted by Brian Sentance | 30 April 2014 | 4:30 pm
The New York Chapter of PRMIA hosted "Regulatory, Compliance, and Risk Data Technology Challenges" at Credit Suisse's offices in New York, last Thursday 10th April. Abraham Thomas introduced the panelists, and Don Wesnofske started off by setting the scene for the evening's event.
Don outlined how in reaction to the 2008 Crisis the regulators now require data retention for up to 10 years or more. Don cited one particular example where data must be reconstructed within 24 to 48 hours for any date up to 7 years back, and said that this kind of "forensic" investigation capability was an important consideration for many financial institutions. He took us through a good presentation slide of his view on data management/risk architecture, and outlined how operational risk is comprised of people, process, technology and events. Don ended his presentation by taking us through Wikipedia's definition of "Big Data", and in particular talked about how data has a life cycle going through:
Don then handed over to Luigi Mercone, who is a Director of Engineering Strategy & Architecture at Credit Suisse. Luigi started by saying that to the business at CS, he is technical support which involves asking "What is on fire today? And what's going to be on fire tomorrow?" Luigi described how some time back CS had a regulatory enquiry around their equities business which required them to reconstruct data from 2 years back.
The project to do this took around 4-5 months of database administrators' time to reconstruct the world as at that point in time (I guess because tape storage was being used, and this needed restoring to disk/database). This was for an equity order management system that had doubled in size every year for the past 17 years, and at that point CS was only retaining data going back 2 years. Luigi said that it was then thought that new regulations requiring the ability to produce forensic evidence at any point in time would potentially swamp CS's resources unless this was addressed head on and strategically.
Luigi described the original architecture that they were using being based on an in-memory database for intraday workloads, then standard Sybase (probably ASE I guess) and then Sybase IQ for longer term archiving, taking advantage of the column-store capabilities of Sybase IQ and the resulting data compression possible. He added that the data storage requirements of the system had grown from 150TB to 1.2PB in 4 years.
Luigi then offered a comparison of this original architecture with what he found by implementing RainStor. In the original architecture the Sybase IQ database compressed data down into 160TB, whereas this was improved by a further factor of around 10 down to 14TB using RainStor. He said that RainStor was self-service providing a standard SQL interface, eliminated the need for tape storage, reduced the system "footprint" by 90% at CS, was 1/5 of the cost and the performance was good. (I guess here I would like to caveat that I know nothing of the original architecture other than the summary Luigi provided, and as such it is hard to judge whether the original architecture was optimal for the data growth experienced, and hence whether this was overall an objective comparison of Sybase IQ's capabilities with RainStor.) Luigi closed by saying that whilst RainStor was a great archive database, its origins were in in-memory databases and he would encourage RainStor to re-enter that market too, given his experience so far.
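For anyone wanting to sanity-check the numbers quoted above, here is a small back-of-the-envelope sketch in Python. It only uses the figures from Luigi's talk as I noted them (150TB growing to 1.2PB in 4 years, and 160TB in Sybase IQ versus 14TB in RainStor); treating 1.2PB as 1200TB and assuming smooth compound growth are my simplifications, not anything CS stated.

```python
# Figures as quoted in the talk, in TB. 1.2PB is taken as 1200TB (assumption).
start_tb, end_tb, years = 150.0, 1200.0, 4

# Overall growth and the implied compound annual growth rate,
# assuming the growth was smooth from year to year (assumption).
growth_factor = end_tb / start_tb
annual_growth = growth_factor ** (1 / years) - 1

# Archive size under the two technologies, as quoted.
sybase_iq_tb, rainstor_tb = 160.0, 14.0
further_compression = sybase_iq_tb / rainstor_tb
footprint_reduction = 1 - rainstor_tb / sybase_iq_tb

print(f"Storage grew {growth_factor:.0f}x, about {annual_growth:.0%} a year")
print(f"RainStor archive is {further_compression:.1f}x smaller, "
      f"a {footprint_reduction:.0%} reduction in footprint")
```

The ratio comes out a little above 10, which fits both the "factor of 10" and the "90% footprint reduction" quoted on the night.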
John Bantleman, CEO of RainStor, took over and described how RainStor had been designed specifically for the needs of data archiving (I guess talking more about what it does now rather than its origins outlined by Luigi above). He said that RainStor offers a 20-40x storage footprint reduction over traditional database technology and operates efficiently even at the PetaByte (PB) scale, based around RainStor proprietary database technology making use of columnar storage and being capable of storing data in both relational-style tabular format and also in more "document" style using XML and JSON formats using Key-Value access. John mentioned that not only could RainStor retrieve data at a point in time, but it could also retrieve the schema being used at that point in time for a more complete view of the state of the world at that point. This echoes a couple of past articles that I have penned, one for IRD and one for Wilmott Magazine on bitemporal regulatory requirements.
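To make the "data plus schema as at a point in time" idea a bit more concrete, here is a minimal sketch in Python. To be clear, this is not RainStor's API or storage design (I have not seen either); the `VersionedStore` class, the `snapshot` function and the sample orders are all hypothetical names and data I made up for illustration. It also only tracks one time axis (when a version became effective), so it is simpler than a full bitemporal model.

```python
from bisect import bisect_right
from datetime import date


class VersionedStore:
    """Append-only history of (effective_date, value) pairs, kept in date order."""

    def __init__(self):
        self._dates = []
        self._values = []

    def append(self, effective, value):
        # Write-once style: versions can only be added at or after the latest date.
        if self._dates and effective < self._dates[-1]:
            raise ValueError("history is append-only")
        self._dates.append(effective)
        self._values.append(value)

    def as_of(self, when):
        # Latest version whose effective date is on or before `when`.
        i = bisect_right(self._dates, when)
        if i == 0:
            raise KeyError(f"no version on or before {when}")
        return self._values[i - 1]


# Hypothetical schema history: a "venue" column is added in mid-2013.
schema = VersionedStore()
schema.append(date(2012, 1, 1), ("order_id", "qty"))
schema.append(date(2013, 6, 1), ("order_id", "qty", "venue"))

# Hypothetical record history for a single order.
orders = VersionedStore()
orders.append(date(2012, 3, 1), {"order_id": 1, "qty": 100})
orders.append(date(2013, 7, 1), {"order_id": 1, "qty": 100, "venue": "XNYS"})


def snapshot(when):
    """Return the schema and the record that applied on `when`."""
    return schema.as_of(when), orders.as_of(when)


print(snapshot(date(2012, 12, 31)))
print(snapshot(date(2014, 1, 1)))
```

Asking for the end of 2012 gives back the two-column schema along with the record as it then stood, which is the "state of the world" view a forensic enquiry would need.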
John said that regulation was driving the need for data archiving capabilities, with 1400 regulations added since 2008 (not sure of source, but believable) and the comment from a Chief Data Officer (CDO) at one financial markets client that if a project wasn't driven by regulatory compliance then the project isn't going to get done (certainly sounds like regulatory overload). John's opening remarks were really around how regulatory cost, complexity and compliance were driving forces behind the growth of RainStor in financial services technology, and whilst regulation is the driver, firms should look at archiving of data as an opportunity too, in order to create value from corporate memory, and to be proactive in addressing future reporting and analysis needs.
John illustrated the regulatory need for data archiving through the Consolidated Audit Trail (CAT) regulation, where data retention over 7 years will generate 100PB of data. He also mentioned SEC Rule 17a-4 for broker dealers as another example of "data retention" regulation, with particular reference to storage of records in non-rewriteable, non-erasable format. John termed this WORM storage, meaning Write Once, Read Many. John seemed to imply that both the software (RainStor) and the hardware it runs on (e.g. EMC or Teradata etc) need to be WORM compliant. One of the audience members asked John about BCBS 239, to which John said that he didn't know that particular regulation (fair enough that John didn't know in my opinion, RainStor's tech is generally about "data" and is applicable across many industries, whereas BCBS 239 is obviously about banks specifically and is more about data aggregation and reporting than data retention/archiving to my understanding, and this seems to be confirmed with a quick doc scan for "archive" or "retention".)
To finish off the main part of the event (before the drinks and food began) there was a panel discussion. Luigi said that it was best to "prepare for all time, not just specifics" with respect to data retention and that there were dangers in rolling up data (effectively aggregating and losing granularity to reduce storage needs). John added that his definition of "Big Data" was "All information, for ever". Luigi added that implementing RainStor had allowed CS to spend more time on interesting questions rather than on database restoration. John proposed that version 1 of Big Data involved the retention of web data, and as such losing a data point here and there didn't matter. Version 2 of Big Data is concerned more with enterprise data where all data has value and needs to be retained i.e. lots of high value data. He added that this was an opportunity for risk and compliance to become an asset.
Abraham (second from left), Don (center) and John (second from right)
Overall it was a good event which I found very interesting (but I have to admit to a certain geeky interest in this kind of tech). The event would have benefitted from say another competitive or complementary technology vendor involved maybe, plus maybe an academic to give a different slant on data retention and on what the regulators hope to gain from this kind of mandated data retention. Not that the regulators have been that good at managing data themselves recently.
Networking afterwards courtesy of Credit Suisse and RainStor
Posted by Brian Sentance | 17 April 2014 | 3:05 pm
Very pleased to announce that Xenomorph will be hosting an event, "Financial Markets Data and Analytics. Everywhere You Need Them.", at Microsoft's Times Square New York offices on May 9th.
This breakfast briefing includes Sang Lee of the analyst firm Aite Group offering some great insights from financial institutions into their adoption of cloud technology, applying it to address risk management, data management and regulatory reporting challenges.
Microsoft will be showing how their new Power BI can radically change and accelerate the integration of data for business and IT staff alike, regardless of what kind of data it is, what format it is stored in or where it is located.
And Xenomorph will be introducing the TimeScape MarketPlace, our new cloud-based data mashup service for publishing and consuming financial markets data and analytics. More background and updates on MarketPlace in coming weeks.
In the meantime, please take a look at the event and register if you can come along, it would be great to see you there.
Posted by Brian Sentance | 15 April 2014 | 3:57 pm
Good article from Tim Harford (he of the enjoyable "Undercover Economist" books) in the FT last week called "Big data: are we making a big mistake". Tim injects some healthy realism into the hype of Big Data without dismissing its importance and potential benefits. The article talks about the four claims often made when talking about Big Data:
1. Data analysis often produces uncannily accurate results
2. Make statistical sampling obsolete by capturing all the data
3. Statistical correlation is all you need - no need to understand causation
4. Enough data means that scientific or statistical models aren't needed
Now models can have their own problems, but I can see where he is coming from, for instance 3. and 4. above seem to be in direct contradiction. I particularly like the comment later in the article that "causality won't be discarded, but it is being knocked off its pedestal as the primary fountain of meaning."
Also I liked the definition by one of the academics mentioned of a big data set being one where "N = All", and the point that having "all" the data is an incorrect assumption behind some of the Big Data analysis put forward. Large data sets can mean that sample error is low, but sample bias is still a potentially big problem - for example everyone on Twitter is probably not representative of the population of the human race in general.
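The sample error versus sample bias point is easy to show with a toy simulation, sketched below in Python. All the numbers are invented for illustration (a made-up population where the 30% who are "on the platform" are younger than everyone else), so it says nothing about real Twitter demographics. It simply compares a very large sample drawn only from the platform users with a small random sample drawn from everybody.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the run is repeatable

# Invented population: platform users skew younger than everyone else.
on_platform = [random.gauss(30, 8) for _ in range(30_000)]
off_platform = [random.gauss(50, 12) for _ in range(70_000)]
population = on_platform + off_platform
true_mean = mean(population)

# "Big data" sample: large, but drawn only from platform users (biased).
big_biased = random.sample(on_platform, 20_000)

# Traditional sample: small, but drawn at random from the whole population.
small_random = random.sample(population, 500)

biased_error = abs(mean(big_biased) - true_mean)
random_error = abs(mean(small_random) - true_mean)

print(f"true mean age:          {true_mean:.1f}")
print(f"20,000 biased sample:   off by {biased_error:.1f} years")
print(f"500 random sample:      off by {random_error:.1f} years")
```

The large sample has very little sampling noise, but it is precisely wrong: adding more platform users does nothing to bring it closer to the population average, whereas the small random sample lands much nearer.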
So I will now press save on this blog post, publish on Twitter and help reinforce the impression that Big Data is a hot topic...which it is, but not for everyone I guess is the point.
Posted by Brian Sentance | 7 April 2014 | 5:09 pm
Melissa Sexton of Morgan Stanley introduced the agenda, saying that the evening would focus on three aspects of liquidity risk management:
- industry practice
LiquidityMetrics by MSCI - Carlo Acerbi of MSCI then took over with his presentation on "LiquidityMetrics". Carlo said that he was pleased to be involved with MSCI (and RiskMetrics, acquired by MSCI) in that it had helped to establish and define standards for risk management that were used across the industry. He said that liquidity risk management was difficult because:
- Clarity of Definition - Carlo suggested that if he asked the audience to define liquidity risk he would receive 70 differing definitions. Put another way, he suggested that liquidity risk was "a strange animal with many faces".
- Data Availability - Carlo said that there were aspects of the market that were unobservable and hence data was scarce/non-existent and as such this was a limit on the validity of the models that could be applied to liquidity risk.
Carlo went on to clarify that liquidity risk was different depending upon the organization type/context being considered, with banks obviously focusing on funding. He said that LiquidityMetrics was focused on asset liquidity risk, and as such was more applicable to the needs of asset managers and hedge funds given recent regulation such as UCITS/AIFMD/FormPF. The methodology is aimed at bringing traditional equity market impact models out from the trading floor across into risk management and across other asset classes.
Liquidity Surfaces - LiquidityMetrics measures the expected price impact for an order of a given size, and as such has dimensions in:
- order size
- liquidity time horizon
- transaction costs
The representation shown by Carlo was of a "liquidity surface" with x dimension of order size (both bid and ask around 0), y dimension of time horizon for liquidation and z (vertical) dimension of transaction cost. The surface shown had a U-shaped cross section around zero order size, at which the transaction cost was half the bid-ask spread (this link illustrates my attempt at verbal visualization). The U-shape cross section indicates "Market Impact", its shape over time "Market Elasticity" and the limits of what is observable "Market Depth".
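As a rough way of picturing the surface described above, here is a toy version in Python. This is emphatically not MSCI's LiquidityMetrics model, which I have not seen in detail: the quadratic impact term, the symmetry between buys and sells, and the parameter names and values (`half_spread`, `impact`, `daily_depth`) are all assumptions of mine chosen to reproduce the shape, namely half the bid-ask spread at zero size, a U-shaped cross section, costs that flatten as the time horizon lengthens, and no values beyond the observable market depth.

```python
import math


def transaction_cost(order_size, horizon_days, *, half_spread=0.0005,
                     impact=0.01, daily_depth=1_000_000):
    """Toy cost (as a fraction of price) of trading `order_size` over `horizon_days`.

    Positive sizes are buys and negative sizes are sells. Returns NaN outside
    the assumed market depth, so the surface stops rather than extrapolating.
    """
    if horizon_days <= 0:
        raise ValueError("horizon_days must be positive")
    # Share of the assumed tradable depth that the order would consume.
    participation = abs(order_size) / (daily_depth * horizon_days)
    if participation > 1:
        return math.nan  # beyond market depth: no surface here
    # Half the spread at zero size, plus an assumed quadratic impact term.
    return half_spread + impact * participation ** 2


# Sample the surface on a small grid: rows are horizons, columns are order sizes.
sizes = [-1_000_000, -500_000, 0, 500_000, 1_000_000]
horizons = [1, 5, 20]
surface = {h: [transaction_cost(s, h) for s in sizes] for h in horizons}

for h, row in surface.items():
    print(h, [f"{cost:.4%}" for cost in row])
```

Each row is one cross section of the surface, and the rows get flatter as the horizon grows, which is the "elasticity" effect of giving the market more time to absorb the order.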
Carlo then moved to consider a portfolio of instruments, and how obligations on an investment fund (a portfolio) can be translated into the estimated transaction costs of meeting these obligations, so as to quantify the hidden costs of redemption in a fund. He mentioned that LiquidityMetrics could be used to quantify the costs of regulations such as UCITS/AIFMD/FormPF. There was some audience questioning about portfolios of foreign assets, such as holding Russian Bonds (maybe currently topical for an audience member?). Carlo said that you would use the liquidity surfaces for both the bond itself and the FX transaction (and in FX, there is much data available). He was however keen to emphasize that LiquidityMetrics was not intended to be used to predict "regime change" (i.e. it is concerned with transaction costs under normal market conditions).
Model Calibration - In terms of model calibration, Carlo said that the established equity market impact models (see this link for some background for instance) have observable market data to work with. In equity markets, traditionally there was a "lit" central trading venue (i.e. an exchange) with a star network of participants fanning out from it. In OTC markets such as bonds, there is no star network but rather many to many linkages established between all market participants, where each participant may have a network of connections of different size. As such there has not been enough data around to calibrate traditional market impact models for OTC markets. As a result, Carlo said that MSCI had implemented some simple models with a relatively small number of parameters.
Two characteristics of standard market impact models are:
- Permanent Effects - this is where the fair price is impacted by a large order and the order book is dragged along to follow this.
- Temporary Effects - this is where the order book is emptied but then liquidity regenerates
Carlo said that the effects were obviously related to the behavioural aspects of market participants. He said that the bright side for bonds (and OTC markets) was that, given the trades are private, there was no public information, and price movements were often constrained by theoretical pricing, therefore permanent effects could be ignored and the fair price is insensitive to trading (again under "normal" market conditions). Carlo then moved on to talk about some of the research his team was doing looking at the shape of the order book and the time needed to regenerate it. He talked of "Perfectly Elastic" markets that digest orders immediately and "Perfectly Plastic" markets that never regenerate, and how "Relaxation Time" measures in days how long the market takes to regenerate the order book.
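One simple way to picture relaxation time is as the time constant of an exponential recovery, sketched below in Python. Carlo did not say what functional form his team uses, so the exponential shape and the `regenerated_fraction` function are purely my assumptions for illustration. The two limiting cases match the terms he used: a relaxation time of zero is "perfectly elastic" and an infinite relaxation time is "perfectly plastic".

```python
import math


def regenerated_fraction(days_since_trade, relaxation_days):
    """Fraction of an emptied order book that has refilled after a trade.

    Assumes an exponential recovery with time constant `relaxation_days`;
    the real dynamics may well look different.
    """
    if days_since_trade < 0 or relaxation_days < 0:
        raise ValueError("times must be non-negative")
    if relaxation_days == 0:
        return 1.0  # perfectly elastic: the order is digested immediately
    if math.isinf(relaxation_days):
        return 0.0  # perfectly plastic: liquidity never regenerates
    return 1.0 - math.exp(-days_since_trade / relaxation_days)


# Compare a fast-healing and a slow-healing market over the same week.
for tau in (0, 1, 5, math.inf):
    recovery = [regenerated_fraction(day, tau) for day in (1, 3, 7)]
    print(tau, [f"{r:.0%}" for r in recovery])
```

Under this assumption, after one relaxation time has passed roughly 63% of the book has refilled, so a shorter relaxation time means temporary price impact fades faster.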
Liquidity Observatory - Carlo described how the data was gathered from market participants on a monthly basis using a spreadsheet to categorize the bond/asset class type, and again using simple parameters from active "expert" traders. Take a look at this link and sign up if this is you. (This sounded to me a lot like another "market consensus" data gathering exercise which are proving increasingly popular, such as one of the first I had heard of many years back in Totem - we are not quite fully ready for "crowdsourcing" in financial markets maybe, but more people are seeing sense in sharing data.)
Panel Debate - Ron Papenek of MSCI was moderator of the panel, and asked Karen Cassidy of Morgan Stanley about her experiences in liquidity risk management.
Liquidity Risk Management at Banks - Karen started by saying that in liquidity management at Morgan Stanley they look at:
- Operating Capital
- Client Behaviour
Since 2008, Karen said that liquidity management had become a lot more rigorous and formalized, being rule based and using a categorisation of assets held from highly liquid to highly illiquid. She said that Morgan Stanley undertake stress testing by market and also by idiosyncratic risk over time frames of 1 month and 1 year. As part of this they are assessing the minimum operating liquidity needed based on working capital needs.
Karen added that Morgan Stanley are expending a lot of effort currently on data collection and modelling given that their data is specific to a retail broker-dealer unit, unlike many other firms. They are also looking at metrics around financial advisors, and how many clients follow the financial advisor when he or she decides to switch firms.
Business or Regulation Driving Liquidity Risk Management - Ron asked Karen what were the drivers of their processes at Morgan Stanley. Karen said that in 2008 the focus was on fundability of assets, saying that the FED was monitoring this on a daily basis. She made the side comment that this monitoring was not unusual since "Regulators live with us anyway". Karen said that it was the responsibility of firms to come up with the controls and best practice needed to manage liquidity risk, and that is what Morgan Stanley do anyway.
Karen added that in her view the industry was over-funding and funding too long in response to regulation, and that funding would be at lower but still pragmatic levels in the absence of regulatory pressure. Like many in the industry, Karen thought the regulation had swung too far in response to the 2008 crisis and would eventually swing back to more normal levels.
Carlo added that he had written an unintentionally prescient academic paper on liquidity management in 2008 just prior to the crisis hitting, and he thought the regulators certainly arrived "after" the crisis rather than anticipating it in any way. He thought that the banks have anticipated the regulators very well with measures such as LCR and NSFR already in place.
In contrast, Carlo said that the regulators were lost in dealing with liquidity risk management for asset managers and hedge funds, with regulation such as UCITS being very vague on this topic and regulators themselves seeking guidance from the industry. He recounted a meeting he had with BaFin in 2009 where he told them that certain of their regulations made no sense, and he said they acknowledged this and said the asset management industry needed to tell them what to implement (sounds like the German regulator is using the same card as the UK regulators in keeping regulations vague when they are uncertain, waiting for regulated firms to implement them to see what the regulation really becomes...).
What Have We Learnt Since 2008 - Karen said that back in 2008 liquidity was not managed to term, funding basis was not rigorous and relied heavily on unsecured debt. She said that since then Morgan Stanley had been actively involved in shaping the requirements of better liquidity risk management with more rigorous analysis of counterparties and funding capacity. Karen said that stronger governance was a foundation for the creation of better policy and process. She said that regulators were receptive to new ideas and had been working with them closely.
What will be the effect of CCPs on OTC markets? Carlo said that when executing a large order, you have the choice between executing 1) multiple small orders with multiple counterparties or 2) a single large block order with one counterparty. In this regard, the equity and bond markets are very different. In lit equity venues, the best approach is 1), but in the bond markets approach 2) is taken since the trade information is not transparent to the market.
Obviously equity markets have become more fragmented, and this has resulted in improved market quality since it is harder to get all market information and hence the market is less resonant to big events/orders. Carlo then asked whether the increased transparency proposed for OTC markets with CCPs etc. will improve them. His answer was that this was likely to improve the counterparty risk inherent in the market but due to increased transparency is likely to have a negative effect on transaction costs (I guess another example of the law of unintended consequences for the regulators).
Audience Questions - there then followed some audience questions:
LiquidityMetrics extrapolation - one audience member asked about transaction cost extrapolation in Carlo's modelling. Carlo said that MSCI do not extrapolate and the liquidity surface terminates where the market terminates its liquidity. There was some extrapolation used along the time dimension however, particularly in relation to the time-relaxation parameter.
LiquidityMetrics "Cross-Impact" - looking at applying LiquidityMetrics to a portfolio, one audience member wondered if an order for one asset distorted the liquidity surface for other potentially related assets. Carlo said this was a very interesting area with little research done so far. He said that this "cross-impact" had not been detected in equity markets but that they were looking at it in other markets such as fixed income where effectively two assets might be proxies for duration related trading. Carlo put forward a simple model where the two assets are analogous to two species of animal feeding from the same source of food.
Long and short position liquidity modelling - one audience member asked Carlo what the effects would be of being long or short and that in a crisis you would prefer to be short (maybe obviously?) given the sell off by those with long positions. Carlo clarified that being "short" was not merely taking the negative number on a liquidity surface for a particular asset but rather a "short" is a borrowing position with an obligation to deliver a security at some defined point, and as such is a different asset with its own liquidity surface.
Changing markets, changing participants - the final question of the evening was from one member of the audience who asked if the general move out of fixed income trading by the banks over recent years was visible in Carlo's data. Carlo said that MSCI only have around two years of data so far and as such this was not yet visible, but his team are looking for effects like this amongst others. He added that the August 2011 "weak banks - weak sovereigns" episode in Europe was visible, with signals present in the data.
Good food and good (really good I thought) wine put on by MSCI at the event reception. Great view of Manhattan from the 48th floor of World Trade Centre 7 too.
Posted by Brian Sentance | 31 March 2014 | 11:35 am
The second panel of the day was "Regulation and Risk as Data Management Drivers" - you can find the A-Team's write up here. Some of my thoughts/notes can be found below:
- Ian Webster of Axioma, responding to a question about whether consistency was the Holy Grail of data management, said that there isn't a consistent view possible for data used in risk and regulation - there are many regulations with many different requirements and so unnecessary data consistency is "the hobgoblin of little minds" in delaying progress and achieving goals in data management.
- James of Lombard Risk suggested that firms should seek competitive advantage from regulatory compliance rather than just compliance alone - seeking the carrot and not just avoiding the stick.
- Ian said he thought too many firms dealt with regulatory compliance in a tactical manner and asked if regulation and risk were truly related? He suggested that risk levels might remain unchanged even if regulation demanded a great deal more reporting.
- Marcelle von Wendland said she thought that regulation added cost only, and that firms must focus on risk management and margin.
- James said that "regulatory risk" was a category of risk all in itself alongside its mainstream contemporaries.
- Ian added that risk and finance think about risk differently and this didn't help in promoting consistency of ideas in discussions about risk management.
- James said that the legacy of systems in financial markets was a hindrance in complying with new regulation and mentioned the example of the relatively young energy industry where STP was much easier to implement.
- Laurent of Bloomberg said that young, emerging markets like energy were greenfield and as such easier to implement systems but that they did not have any experience or culture around data governance.
- Marcelle said that the G20 initiatives around trade reporting at least promoted some consistency and allowed issues to be identified at last.
- Ian said in response that he was unconvinced about politically driven regulation, questioning its effectiveness and motivations.
- Ian raised the issues of the assumptions behind VaR and said that the current stress tests were overdone.
- Marcelle agreed that a single number for VaR or some other measure meant that other useful information has potentially been ignored/thrown away.
- General consensus across the panel that fines were not enough and that restricting business activities might be a more effective stick for the regulators.
- James referenced the risk data aggregation paper from the Basel Committee and suggested that data should be captured once, cleaned once and used many times.
- Ian disagreed with James in that he thought clean once, capture once and use many times was not practically possible and this goal was one of the main causes of failure within the data management industry over the past 10 years.
- The panel ended with Ian saying that we should not just solve for the last crisis, but that the underlying causes of crises were similar and mostly around asset price bubbles, so in order to reduce risk in the system 1) let's make data more transparent and 2) do what we can to avoid bubbles with better indices and risk measures.
Posted by Brian Sentance | 24 March 2014 | 6:07 pm
Rupert Brown of UBS did the keynote at this Spring's A-Team Data Management Summit (DMS). Rupert's talk was about understanding what data there is within a financial institution and understanding where it comes from and where it goes to. Rupert started by asking the question "Where are we?" illustrating it with a map of systems and data flows for an institution - to my recollection I think he said it stretched to 7 metres in length and did not look that accessible or easy to understand. He asked what dimensions it should have as a "map" of data, wondering what dimensions are analogous to latitude, longitude, altitude and orientation? Maybe things like function, product, process, accounting or legal entity as potential candidates.
Briefly Rupert took a bit of a detour into his love of trains with a little history on the London Underground Map. He started by mentioning the role of George Dow who illustrated maps for train routes in a single line, showing just dependency and lineage (what stations are next etc) and ignoring geography and distance. This was built upon by another gentleman, Harry Beck, who took these ideas a stage further with the early ancestors of the current Underground map, showing not just the routes but interweaving all the lines together into a map that additionally was topologically sufficient (indicating broad direction - NESW).
Continuing on with this analogy of Underground to maps of data and data management, Rupert then mentioned Frank Pick who created the Underground brand. Through creating such an identifiable brand, effectively Frank got people to believe and refer to the map, and Rupert suggested that people in data governance need, and could benefit from, taking a similar approach to data governance and data management. I guess it is easy to take maps we see every day for granted and particularly some of the thought that went into them, maybe ideas that initially were not intuitive (or at least not directly representative of physical reality) but that greatly improved understanding and comprehension. Put another way, representing reality one for one does not necessarily get you to something that is easy to understand (sounds like a "model" to me).
Rupert then described some of his efforts using Open Street Map to map data, making use of the concepts of nodes, ways and areas. Apparently he had implemented this using a NoSQL database (MarkLogic) for performance reasons (doesn't sound like a really "big data" sized problem with several hundred apps and several thousand data transports but nevertheless he said it was needed, maybe as a result of its graph-like nature?). He said that the data was crowdsourced to refine it, with a wiki for annotations. He said he was interested in the bitemporality of data, i.e. how the map changes over time. He advised that every application should also be thought of as its own "databus" in addition to any de facto databuses that might be present in the architecture.
In summary the talk was interesting, but it was clear from what Rupert showed that we have a long way to go in representing clearly and easily where data came from, where it goes to and how it is used. I think Rupert acknowledges this and has some academic partnerships trying to develop better ways of representing and visualizing data. Certainly data lineage and audit trail on everything is a hot topic for many of our clients currently, and something that deserves more attention. You can download Rupert's presentation here and the A-Team's take on his talk can be found here.
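To make the "nodes and ways" idea a little more concrete, here is a minimal Python sketch of a data map where applications are nodes and data transports are ways, each with a validity period so the map can be queried as of a date. This is purely my own illustration and not Rupert's implementation: the names Way and downstream, the application names and the dates are all hypothetical, and it tracks only a single time axis rather than full bitemporality.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Way:
    """A data transport between two applications (nodes). Hypothetical model."""
    source: str
    target: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the transport is still in use

    def active_on(self, as_of: date) -> bool:
        # Active from valid_from (inclusive) up to valid_to (exclusive)
        return self.valid_from <= as_of and (
            self.valid_to is None or as_of < self.valid_to
        )


def downstream(ways, start, as_of):
    """Return every application that data from `start` can reach on `as_of`."""
    graph = defaultdict(set)
    for way in ways:
        if way.active_on(as_of):
            graph[way.source].add(way.target)
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first walk along the active transports
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


# Made-up example: three transports, one of them retired at the start of 2013
ways = [
    Way("trading", "risk", date(2010, 1, 1)),
    Way("risk", "reporting", date(2012, 1, 1)),
    Way("trading", "legacy_ledger", date(2005, 1, 1), date(2013, 1, 1)),
]
map_2011 = downstream(ways, "trading", date(2011, 6, 1))
map_2014 = downstream(ways, "trading", date(2014, 3, 1))
```

Asking the same lineage question as of two different dates gives two different answers, which is the "how the map changes over time" point in miniature.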
Posted by Brian Sentance | 18 March 2014 | 11:12 am
Christian Nilsson of S&P CIQ followed up Richard Burtsal's talk with a presentation on data management for risk, containing many interesting questions for those considering data for risk management needs. Christian started his talk by taking a time machine back to 2006, and asking what were the issues then in Enterprise Data Management:
- There is no current crisis - we have other priorities (we now know what happened there)
- The business case is still too fuzzy (regulation took care of this issue)
- Dealing with the politics of implementation (silos are still around, but cost and regulation are weakening politics as a defence?)
- Understanding data dependencies (understanding this throughout the value chain, but still not clear today?)
- The risk of doing it wrong (there is a risk you will do data management wrong given all the external parties and sources involved, but what is the risk of not doing it?)
Christian then moved on to say the current regulatory focus is on clearer roadmaps for financial institutions, citing Basel II/III, Dodd Frank/Volcker Rule in the US, challenges in valuation from IASB and IFRS, fund management challenges with UCITS, AIFMD, EMIR, MiFID and MiFIR, and Solvency II in the Insurance industry. He coined the phrase that "Regulation Goes Hollywood" with multiple versions of regulation like UCITS I, II, III, IV, V, VII for example having more versions than a set of Rocky movies.
He then touched upon some of the main motivations behind the BCBS 239 document and said that regulation had three main themes at the moment:
- Higher Capital and Liquidity Ratios
- Restrictions on Trading Activities
- Structural Changes ("ring fence" retail, global operations move to being capitalized local subsidiaries)
Some further observations were on what will be the implications of the effective "loss" of globalization within financial markets, and also what now can be considered as risk free assets (do such things now exist?). Christian then gave some stats on risk as a driver of data and technology spend with over $20-50B being spent over the next 2-3 years (seems a wide range, nothing like a consensus from analysts I guess!).
The talk then moved on to what role data and data management plays within regulatory compliance, with for example:
- LEI - Legal Entity Identifiers play out throughout most regulation, as a means to enable automated processing and as a way to understand and aggregate exposures.
- Dodd-Frank - Data management plays a role within OTC processing and STP in general.
- Solvency II - This regulation for insurers places emphasis on data quality/data lineage and within capital reserve requirements.
- Basel III - Risk aggregation and counterparty credit risk are two areas of key focus.
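As a small illustration of the LEI point above, the sketch below rolls exposures held against individual legal entities up to their ultimate parent. It is a minimal Python example under my own assumptions: the identifiers are made-up placeholders (not real 20-character LEIs), and the parent_of mapping stands in for whatever entity hierarchy data a firm actually has.

```python
from collections import defaultdict


def ultimate_parent(lei, parent_of):
    """Follow parent links until an entity with no recorded parent is reached."""
    seen = set()
    while lei in parent_of and lei not in seen:  # `seen` guards against cycles
        seen.add(lei)
        lei = parent_of[lei]
    return lei


def aggregate_exposure(exposures, parent_of):
    """Sum (lei, amount) pairs by the ultimate parent of each entity."""
    totals = defaultdict(float)
    for lei, amount in exposures:
        totals[ultimate_parent(lei, parent_of)] += amount
    return dict(totals)


# Hypothetical hierarchy: SUB_A2 -> SUB_A1 -> GROUP_A, and SUB_B1 -> GROUP_B
parent_of = {"SUB_A1": "GROUP_A", "SUB_A2": "SUB_A1", "SUB_B1": "GROUP_B"}
exposures = [
    ("SUB_A1", 10.0),
    ("SUB_A2", 5.0),
    ("GROUP_A", 2.5),
    ("SUB_B1", 7.0),
]
totals = aggregate_exposure(exposures, parent_of)
```

Without the hierarchy mapping the four exposures stay as four unrelated lines, so the identifier on its own does not give you an aggregated counterparty view.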
Christian outlined the small budget of the regulators relative to the biggest banks (a topic discussed in previous posts, how society wants stronger, more effective regulation but then isn't prepared to pay for it directly - although I would add we all pay for it indirectly but that is another story, in part illustrated in the document this post talks about).
In addition to the well-known term "regulatory arbitrage" dealing with different regulations in different jurisdictions, Christian also mentioned the increasingly used term "substituted compliance" where a global company tries to optimise which jurisdictions it and its subsidiaries comply within, with the aim of avoiding compliance in more difficult regimes through compliance within others.
I think Christian outlined the "data management dichotomy" within financial markets very well:
- Regulation requires data that is complete, accurate and appropriate
- Industry standards of data management and data are poorly regulated, and there is weak industry leadership in this area.
(not sure if it was quite at this point, but certainly some of the audience questions were about whether the data vendors themselves should be regulated which was entertaining).
He also outlined the opportunity from regulation in that it could be used as a catalyst for efficiency, STP and cost base reduction.
Obviously "Big Data" (I keep telling myself to drop the quotes, but old habits die hard) is hard to avoid, and Christian mentioned that IBM say that 90% of the world's data has been created in the last 2 years. He described the opportunities of the "3 V's" of Volume, Variety, Velocity and "Dark Data" (exploiting underused data with new technology - "Dark" and "Deep" are getting more and more use of late). No mention directly in his presentation but throughout there was the implied extension of the "3 V's" to "5 V's" with Veracity (aka quality) and Value (aka we could do this, but is it worth it?). Related to the "Value" point Christian brought out the debate about what data do you capture, analyse, store but also what do you deliberately discard, which is a point worth more consideration than it gets (e.g. one major data vendor I know did not store its real-time tick data and now buys its tick data history from an institution who thought it would be a good idea to store the data long before the data vendor thought of it).
I will close this post taking a couple of summary lists directly from his presentation, the first being the top areas of focus for risk managers:
- Counterparty Risk
- Integrating risk into the Pre-trade process
- Risk Aggregation across the firm
- Risk Transparency
- Cross Asset Risk Reporting
- Cost Management/displacement
The second list outlines the main challenges:
- Getting complete view of risk from multiple systems
- Lack of front to back integration of systems
- Data Mapping
- Data availability of history
- Lack of Instrument coverage
- Inability to source from single vendor
- Growing volumes of data
Christian's presentation then put forward a lot of practical ideas about how best to meet these challenges (I particularly liked the risk data warehouse parts, but I am unsurprisingly biassed). In summary if you get the chance then see or take a read of Christian's presentation, I thought it was a very thoughtful document with some interesting ideas and advice put forward.
Posted by Brian Sentance | 12 March 2014 | 10:34 am
Attended a good event at S&P Capital IQ's offices on Tuesday morning last week in London, built around the BCBS 239 document on risk aggregation and reporting (see earlier PRMIA event on this topic too). A partner vendor of S&P CIQ, Tech Mahindra, started the morning with Richard Burtsal's presentation on "Delivering an Enterprise Data Strategy". Tech Mahindra recently acquired a data management platform from UBS Asset Management and are offering a managed service data management offering based on this (see A-Team article).
Richard said that he wasn't going to "sell" in his presentation (always a worrying admission from one of us data management vendors, it usually means entirely the opposite). That small criticism aside, Richard gave a solid update on the state of the industry and obviously on what Tech Mahindra are offering, and added that:
- For every $1 spent directly on market data, the total cost of that data goes up by a factor of 6 by the time the data is actually used
- 33% of rejected trades are caused by incorrect reference data
- 60% of staff manipulate, report on or support data on a daily basis (I wonder what the other 40% actually do then? Be good to get the Tower Group report this came from to find out maybe?)
- 25% of reference data management is wasted due to duplication and inefficiencies
- In their work with UBS Asset Management they had jointly shown that the costs of data management were reduced by 25-30% using a managed service (sounds worth verifying what the "before" situation was I guess, but interesting/impressive).
- Clients were pushing for much faster instrument setup and a reduction in time from the 1-2 weeks setup in some systems.
There were a few questions from the audience during Richard's talk, the first of which asked about the differences between doing data management for the buy-side and for the sell-side. Richard said that his experience was that the buy-side managed fewer instruments (<500,000) but with greater depth of data, and the sell-side held more instruments (10M+) but with less depth of data (not sure that completely reflects my experience, but sounds worth a survey maybe).
The second question was why is the utility model for data management going to succeed right now, when previous attempts over the past 10 years had failed? Richard responded that he thought Tech Mahindra would succeed due to:
- Tech Mahindra are data-vendor agnostic (I assume aimed at Markit-Cadis and Bloomberg-PolarLake)
- Tech Mahindra own all their own IP (hmm, not really so sure this is a good reason or even a differentiator, but I guess aimed at managed services that are not run by the firm that develops the data management system?)
I think the answers to this second question need thinking through more clearly. To be fair Richard had stated the 25% cost reduction already as one benefit, and various folks have said that the technology is ripe for these kinds of offerings now, but all the same the response needs to be more fully developed to convince many I think (I remain undecided personally, it would be good to have some more evidence to back this up). One of the S&P CIQ team added that what he thinks clients want is "Utility of Delivery" and not "Utility of Content" which I thought was a sensible comment and one that I will be revisiting in the coming months.
On a related note to why managed services just now, another audience member asked how client specific data was managed within a utility or managed service model, and Richard said that client specific data was often managed at the client but that they can upload and integrate client generated data into the managed service offering. I think this is a very key issue within the debate about managed services and utilities. I mean I get the point the data utility proponents make that certain datasets are simple "facts" and as such are either right or wrong and hence commoditisable, but much of the data is subjective and all of the data needs validating together in the context of its intended use in my view. I guess I kind of lose myself in looping arguments about why data utility vendors aren't ultimately wanting to be the next Thomson Reuters or Bloomberg (not that that is not a laudable aim but it is not going to change the world or indeed financial markets data provision very much).
Posted by Brian Sentance | 10 March 2014 | 10:41 am
Xenomorph is sponsoring the networking reception at the A-Team DMS event in London this week, and if you are attending then I wanted to extend a cordial invite to you to attend the drinks and networking reception at the end of day at 5:30pm on Thursday.
In preparation for Thursday’s Agenda then the blog links below are a quick reminder of some of the main highlights from last September’s DMS:
- Data Architecture: Sticks or Carrots?
- What Will Drive Data Management?
- Big Data, Cloud, In-Memory
- The Chief Data Officer Challenge
- Managed Services and the Utility Model
I will also be speaking on the 2pm panel “Reporting for the C-Suite: Data Management for Enterprise & Risk Analytics”. So if you like what you have heard during the day, come along to the drinks and firm up your understanding with further discussion with like-minded individuals. Alternatively, if you find your brain is so full by then of enterprise data architecture, managed services, analytics, risk and regulation that you can hardly speak, come along and allow your cerebellum to relax and make sense of it all with your favourite beverage in hand. Either way you will leave the event more informed than when you went in...well that’s my excuse and I am sticking with it!
Hope to see you there!
Posted by Brian Sentance | 3 March 2014 | 6:33 pm
Very pleased that our partnering with Aqumin and their AlphaVision visual landscapes has been announced this week (see press release from Monday). Further background and visuals can be found at the following link and for those of you that like instant gratification please find a sample visual below showing some analysis of the S&P500.
Posted by Brian Sentance | 11 December 2013 | 11:41 am
Quick plug for the New York version of F# in Finance event taking place next Wednesday December 11th, following on from the recent event in London. Don Syme of Microsoft Research will be demonstrating access to market data using F# and TimeScape. Hope to see you there!
Posted by Brian Sentance | 6 December 2013 | 7:49 am
Quick thank you to Don Syme of Microsoft Research for including a demonstration of F# connecting to TimeScape running on the Windows Azure cloud in the F# in Finance event this week in London. F# is a functional language that is developing a large following in finance due to its applicability to mathematical problems, the ease of development with F# and its performance. You can find some testimonials on the language here.
Don has implemented a proof-of-concept F# type provider for TimeScape. If that doesn't mean much to you, then a practical example below will help, showing how the financial instrument data in TimeScape is exposed at runtime into the F# programming environment. I guess the key point is just how easy it looks to code with data, since effectively you get guided through what is (and is not!) available as you are coding (sorry if I sound impressed, I spent a reasonable amount of time writing mathematical C code using vi in the mid 90's - so any young uber-geeks reading this, please make allowances as I am getting old(er)...). Example steps are shown below:
Referencing the Xenomorph TimeScape type provider and creating a data context:
Connecting to a TimeScape database:
Looking at categories (classes) of financial instrument available:
Choosing an item (instrument) in a category by name:
Looking at the properties associated with an item:
The intellisense-like behaviour above is similar to what TimeScape's Query Explorer offers and it is great to see this implemented in an external run-time programming language such as F#. Don additionally made the point that each instrument only displays the data it individually has available, making it easy to understand what data you have to work with. This functionality is based on F#'s ability to make each item uniquely nameable, and optionally to assign each item (instrument) a unique type, where all the category properties (defined at the category schema level) that are not available for the item are hidden.
The next event for F# in Finance will take place in New York on Wednesday 11th of December 2013, so hope to see you there. We are currently working on a beta program for this functionality to be available early in the New Year so please get in touch if this is of interest via firstname.lastname@example.org.
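For anyone who would like to see the "only show what the item actually has" idea in code, the sketch below is a rough run-time analogue written in Python rather than F#. To be clear, it is not Don's type provider and not the TimeScape API: the class ItemView, the schema, the instrument name and the values are all hypothetical, and a real F# type provider does this work while you are coding rather than at run time.

```python
class ItemView:
    """Exposes only the properties that an item actually has data for."""

    def __init__(self, name, data):
        self._name = name
        self._data = dict(data)

    def __getattr__(self, prop):
        # Only reached when normal attribute lookup fails
        try:
            return self.__dict__["_data"][prop]
        except KeyError:
            raise AttributeError(
                f"{prop!r} is not available for this item"
            ) from None

    def __dir__(self):
        # Drives tab-completion: list available properties only
        return sorted(self._data)


# Hypothetical category schema: every property an equity *could* have
equity_schema = ["Close", "Coupon", "Volume"]

# Hypothetical item: this instrument only has Close and Volume data
raw_values = {"Close": 101.5, "Volume": 250000}
item = ItemView(
    "Example Equity",
    {p: raw_values[p] for p in equity_schema if p in raw_values},
)

available = dir(item)                     # properties offered for completion
has_coupon = hasattr(item, "Coupon")      # hidden, since there is no data
```

In an interactive Python session, tab-completion on item would offer only Close and Volume, which is roughly the guided experience described above.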
Posted by Brian Sentance | 27 November 2013 | 6:00 am
An exciting departure from Xenomorph's typical focus on data management for risk in capital markets, but one of our partners, i2i Logic, has just announced the launch of their customer engagement platform for institutional and commercial banks based on Xenomorph's TimeScape. The i2i Logic team have a background in commercial banking, and have put together a platform that allows much greater interaction with a corporate client that a bank is trying to engage with.
Hosted in the cloud, and delivered to sales staff through an easy and powerful tablet app, the system enables bank sales staff to produce analysis and reports that are very specific to a particular client, based upon predictive analytics and models applied to market, fundamentals and operational data, initially supplied by S&P Capital IQ. This allows the bank and the corporate to discuss and understand where the corporate is when benchmarked against peers in a variety of metrics current across financial and operational performance, and to provide insight on where the bank's services may be able to assist in the profitability, efficiency and future growth of the corporate client.
Put another way, it sounds like the corporate customers of commercial banks are in not much better a position than us individuals dealing with retail banks, in that currently the offerings from the banks are not that engaging, generic and very hard to differentiate. Sounds like the i2i Logic team are on to something, so I wish them well in trying to move the industry's expectations of customer service and engagement, and would like to thank them for choosing TimeScape as the analytics and data management platform behind their solution.
Posted by Brian Sentance | 19 November 2013 | 2:17 pm
Another good event from PRMIA at the Harmonie Club here in NYC last week, entitled Risk Data Aggregation and Risk Reporting - Progress and Challenges for Risk Management. Abraham Thomas of Citi and PRMIA introduced the evening, setting the scene by referring to the BCBS document Principles for effective risk data aggregation and risk reporting, with its 14 principles to be implemented by January 2016 for G-SIBs (Globally Systemically Important Banks) and December 2016 for D-SIBs (Domestically Systemically Important Banks).
The event was sponsored by SAP and they were represented by Dr Michael Adam on the panel, who gave a presentation around risk data management and the problems of having data siloed across many different systems. Maybe unsurprisingly Michael's presentation had a distinct "in-memory" focus to it, with Michael emphasizing the data analysis speed that is now possible using technologies such as SAP's in-memory database offering "Hana".
Following the presentation, the panel discussion started with a debate involving Dilip Krishna of Deloitte and Stephanie Losi of the Federal Reserve Bank of New York. They discussed whether the BCBS document and compliance with it should become a project in itself or part of existing initiatives to comply with data intensive regulations such as CCAR and CVA etc. Stephanie is on the board of the BCBS committee for risk data aggregation and she said that the document should be a guide and not a check list. There seemed to be general agreement on the panel that data architectures should be put together not with a view to compliance with one specific regulation but more as a framework to deal with all regulation to come, a more generalized approach.
Dilip said that whilst technology and data integration are issues, people are the biggest issue in getting a solid data architecture in place. There was an audience question about how different departments need different views of risk and how were these to be reconciled/facilitated. Stephanie said that data security and control of who can see what is an issue, and Dilip agreed and added that enterprise risk views need to be seen by many which was a security issue to be resolved.
Don Wesnofske of PRMIA and Dell said that data quality was another key issue in risk. Dilip agreed and added that the front office need to be involved in this (data management projects are not just for the back office in isolation) and that data quality was one of a number of needs that compete for resources/budget at many banks at the moment. Coming back to his people theme, Dilip also said that data quality needed intuition to be carried out successfully.
An audience question from Dan Rodriguez (of PRMIA and Credit Suisse) asked whether regulation was granting an advantage to "Too Big To Fail" organisations in that only they have the resources to be able to cope with the ever-increasing demands of the regulators, to the detriment of the smaller financial institutions. The panel did not completely agree with Dan's premise, arguing that smaller organizations were more agile and did not have the legacy and complexity of the larger institutions, so there was probably a sweet spot between large and small from a regulatory compliance perspective (I guess it was interesting that the panel did not deny that regulation was at least affecting the size of financial institutions in some way...)
Again focussing on where resources should be deployed, the panel debated trade-offs such as those between accuracy and consistency. The Legal Entity Identifier (LEI) initiative was thought of as a great start in establishing standards for data aggregation, and the panel encouraged regulators to look at doing more. One audience question was around the different and inconsistent treatment of gross notional and trade accounts. Dilip said that yes this was an issue, but came back to Stephanie's point that what is needed is a single risk data platform that is flexible enough to be used across multiple business and compliance projects. Don said that he suggests four "views" on risk:
- Risk Taking
- Risk Management
- Risk Measurement
- Risk Regulation
Stephanie added that organisations should focus on the measures that are most appropriate to your business activity.
The next audience question asked whether the panel thought that the projects driven by regulation had a negative return. Dilip said that his experience was yes, they do have negative returns but this was simply a cost of being in business. Unsurprisingly maybe, Stephanie took a different view advocating the benefits side coming out of some of the regulatory projects that drove improvements in data management.
The final audience question was whether the panel thought that it was possible to reconcile all of the regulatory initiatives like Dodd-Frank, Basel III, EMIR etc with operational risk. Don took a data angle to this question, talking about the benefits of big data technologies applied across all relevant data sets, and that any data was now potentially valuable and could be retained. Dilip thought that the costs of data retention were continually going down as data volumes go up, but that there were costs in capturing the data needed for operational risk and other applications. Dilip said that when compared globally across many industries, financial markets were way behind the data capabilities of many sectors, and that finance was more "Tiny Data" than "Big Data" and again he came back to the fact that people were getting in the way of better data management. Michael said that many banks and market data vendors are dealing with data in the 10's of TeraBytes range, whereas the amount of data in the world was around 8-900 PetaBytes (I thought we were already just over into ZettaBytes but what are a few hundred PetaBytes between friends...).
Abraham closed off the evening, firstly by asking the audience if they thought the 2016 deadline would be achieved by their organisation. Only 3 people out of around 50+ said yes. Not sure if this was simply people's reticence to put their hand up, but when Abraham asked one key concern for many was that the target would change by then - my guess is that we are probably back into the territory of the banks not implementing a regulation because it is too vague, and the regulators not being too prescriptive because they want feedback too. So a big game of chicken results, with the banks weighing up the costs/fines of non-compliance against the costs of implementing something big that they can't be sure will be acceptable to the regulators. Abraham then asked the panel for closing remarks: Don said that data architecture was key; Stephanie suggested getting the strategic aims in place but implementing iteratively towards these aims; Dilip said that deciding your goal first was vital; and Michael advised building a roadmap for data in risk.
Posted by Brian Sentance | 4 November 2013 | 11:47 am
Very pleased to announce our new data integration for TimeScape with S&P Capital IQ - see the press release
Posted by Brian Sentance | 21 October 2013 | 2:25 pm
Andrew Delaney introduced the final panel of the day, involving Steve Cheng of Rimes, Jonathan Clark of Tech Mahindra, Tom Dalglish of UBS and Martijn Groot of Euroclear. Main points:
- Andrew started by asking the panel for their definitions of managed data services and data utilities
- Martijn said that a managed data service was usually the lifting out of a data process from within a company to be run by somebody else whereas a data utility had many users.
- Tom put it another way saying that a managed service was run for you whereas a utility was run for them. Tom suggested that there were some concerns around data utilities for the industry in terms of knowing/being transparent about data vendor affinity and any data monopoly aspects.
- When asked why past attempts at data utilities had failed, Tom said that it must be frustrating to be right but at the wrong time, but that in addition to the timing being right just now (costs/regulations being drivers), the tech stack available is better and the appreciation of the importance of data usage is clearer.
- Steve added a great point on the tech stack, in that it now made mass customisation much easier.
- Jonathan made the point that past attempts at data utilities were built on product platforms used at clients, whereas the latest utilities were built on platforms specifically designed for use by a data utility.
- Looking at the cost savings of using a data utility, Martijn said that the industry spends around $16-20B on data, and that with his Euroclear data utility they can serve 2000 clients with a staff level that is less than any one client employs directly.
- Tom said that the savings from collapsing the data silos were primarily from more efficient/reduced usage of people and hardware to perform a specific function, and not data.
- Steve suggested that some utilities take an incremental approach to data services rather than taking all data as in the old utility model, again coming back to his earlier point of mass customisation.
- Tom mentioned it was a bit like cable TV, where you can subscribe to a set of services of your choice but where certain services cost more than others.
- Martijn said that there were too many vested interests to turn data costs around quickly. He said that data utilities could go a long way however.
- Tom concluded by saying that it was about content not feeds, licensing was important as was how to segregate data.
Good panel - additionally one final audience question/discussion was around data utilities providing LEI data, and it was argued that LEI without the hierarchy is just another set of data to map and manage.
Posted by Brian Sentance | 7 October 2013 | 12:28 pm
The first panel of the afternoon touched on a hot topic at the moment, the role of the Chief Data Officer (CDO). Andrew Delaney again moderated the panel, consisting of Rupert Brown of UBS, Patrick Dewald of Diaku, Colin Hall of Credit Suisse, Nigel Matthews of Barclays and Neill Vanlint of GoldenSource. Main points:
- Colin said that the need for the CDO role is that someone needs to sit at the top table who is both nerdy about data but also can communicate a vision for data to the CEO.
- Rupert said that the role of CDO was still a bit nebulous, covering data conformance, storage management, security and data opportunity (new functionality and profit). He suggested this role used to be called "Data Stewardship" and that the CDO tag is really a rename.
- Colin answered that the role did use to be a junior one, but regulation and the rate of industry change demand a CDO, a point of contact for everyone when anything comes up that concerns data - previously nobody knew quite who to speak to on this topic.
- Patrick suggested that a CDO needs a long-term vision for data, since the role is not just an operational one.
- Nigel pointed out that the CDO needs to cover all kinds of data and mentioned recent initiatives like BCBS with their risk data aggregation paper.
- Neill said that he had seen the use of a CDO per business line at some of his clients.
- There was some conversation around the different types of CDO and the various carrots and sticks that can be employed. Neill made the audience laugh with his quote from a client that "If the stick doesn't work, I have a five-foot carrot to hit them with!"
- Patrick said that the CDO role is about business not just data.
- Colin picked up on what Patrick said and illustrated this with an example of legal contract data feeding directly into capital calculations.
- Nigel said that the CDO is a facilitator with all departments. He added that the monitoring tools from market data were needed in reference data.
Overall good debate, and I guess if you were starting from scratch (if only we could!) you would have to think that the CDO is a key role given the finance industry is primarily built on the flow of data from one organisation to another.
Posted by Brian Sentance | 7 October 2013 | 12:26 pm
Andrew Delaney introduced the second panel of the day, with the long title of "The Industry Response: High Performance Technologies for Data Management - Big Data, Cloud, In-Memory, Meta Data & Big Meta Data". The panel included Rupert Brown of UBS, John Glendenning of Datastax, Stuart Grant of SAP and Pavlo Paska of Falconsoft. Andrew started the panel by asking what technology challenges the industry faced:
- Stuart said that risk data on-demand was a key challenge, and that there was the related need to collapse the legacy silos of data.
- Pavlo backed up Stuart by suggesting that accuracy and consistency were needed for all live data.
- Rupert suggested that there has been a big focus on low latency and fast data, but raised a smile from the audience when he said that he was a bit frustrated by the "format fetishes" in the industry. He then brought the conversation back to some fundamentals from his viewpoint, talking about wholeness of data and namespaces/data dictionaries - Rupert said that naming data had been too stuck in the functional area and not considered more in isolation from the technology.
- John said that he thought there were too many technologies around at the moment, particularly in the area of Not Only SQL (NoSQL) databases. John seemed keen to push NoSQL, and in particular Apache Cassandra, as post relational databases. He put forward that these technologies, developed originally by the likes of Google and Yahoo, were the way forward and that in-memory databases from traditional database vendors were "papering over the cracks" of relational database weaknesses.
- Stuart countered John by saying that properly designed in-memory databases had their place but that some in-memory databases had indeed been designed to paper over the cracks and this was the wrong approach, sometimes exacerbating the problem.
- Responding to Andrew's questions around whether cloud usage was more accepted by the industry than it had been, Rupert said he thought it was although concerns remain over privacy and regulatory blockers to cloud usage, plus there was a real need for effective cloud data management. Rupert also asked the audience if we knew of any good release management tools for databases (controlling/managing schema versioning etc) because he and his group were yet to find one.
- Rupert expressed that Hadoop 2 was of more interest to him at UBS than Hadoop, and as a side note mentioned that map reduce was becoming more prevalent across NoSQL, not just within the Hadoop domain. Maybe controversially, he said that UBS was using less data than it used to and as such it was not the "big data" organisation people might think it to be.
- As one example of the difficulties of dealing with silos, Stuart said that at one client it required the integration of data from 18 different systems to get an overall view of the risk exposure to one counterparty. Stuart advocated bringing the analytics closer to the data, enabling more than one job to be done on one system.
- Rupert thought that Goldman Sachs and Morgan Stanley seem to do what is the right thing for their firm, laying out a long-term vision for data management. He said that a rethink was needed at many organisations since fundamentally a bank is a data flow.
- Stuart picked up on this and said that there will be those organisations that view data as an asset and those that view data as an annoyance.
- Rupert mentioned that in his view accountants and lawyers are getting in the way of better data usage in the industry.
- Rupert added that data in Excel needed to be passed by reference and not passed by value. This "copy confluence" was wasting disk space and was a source of operational problems for many organisations (a few past posts here and here on this topic).
- Moving on to describe some of the benefits of semantic data and triple stores, Rupert proposed that the statistical world needed to be added to the semantic world to produce "Analytical Semantics" (see past post relating to the idea of "analytics management").
Great panel, lots of great insight with particularly good contributions from Rupert Brown.
Posted by Brian Sentance | 7 October 2013 | 12:23 pm
The first panel of the day opened with an introductory talk by Chris Johnson of HSBC. Chris started his talk by proudly announcing that he drives a Skoda car, something that to him would have been unthinkable 25 years ago but with investment, process and standards things can and will change. He suggested that data management needs to go through a similar transformation, but that there remained a lot to be done.
Moving on to the current hot topics of data utilities and managed services, he said that reduced costs of managed services only became apparent in the long term and that both types of initiative have historically faced issues with:
- Logistical Challenges and Risks
Chris made the very good point that until service providers accept liability for data quality then this means that clients must always check the data they use. He also mentioned that in relation to Solvency II (a hot topic for Chris at HSBC Security Services), that EIOPA had recently mentioned that managed services may need to be regulated. Chris mentioned the lack of time available to respond to all the various regulatory deadlines faced (a recurring theme) and that the industry still lacked some basic fundamentals such as a standard instrument identifier.
Chris then joined the panel discussion with Andrew Delaney as moderator and with other panelists including Colin Gibson (see previous post), Matt Cox of Denver Perry, Sally Hinds of Data Management Consultancy Services and Robert Hofstetter of Bank J. Safra Sarasin. The key points I took from the panel are outlined below:
- Sally said that many firms were around Level 3 in the Data Management Maturity Model, and that many were struggling particularly with data integration. Sally added that utilities were new, as was the CDO role, and that implications for data management were only just playing out.
- Matt thought that reducing cost was an obvious priority in the industry at the moment, with offshoring playing its part but progress was slow. He believed that data management remains underdeveloped with much more to be done.
- Colin said that organisations remain daunted by their data management challenges and said that there were new challenges for data management with transactional data and derived data.
- Sally emphasised the role of the US FATCA regulation and how it touches upon so many processes and departments including KYC, AML, Legal, Tax etc.
- Matt highlighted derivatives regulation with the current activity in central clearing, Dodd-Frank, Basel III and EMIR.
- Chris picked up on this and added Solvency II into the mix (I think you can sense regulation was a key theme...). He expressed the need and desirability of a Unique Product Identifier (UPI see report) as essential for the financial markets industry and how we should not just stand still now the LEI was coming. He said that industry associations really needed to pick up their game to get more standards in place but added that the IMA had been quite proactive in this regard. He expressed his frustration at current data licensing arrangements with data vendors, with the insistence on a single point of use being the main issue (big problem if you are in security services serving your clients I guess).
- Robert added that his main issues were data costs and data quality.
- Andrew then brought the topic around to risk management and its impact on data management.
- Colin suggested that more effort was needed to understand the data needs of end users within risk management. He also mentioned that products are not all standard and data complexity presents problems that need addressing in data management.
- Chris mentioned that there are 30 data fields used in Solvency II calculations and that if any are wrong this would have a direct impact on the calculated capital charge (i.e. data is important!).
- Colin got onto the topic of unstructured data and said how it needed to be tagged in some way to become useful. He suggested that there was an embryonic cross-over taking place between structured and unstructured data usage.
- Sally thought that the merging of Business Intelligence into Data Management was a key development, and that if you have clean data then use it as much as you can.
- Robert thought that increased complexity in risk management and elsewhere should drive the need for increased automation.
- Colin thought cost pressures mean that the industry simply cannot afford the old IT infrastructure and that architecture needs to be completely rethought.
- Chris said that we all need to get the basics right, with LEI but then on to UPI. He said to his knowledge data management will always be a cost centre and standardisation was a key element of reducing costs across the industry.
- Sally thought that governance and ownership of data were woolly at many organisations and needed more work. She added this needed senior sponsorship and that data management was an ongoing process, not a one-off project.
- Matt said that the "stick" was very much needed in addition to the carrot, advising that the proponents of improved data management should very much lay out the negative consequences to bring home the reality to business users who might not see the immediate benefits and costs.
Overall good panel, lots of good debate and exchanging of ideas.
Posted by Brian Sentance | 7 October 2013 | 12:17 pm
Great day on Thursday at the A-Team Data Management Summit in London (personally not least because Xenomorph won the Best Risk Data Management/Analytics Platform Award but more of that later!). The event kicked off with a brief intro from Andrew Delaney of the A-Team talking through some of the drivers behind the current activity in data management, with Andrew saying that risk and regulation were to the fore. Andrew then introduced Colin Gibson, Head of Data Architecture, Markets Division at Royal Bank of Scotland.
Data Architecture - Sticks or Carrots? Colin began by looking at the definition of "data architecture" showing how the definition on Wikipedia (now obviously the definitive source of all knowledge...) was not particularly clear in his view. He suggested himself that data architecture is composed of two related frameworks:
- Orderly Arrangement of Parts
He said that the orderly arrangement of parts is focussed on business needs and aims, covering how data is sourced, stored, referenced, accessed, moved and managed. On the discipline side, he said that this covered topics such as rules, governance, guides, best practice, modelling and tools.
Colin then put some numbers around the benefits of data management, saying that every dollar spent on centralising data saves 20 dollars, and mentioning a resulting 80% reduction in operational costs. Related to this he said that every dollar spent on not replicating data saved a dollar on reconciliation tools and a further dollar on the use of reconciliation tools (not sure how the two overlap but these are obviously some of the "carrots" from the title of the talk).
Despite these incentives, Colin added that getting people to actually use centralised reference data remains a big problem in most organisations. He said he thought that people find it too difficult to understand and consume what is there, and faced with a choice they do their own thing as an easier alternative. Colin then talked about a program within RBS called "GoldRush" whereby there is a standard data management library available to all new projects in RBS which contains:
- messaging standards
- standard schema
- update mechanisms
The benefit being that if the project conforms with the above standards then they have little work to do for managing reference data since all the work is done once and centrally. Colin mentioned that there also needs to be feedback from the projects back to the central data management team around what is missing/needing to be improved in the library (personally I would take it one step further so that end-users and not just IT projects have easy discovery and access to centralised reference data). The lessons he took from this were that we all need to "learn to love" enterprise messaging if we are to get to the top down publish once/consume often nirvana, where consuming systems can pick up new data and functionality without significant (if any) changes (might be worth a view of this post on this topic). He also mentioned the role of metadata in automating reconciliation where that needed to occur.
Colin then mentioned that allocation of costs of reference data to consumers is still a hot topic, one where reference data lags behind the market data permissioning/metering insisted upon by exchanges. Related to this Colin thought that the role of the Chief Data Officer to enforce policies was important, and the need for the role was being driven by regulation. He said that the true costs of a tactical, non-standard approach need to be identifiable (quantifying the size of the stick I guess) but that he had found it difficult to eliminate the tactical use of pricing data sourced for the front office. He ended by mentioning that there needs to be a coming together of market data and reference data since operations staff are not doing quantitative valuations (e.g. does the theoretical price of this new bond look ok?) and this needs to be done to ensure better data quality and increased efficiency (couldn't agree more, have a look at this article and this post for a few of my thoughts on the matter). Overall very good speaker with interesting, practical examples to back up the key points he was trying to get across.
Posted by Brian Sentance | 7 October 2013 | 12:12 pm
Great event from PRMIA on Macro Stress Testing at Moody's last night. A few quick highlights:
- The role of the regulators is now not only to be sure that banks have enough capital to withstand a severe downturn, but that the banks have enough capital once the downturn has happened.
- The Fed have a new whitepaper coming out in July on "Effective Capital Adequacy Process" that covers 7 different aspects from risk management foundations through to governance.
- CCAR stress tests are thought by regulators to be easier to understand (e.g. this happens we get this loss) rather than VAR/risk sensitivities that do not capture tail risk.
- Hedges that do not behave as hedges under times of stress are a major area of concern.
- Assumptions of the stress tests, such as the second half of 2008 occurring instantaneously to the trading book, are not reasonable/representative, but it is hard to come up with credible/pragmatic alternatives.
- One of the speakers put forward the following list of positives about the stress tests:
- Restoration of market/public confidence in banks
- Determination of the appropriate levels of capital adequacy
- Understanding of risk profile
- Identification of tail risks
- Curbing of risk taking
- Incentivising behaviours
- Whilst banks and regulators are often in conflict over capital adequacy, banks do implement their own internal stress tests and do have a commercial interest in doing this well.
- One panelist said that "the best hedge is to sell"
- Some banks have switched accountancy standards to game capital requirements, and there was some later debate that Risk Weighted Assets were a controversial part of the calculations when analyzed against the NYU Stern V-Lab stress testing.
- There is a danger that CCAR and stress testing drives or becomes an industry in itself, which is not good for markets, the banking system or the economy as a whole.
- There was some debate about liquidity risk as it relates to solvency, and that it should be much more integrated with the stress tests. The panel expressed interest at the forthcoming CLAR stress tests and how it relates to CCAR.
- The panel thought that the Federal Reserve is effectively challenging each bank to understand its own balance sheet better than the Fed can.
- Given the state of systems and data management at many banks, this was a big challenge.
- The panel thought that more open access to the data regulators are collecting would be great for academics to analyze given some of the big data technologies available to analyze such large datasets.
- One speaker put forward that only a subsidized industry such as banking could afford to treat data so poorly.
Great event, knowledgeable speakers with strong opinions and good wine/food afterwards (thanks Moody's!).
Posted by Brian Sentance | 26 June 2013 | 4:51 pm
Guest post today from Matthew Berry of Bedrock Valuation Advisors, discussing Libor vs OIS based rate benchmarks. Curves and curve management are a big focus for Xenomorph's clients and partners, so great that Matthew can shed some further light on the current debate and its implications:
New Benchmark Proposal’s Significant Implications for Data Management
During the 2008 financial crisis, problems posed by discounting future cash flows using Libor rather than the overnight index swap (OIS) rate became apparent. In response, many market participants have modified systems and processes to discount cash flows using OIS, but Libor remains the benchmark rate for hundreds of trillions of dollars worth of financial contracts. More recently, regulators in the U.S. and U.K. have won enforcement actions against several contributors to Libor, alleging that these banks manipulated the benchmark by contributing rates that were not representative of the market, and which benefitted the banks’ derivative books of business.
In response to these allegations, the CFTC in the U.S. and the Financial Conduct Authority (FCA) in the U.K. have proposed changes to how financial contracts are benchmarked and how banks manage their submissions to benchmark fixings. These proposals have significant implications for data management.
The U.S. and U.K. responses to benchmark manipulation
In April 2013, CFTC Chairman Gary Gensler delivered a speech in London in which he suggested that Libor should be retired as a benchmark. Among the evidence he cited to justify this suggestion:
- Liquidity in the unsecured inter-dealer market has largely dried up.
- The risk implied by contributed Libor rates has historically not agreed with the risk implied by credit default swap rates. The Libor submissions were often stale and did not change, even if the entity’s CDS spread changed significantly. Gensler provided a graph to demonstrate this.
Gensler proposed to replace Libor with either the OIS rate or the rate paid on general collateral repos. These instruments are more liquid and their prices more readily-observable in the market. He proposed a period of transition during which Libor is phased out while OIS or the GC repo rate is phased in.
In the U.K., the Wheatley Report provided a broad and detailed review of practices within banks that submit rates to the Libor administrator. This report found a number of deficiencies in the benchmark submission and calculation process, including:
- The lack of an oversight structure to monitor systems and controls at contributing banks and the Libor administrator.
- Insufficient use of transacted or otherwise observable prices in the Libor submission and calculation process.
The Wheatley Report called for banks and benchmark administrators to put in place rigorous controls that scrutinize benchmark submissions both pre and post publication. The report also calls for banks to store an historical record of their benchmark submissions and for benchmarks to be calculated using a hierarchy of prices with preference given to transacted prices, then prices quoted in the market, then management’s estimates.
Implications for data management
The suggestions for improving benchmarks made by Gensler and the Wheatley Report have far-reaching implications for data management.
If Libor and its replacement are run in parallel for a time, users of these benchmark rates will need to store and properly reference two different fixings and forward curves. Without sufficiently robust technology, this transition period will create operational, financial and reputational risk given the potential for users to inadvertently reference the wrong rate. If Gensler’s call to retire Libor is successful, existing contracts may need to be repapered to reference the new benchmark. This will be a significant undertaking. Users of benchmarks who store transaction details and reference rates in electronic form and manage this data using an enterprise data management platform will mitigate risk and enjoy a lower cost to transition.
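As a rough illustration of what "properly referencing" two parallel fixings could look like, here is a minimal Python sketch. It assumes a simple in-memory store keyed by benchmark name and fixing date; the `FixingStore` class, its methods and the sample rates are hypothetical and are not taken from any particular data management platform.

```python
from datetime import date
from typing import Dict, Tuple


class FixingStore:
    """Hypothetical store holding fixings for several benchmarks side by side."""

    def __init__(self) -> None:
        # Key on (benchmark, date) so a Libor fixing can never be
        # returned when an OIS fixing was asked for, and vice versa.
        self._fixings: Dict[Tuple[str, date], float] = {}

    def add(self, benchmark: str, fixing_date: date, rate: float) -> None:
        self._fixings[(benchmark, fixing_date)] = rate

    def get(self, benchmark: str, fixing_date: date) -> float:
        key = (benchmark, fixing_date)
        if key not in self._fixings:
            # Fail loudly rather than silently falling back to another benchmark.
            raise KeyError(f"No {benchmark} fixing stored for {fixing_date}")
        return self._fixings[key]


# Illustrative values only, not real market data.
store = FixingStore()
store.add("LIBOR_3M", date(2013, 6, 24), 0.0027)
store.add("OIS", date(2013, 6, 24), 0.0009)
```

The design point is simply that every lookup has to name its benchmark explicitly, so a contract referencing one rate cannot inadvertently pick up the other during the transition period.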
Within the submitting banks and the benchmark administrator, controls must be implemented that scrutinize benchmark submissions both pre and post publication. These controls should be exceptions-based and easily scripted so that monitoring rules and tolerances can be adapted to changing market conditions. Banks must also have in place technology that defines the submission procedure and automatically selects the optimal benchmark submission. If transacted prices are available, these should be submitted. If not, quotes from established market participants should be submitted. If these are not available, management should be alerted that it must estimate the benchmark rate, and the decision-making process around that estimate should be documented.
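The submission hierarchy and exceptions-based control described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the function and field names are hypothetical, taking a simple average of the available prices is an assumption made for the example, and the tolerance check against the previous fixing is just one possible monitoring rule.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Submission:
    rate: Optional[float]  # None when management must supply an estimate
    source: str            # "transacted", "quoted" or "management_estimate"
    needs_alert: bool      # True when management should be alerted


def select_submission(transacted: Sequence[float],
                      quoted: Sequence[float]) -> Submission:
    """Pick a submission using the hierarchy: transacted, then quoted, then estimate."""
    if transacted:
        # Assumption: use a simple average of transacted prices.
        return Submission(sum(transacted) / len(transacted), "transacted", False)
    if quoted:
        # Assumption: use a simple average of market quotes.
        return Submission(sum(quoted) / len(quoted), "quoted", False)
    # Nothing observable: alert management, whose estimate must be documented.
    return Submission(None, "management_estimate", True)


def is_exception(rate: float, previous_fixing: float, tolerance: float) -> bool:
    """Flag a submission that moves more than `tolerance` from the previous fixing."""
    return abs(rate - previous_fixing) > tolerance
```

For example, `select_submission([], [0.003])` falls through to the quoted level, while `select_submission([], [])` returns a submission with `needs_alert` set. Keeping `tolerance` as a parameter reflects the point that monitoring rules should be easy to adapt to changing market conditions.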
These improvements to the benchmark calculation process will, in Gensler’s words, “promote market integrity, as well as financial stability.” Firms that effectively utilize data management technology, such as Xenomorph's TimeScape, to implement these changes will manage the transition to a new benchmark regime at a lower cost and with a higher likelihood of success.
Posted by Brian Sentance | 25 June 2013 | 1:32 pm
Numerix ran a great event on Thursday morning over at Microsoft's offices here in New York. "The Road to Achieving a Unified View of Risk" was introduced by Paul Rowady of the TABB Group. As at our holiday event last December, Paul is a great speaker and trying to get him to stop talking is the main (positive) problem of working with him (his typical ebullience was also heightened by his appearance in the Wall Street Journal on Thursday, apparently involving nothing illegal he assured me and even about which his mother phoned him during his presentation...). Paul started by saying that in their end of year review with his colleagues Larry Tabb and Adam Sussman, he suggested that Tabb Group needed to put more into developing the risk management thought leadership, which had led to today's introduction and the work Tabb Group have been doing with Numerix.
Having been involved in financial markets in Chicago, Paul is very bullish about the risk management capabilities of the funds and prop trading shops of the exchange traded options markets from days of old, and said that these risk management capabilities are now needed and indeed coming to the mainstream financial markets. Put another way, post crisis the need for a holistic view on risk has never been stronger. Considering bilateral OTC derivatives and the move towards central clearing, Paul said that he had been thinking that calculations such as CVA would eventually become as extinct as a dodo. However on using some data from the DTCC trade repository, he found that there are still some $65trillion notional of uncleared bilateral trades in the market, and that these will take a further 30 years to expire. Looking at swaptions alone the notional uncleared was $6trillion, and so his point was that bilateral OTC and their associated risks will be around for some time yet.
Paul put forward some slides showing back, middle and front-offices along different siloed business lines, and explained that back in the day when margins were fat and times were good, each unit could be run independently, with no overall view of risk possible given the range of siloed systems and data. In passing Paul also mentioned that one bank he had spoken to had 6,000 separate systems to support on just the banking side, let alone capital markets. Obviously post crisis this has changed, with pressures to reduce operational costs being a key driver at many institutions, and currently only valuation/reference data (+2.4%) and risk management (+1.2%) having increased budget spend across the market in 2013. Given operational costs and regulation such as CVA, risk management is having to move from being an end of day, post-trade process to being pre- and post-trade at intraday frequency. Paul said that not only must consistent approaches to data and analytics be taken across back, middle and front office in each business unit but now an integrated view of risk across business units must be taken (echoes of an earlier event with Numerix and PRMIA). Considering consistent analytics, Paul mentioned his paper "The Risk Analytics Library" but suggested that "libraries" of everything were needed, so not just analytics, but libraries of data (data management anyone?), metadata, risk models etc.
Paul asked Ricardo Martinez of Deloitte for an update on the regulatory landscape at the moment, and Ricardo responded by focusing down on the derivatives aspects of Dodd-Frank. He first pointed out that even after a number of years the regulation was not yet finalized around collateral and clearing. A good point he made was that whilst the focus in the market at the moment is on compliance, he feels that the consequences of the regulation will ripple on over the next 5 years in terms of margining and analytics.
Some panel members disagreed with Paul over the premise that bilateral exotic trades will eventually disappear. Their point was that the needs of pension funds and other clients are very specific and there will always be a need for structured products, despite the capital cost incentives to move everything onto exchanges/clearing. Paul countered by saying that he didn't disagree with this, but the reason for suggesting that the exotics industry may die is trying to find institutions that can warehouse the risk of the trade.
Satyam Kancharla of Numerix spoke next. Satyam said that two main changes struck him in the market at the moment. One was the adjustment to a mandated market structure with clearing, liquidity and capital changes coming through from the regulators. The other was increased operating efficiency for investment banks. Whilst it is probable that no investment bank would ever get to the operational efficiency of a retail business like Walmart, this was however the direction of travel with banks looking at how to optimize collateral, optimize trading venues etc.
Satyam put forward that computing power is still adhering to Moore's law, and that as a result some things are possible now that were not before, and that a centralized architecture built on this compute power is needed, but just because it is centralized does not mean that it is too inflexible to deal with each business unit's needs. Coming back to earlier comments made by the panel, he put forward that a lot of quants are involved in simply re-inventing the wheel, to which Paul added that quants were very experienced in using words like "orthogonal" to confuse mere mortals like him and justify the repetition of business functionality available already (from Numerix obviously, but more of that later). Satyam said that some areas of model development were more mature than others, and that quants should not engage in innovation for innovation's sake. Satyam also made a passing reference to the continuing use of Excel and VBA as the main tool of choice in the front office, suggesting that we still have some way to go in terms of IT maturity (hobby-horse topic of mine, for example see post).
Prompted by an audience question around data and analytics, Ricardo said that the major challenge towards sharing data was not technical but cultural. Against a background where maybe 50% of investment in technology was regulation-related, he said that there was no shortage of business ideas for P&L in the emerging "mandated" markets of the future, but many of these ideas required wholesale shifts in attitudes at the banks in terms of co-operation across departments and from front to back office.
Satyam said that he thought of data and analytics as two sides of the same coin (could not agree more, but then again I would say that) in that analytics generate derived data which needs just as much management as the raw data. He said that it should be possible to have systems and architectures that manage the duality of data and analytics well, and these architectures did not have to imply rigidity and inflexibility in meeting individual business needs.
There was then some debate of trade repositories for derivatives, where the panel discussed the potential conflict between the US regulators wanting competition in this area, but as Paul suggested having competition between DTCC, ICE, Bloomberg, LCH Clearnet etc also led to fragmentation. As such Paul put it that the regulators would need to "boil the ocean" to understand the exposures in the market. Ricardo also mentioned some of the current controversy over who owns the data in the trade repository. One of the panelists suggested that we should also keep an eye open to China and not necessarily get totally tied up in what is happening in "our" markets. The main point was that a huge economy such as China's could not survive without a sophisticated capital market to support it, and that China was not asleep in this regard.
A good audience question came from Don Wesnofske who asked how best to cope with the situation where an institution is selling derivatives based on one set of models, and the client is using another set of models to value the same trade. So the selling institution decides to buy/build a similar model to the client too, and Don wondered how the single analytic library practically helped this situation where I could price on one model and report my P&L using another. One panelist responded that it was mostly the assumptions behind each model that determined differences in price, and that heterogeneous models and hence prices were needed for a market to function correctly. Another concurred on this and suggested there needed to be an "officially blessed" model within an institution against which valuations are compared. Amusingly for the audience, Steve O'Hanlon (CEO of Numerix) piped up that the problem was easy to resolve in that everyone should use Numerix's models.
Mike Opal of Microsoft closed the event with his presentation on data, analytics and cloud computing. Mike started by illustrating that the number of internet-enabled devices passed the human population of the world in 2008 and by 2020 the number of devices would be 50 billion. He showed that the amount of data in the world was 0.8ZB (zettabytes) in 2009, and is projected to reach 8ZB by 2015 and 35ZB by 2020, driven primarily by the growth in internet-enabled devices. Mike also said that the Prism project, so much in the news of late, involved the construction of a server farm near Salt Lake City of 5ZB in size, so what the industry (in this case the NSA) is trying to do is unimaginable if we were to go back only a few years. He said that Microsoft itself was utterly committed to cloud computing, with 8 datacenters globally but 20 more in construction, at a cost of $500 million per center (I recently saw a datacentre in Redmond, totally unlike what I expected with racks pre-housed in lorry containers, and the containers just unloaded within a gigantic hangar and plugged in - the person showing me around asked me who the busiest person was at a Microsoft data center and the answer was the truck drivers...)
Talking of "Big Data", he first gave the now-standard disclaimer (as I have, I acknowledge) that he disliked the phrase. I thought he made a good point that Big Data is really about "Small Data", in that a lot of it is about having the capacity to analyze at a tiny granular level within huge datasets (maybe journalists will rename it? No, don't think so). He gave a couple of good client case studies, one for Westpac and one for Phoenix on uses of HPC and cloud computing in financial services. He also mentioned the Target retailing story about Big Data, which if you haven't caught it is worth a read. One audience question asked him again how committed Microsoft was to cloud computing given competition from Amazon, Apple and Google. Mike responded that he had only joined Microsoft a year or two back, and in part this was because he believed Microsoft had to succeed and "win" the cloud computing market given that cloud was not the only way to go for these competitors, whereas Microsoft (being a software company) had to succeed at cloud (so far Microsoft have been very helpful to us in relation to Azure, but I guess Amazon and others have other plans.)
In summary a great event from Numerix with good discussions and audience interaction - helped for me by the fact that much of what was said (centralization with flexibility, duality of data and analytics, libraries of everything etc) fits with what Xenomorph and partners like Numerix are delivering for clients.
Posted by Brian Sentance | 17 June 2013 | 8:23 pm
There are (occasionally!) some good questions and conversations going on within some of the LinkedIn groups. One recently was around what use cases there are for unstructured data within banking and finance, and I found this comment from Tom Deutsch of IBM to be quite insightful and elegant (at least better than I could have written it...) on what the main types of unstructured data analysis there are:
- Listening for the first time
- Listening better
- Adding context
Listening for the first time is really just making use of what you already probably capture to hear what is being said (or navigated)
Listening better is making sure you are actually both hearing and understanding what is being said. This is sometimes non-trivial as it involves accuracy issues and true (not marketing hype) NLP technologies and integrating multiple sources of information
Adding context is when you either add structured data to the above or add the above to structured data, usually to round out or more fully inform models (or sometimes just build new ones).
Posted by Brian Sentance | 10 May 2013 | 2:17 pm
I went over to NYU Poly in Brooklyn on Friday of last week for their Big Data Finance Conference. To get a slightly negative point out of the way early, I guess I would have to pose the question "When is a big data conference, not a big data Conference?". Answer: "When it is a time series analysis conference" (sorry if you were expecting a funny answer...but as you can see, then what I occupy my time with professionally doesn't naturally lend itself to too much comedy). As I like time series analysis, then this was ok, but certainly wasn't fully "as advertised" in my view, but I guess other people are experiencing this problem too.
Maybe this slightly skewed agenda was due to the relative newness of the topic, the newness of the event and the temptation for time series database vendors to jump on the "Big Data" marketing bandwagon (what? I hear you say, we vendors jumping on a buzzword marketing bandwagon, never!...). Many of the talks were about statistical time series analysis of market behaviour and less about what I was hoping for, which was new ways in which empirical or data-based approaches to financial problems might be addressed through big data technologies (as an aside, here is a post on a previous PRMIA event on big data in risk management as some additional background). There were some good attempts at getting a cross-discipline fertilization of ideas going at the conference, but given the topic then representatives from the mobile and social media industries were very obviously missing in my view.
So as a complete counterexample to the two paragraphs above, the first speaker (Kevin Atteson of Morgan Stanley) at the event was very much on theme with the application of big data technologies to the mortgage market. Apparently Morgan Stanley had started their "big data" analysis of the mortgage market in 2008 as part of a project to assess and understand more about the potential losses that Fannie Mae and Freddie Mac faced due to the financial crisis.
Echoing some earlier background I had heard on mortgages, one of the biggest problems in trying to understand the market according to Kevin was data, or rather the lack of it. He compared mortgage data analysis to "peeling an onion" and said that going back to the time of the crisis, mortgage data at an individual loan level was either not available or of such poor quality as to be virtually useless (e.g. hard to get accurate ZIP code data for each loan). Kevin described the mortgage data set as "wide" (lots of loans with lots of fields for each loan) rather than "deep" (lots of history), with one of the main data problems being trying to match nearest-neighbour loans. He mentioned that only post crisis have Fannie and Freddie been ordered to make individual loan data available, and that there is still no readily available linkage data between individual loans and mortgage pools (some presentations from a recent PRMIA event on mortgage analytics are at the bottom of the page here for interested readers).
Kevin said that Morgan Stanley had rejected the use of Hadoop, primarily due to write throughput capabilities, which Kevin indicated was a limiting factor in many big data technologies. He indicated that for his problem type he still believed their infrastructure to be superior to even the latest incarnations of Hadoop. He also mentioned the technique of having 2x redundancy or more on the data/jobs being processed, aimed not just at failover but also at using whichever instance of a job finished first. Interestingly, he also added that Morgan Stanley's infrastructure engineers have a policy of rebooting servers in the grid even during the day/use, so fault tolerance was needed for both unexpected and entirely deliberate hardware node unavailability.
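For the techies, the "run it twice and take whichever finishes first" idea can be sketched in a few lines of Python. This is only an illustration and not a description of Morgan Stanley's grid: the names run_replica and first_finished are hypothetical, the random sleep stands in for uneven node speed, and local threads stand in for grid nodes.

```python
import concurrent.futures as cf
import random
import time


def run_replica(job_id: int, replica: int) -> tuple[int, int]:
    """Hypothetical job: the random sleep mimics nodes of uneven speed."""
    time.sleep(random.uniform(0.01, 0.05))
    return job_id, replica


def first_finished(job_id: int, redundancy: int = 2) -> tuple[int, int]:
    """Submit `redundancy` copies of one job and return the first good result."""
    pool = cf.ThreadPoolExecutor(max_workers=redundancy)
    futures = [pool.submit(run_replica, job_id, r) for r in range(redundancy)]
    try:
        # as_completed yields futures in the order they finish, not submit order.
        for fut in cf.as_completed(futures):
            if fut.exception() is None:
                return fut.result()
        raise RuntimeError("all replicas failed")
    finally:
        # Do not wait for the slower copies; cancel any that have not started.
        pool.shutdown(wait=False, cancel_futures=True)


job, winner = first_finished(job_id=7)
print(f"job {job} answered by replica {winner}")
```

The same loop covers both cases Kevin mentioned: a replica that fails (or whose node is rebooted) is simply skipped, and a replica that is merely slow is ignored once a faster copy has answered.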
Other highlights from the day:
- Dennis Shasha had some interesting ideas on using matrix algebra for reducing down the data analysis workload needed in some problems - basically he was all for "cleverness" over simply throwing compute power at some data problems. On a humorous note (if you are not a trader?), he also suggested that some traders had "the memory of a fruit-fly".
- Robert Almgren of QuantitativeBrokers was an interesting speaker, talking about how his firm had done a lot of analytical work in trying to characterise possible market responses to information announcements (such as Friday's non-farm payroll announcement). I think Robert was not so much trying to predict the information itself, but rather trying to predict likely market behaviour once the information is announced.
- Scott O'Malia of the CFTC was an interesting speaker during the morning panel. He again acknowledged some of the recent problems the CFTC had experienced in terms of aggregating/analysing the data they are now receiving from the market. I thought his comment on the twitter crash was both funny and brutally pragmatic with him saying "if you want to rely solely upon a single twitter feed to trade then go ahead, knock yourself out."
- Eric Vanden Eijnden gave an interesting talk on "detecting Black Swans in Big Data". Most of the examples were from current detection/movement in oceanography, but seemed quite analogous to "regime shifts" in the statistical behaviour of markets. Main point seemed to be that these seemingly unpredictable and infrequent events were predictable to some degree if you looked deep enough in the data, and in particular that you could detect when the system was on a possible likely "path" to a Black Swan event.
One of the most interesting talks was by Johan Walden of the Haas Business School, on the subject of "Investor Networks in the Stock Market". Johan explained how they had used big data to construct a network model of all of the participants in the Turkish stock exchange (both institutional and retail) and in particular how "interconnected" each participant was with other members. His findings seemed to support the hypothesis that the more "interconnected" the investor (at the centre of many information flows rather than at the edges) the more likely that investor would demonstrate superior return levels to the average. I guess this is a kind of classic transferral of some of the research done in social networking, but very interesting to see it applied pragmatically to financial markets, and I would guess an area where a much greater understanding of investor behaviour could be gleaned. Maybe Johan could do with a little geographic location data to add to his analysis of how information flows.
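As a rough illustration of what "interconnected" might mean in practice, here is a minimal Python sketch using degree centrality, i.e. the share of other participants an investor is directly linked to. The talk did not say which measure was used, so degree centrality is my assumption, and the investor labels and links below are made up purely for the example.

```python
from collections import defaultdict

# Hypothetical links: each pair is two investors assumed to share an information flow.
links = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]


def degree_centrality(edges):
    """Fraction of the other participants each investor is directly linked to."""
    neighbours = defaultdict(set)
    for left, right in edges:
        neighbours[left].add(right)
        neighbours[right].add(left)
    others = len(neighbours) - 1
    return {node: len(adjacent) / others for node, adjacent in neighbours.items()}


centrality = degree_centrality(links)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])  # A 0.75
```

In this toy network investor A sits at the centre of most of the links while E hangs off the edge, which is the kind of distinction the hypothesis above relies on.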
So overall a good day with some interesting talks - the statistical presentations were challenging to listen to at 4pm on a Friday afternoon but the wine afterwards compensated. I would also recommend taking a read through a paper by Charles S. Tapiero on "The Future of Financial Engineering" for one of the best discussions I have so far read about how big data has the potential to change and improve upon some of the assumptions and models that underpin modern financial theory. Coming back to my starting point in this post on the content of the talks, I liked the description that Charles gives of traditional "statistical" versus "data analytics" approaches, and some of the points he makes about data immediately inferring relationships without the traditional "hypothesize, measure, test and confirm-or-not" were interesting, both in favour of data analytics and in cautioning against unquestioning belief in the findings from data (feels like this post from October 2008 is a timely reminder here). With all of the hype and the hope around the benefits of big data, maybe we would all be wise to remember this quote by a certain well-known physicist: "No amount of experimentation can ever prove me right; a single experiment can prove me wrong."
Posted by Brian Sentance | 7 May 2013 | 1:46 pm
Background - I went along to my first PRMIA event in Stamford, CT last night, with the rather grandiose title of "The Anthropology, Sociology, and Epistemology of Risk". Stamford is about 30 miles north of Manhattan and is the home to major offices of a number of financial markets companies such as Thomson Reuters, RBS and UBS (who apparently have the largest column-less trading floor in the world at their Stamford headquarters - particularly useful piece of trivia for you there...). It also happens to be about 5 minutes drive/train journey away from where I now live, so easy for me to get to (thanks for another useful piece of information I hear you say...). Enough background, more on the event which was a good one with five risk managers involved in an interesting and sometimes philosophical discussion on fundamentally what "risk management" is all about.
Introduction - Marc Groz who heads the Stamford Chapter of PRMIA introduced the evening and started by thanking Barry Schwimmer for allowing PRMIA to use the Stamford Innovation Centre (the Old Town Hall) for the meeting. Henrik Neuhaus moderated the panel, and started by outlining the main elements of the event title as a framework for the discussion:
- Anthropology - risk management is to what purpose?
- Sociology - how does risk management work?
- Epistemology - what knowledge is really contained within risk management?
Henrik started by taking a passage about anthropology and replacing human "development" with "risk management" which seemed to fit ok, although the angle I was expecting was much more about human behaviour in risk management than where Henrik started. Henrik asked the panel what results they had seen from risk management and what did that imply about risk management? The panelists seemed a little confused or daunted by the question prompting one of them to ask "Is that the question?".
Business Model and Risk Culture - Elliot Noma dived in by responding that the purpose of risk management obviously depended very much on what are the institutional goals of the organization. He said that it was as much about what you are forced to do and what you try to do in risk management. Elliot said that the sell-side view of risk management was very regulatory and capital focused, whereas mutual funds are looking more at risk relative to benchmarks and performance attribution. He added that in the alternatives (hedge-fund) space then there were no benchmarks and the focus was more about liquidity and event risk.
Steve Greiner said that it was down to the investment philosophy and how risk is defined and measured. He praised some asset managers where the risk managers sit across from the portfolio managers and are very much involved in the decision making process.
Henrik asked the panel whether any of the panel had ever defined a “mission statement” for risk management. Marc Groz chipped in that he remembered that he had once defined one, and that it was very different from what others in the institution were expecting and indeed very different from the risk management that he and his department subsequently undertook.
Mark Szycher (of GM Pension Fund) said that risk management split into two areas for him, the first being the symmetrical risks where you need to work out the range of scenarios for a particular trade or decision being taken. The second was the more asymmetrical risks (i.e. downside only) such as those found in operational risk where you are focused on how best to avoid them happening.
Micro Risk Done Well - Santa Federico said that he had experience of some of the major problems experienced at institutions such as Merrill Lynch, Salomon Brothers and MF Global, and that he thought risk management was much more of a cultural problem than a technical one. Santa said he thought that the industry was actually quite good at the micro (trade, portfolio) risk management level, but obviously less effective at the large systematic/economic level. Mark asked Santa what was the nature of the failures he had experienced. Santa said that the risks were well modeled, but maybe the assumptions around macro variables such as the housing market proved to be extremely poor.
Keep Dancing? - Henrik asked the panel what might be done better? Elliot made the point that some risks are just in the nature of the business. If a risk manager did not like placing a complex illiquid trade and the institution was based around trading in illiquid markets then what is a risk manager to do? He quoted the Citi executive who said “whilst the music is still playing we have to dance”. Again he came back to the point that the business model of the institution drives its culture and the emphasis of risk management (I guess I see what Elliot was saying but taken one way it implied that regardless of what was going on risk management needs to fit in with it, whereas I am sure that he meant that risk managers must fit in with the business model mandated to shareholders).
Risk Attitudes in the USA - Mark said that risk managers need to recognize that the improbable is maybe not so improbable and should be more prepared for the worst rather than risk management under “normal” market and institutional behavior. Steven thought that a cultural shift was happening, where not losing money was becoming as important to an organization as gaining money. He said that in his view, Europe and Asia had a stronger risk culture than in the United States, with much more consensus, involvement and even control over the trading decisions taken. Put another way, the USA has more of a culture of risk taking than Europe. (I have my own theories on this. Firstly I think that the people are generally much more risk takers in the USA than in UK/Europe, possibly influenced in part by the relative lack of underlying social safety net – whilst this is not for everyone, I think it produces a very dynamic economy as a result. Secondly, I do not think that cultural desire in the USA for the much admired “presidential” leader necessarily is the best environment for sound, consensus based risk management. I would also like to acknowledge that neither of my two points above seem to have protected Europe much from the worst of the financial crisis, so it is obviously a complex issue!).
Slaves to Data? - Henrik asked whether the panel thought that risk managers were slaves to data? He expanded upon this by asking what kinds of firms encourage qualitative risk management and not just risk management based on Excel spreadsheets? Santa said that this kind of qualitative risk management occurred at a business level and less so at a firm wide level. In particular he thought this kind of culture was in place at many hedge funds, and less so at banks. He cited one example from his banking career in the 1980's, where his immediate boss was shouted off the trading floor by the head of desk, saying that he should never enter the trading floor again (oh those were the days...).
Sociology and Credibility - Henrik took a passage on the historic development of women's rights and replaced the word "women" with "risk management" to illustrate the challenges risk management is facing with trying to get more say and involvement at financial institutions. He asked who should the CRO report to? A CEO? A CIO? Or a board member? Elliot responded by saying this was really an issue around credibility with the business for risk managers and risk management in general. He made the point that often Excel and numbers were used to establish credibility with the business. Elliot added that risk managers with trading experience obviously had more credibility, and to some extent where the CRO reported to was dependent upon the credibility of risk management with the business.
Trading and Risk Management Mindsets - Elliot expanded on his previous point by saying that the risk management mindset thinks more in terms of unconditional distributions and tries to learn from history. He contrasted this with the "conditional mindset" of a trader, where the time horizon forwards (and backwards) is rarely longer than a few days and the belief that a trade will work today given it worked yesterday is strong. Elliot added that in assisting the trader, the biggest contribution risk managers can make is more to be challenging/helpful on the qualitative side rather than just quantitative.
Compensation and Transactions - Most of the panel seemed to agree that compensation package structure was a huge influencer in the risk culture of an organisation. Mark touched upon a pet topic of mine, which is that it is very hard for a risk manager to gain credibility (and compensation) when what risk management is about is what could happen as opposed to what did happen. A risk manager blocking a trade due to some potentially very damaging outcomes will not gain any credibility with the business if the trading outcome for the suggested trade just happened to come out positive. There seemed to be consensus here that some of the traditional compensation models that were based on short-term transactional frequency and size were ill-formed (given the limited downside for the individual), and whilst the panel reserved judgement on the effectiveness of recent regulation, moves towards longer-term compensation were to be welcomed from a risk perspective.
MF Global and Business Models - Santa described some of his experiences at MF Global, where Corzine moved what was essentially a broker into taking positions in European Sovereign Bonds. Santa said that the risk management culture and capabilities were not present to be robust against senior management for such a business model move. Elliot mentioned that he had been courted for trades by MF Global and had been concerned that they did not offer electronic execution and told him that doing trades through a human was always best. Mark said that in the area of pension fund management there was much greater fiduciary responsibility (i.e. behave badly and you will go to jail) and maybe that kind of responsibility had more of a place in financial markets too. Coming back to the question of who a CRO should report to, Mark also said that questions should be asked to seek out those who are 1) less likely to suffer from the "agency" problem of conflicts of interest and on a related note those who are 2) less likely to have personal biases towards particular behaviours or decisions.
Santa said that in his opinion hedge funds in general had a better culture where risk management opinions were heard and advice taken. Mark said that risk managers who could get the business to accept moral persuasion were in a much stronger position to add value to the business rather than simply being able to "block" particular trades. Elliot cited one experience he had where the traders under his watch noticed that a particular type of trade (basis trades) did not increase their reported risk levels, and so became more focussed on gaming the risk controls to achieve high returns without (reported) risk. The panel seemed to be in general agreement that risk managers with trading experience were more credible with the business but also more aware of the trader mindset and behaviors.
Do we know what we know? - Henrik moved to his third and final subsection of the evening, asking the panel whether risk managers really know what they think they know. Elliot said that traders and risk managers speak a different language, with traders living in the now, thinking only of the implications of possible events such as those we have seen with Cyprus or the fiscal cliff, where the risk management view was much less conditioned and more historical. Steven re-emphasised the earlier point that risk management at this micro trading level was fine but this was not what caused events such as the collapse of MF Global.
Rational argument isn't communication - Santa said that most risk managers come from a quant (physics, maths, engineering) background and like structured arguments based upon well understood rational foundations. He said that this way of thinking was alien to many traders and as such it was a communication challenge for risk managers to explain things in a way that traders would actually put some time to considering. On the modelling side of things, Santa said that sometimes traders dismissed models as being "too quant" and sometimes traders followed models all too blindly without questioning or understanding the simplifying assumptions they are based on. Santa summarised by saying that risk management needs to be intuitive for traders and not just academically based. Mark added that a quantitative focus can sometimes become too narrow (modeler's manifesto anyone?) and made the very profound point that unfortunately precision often wins over relevance in the creation and use of many models. Steven added that traders often deal with absolutes, such as knowing the spread between two bonds to the nearest basis point, whereas a risk manager approaching them with a VaR number really means that this is the estimated VaR which really should be thought to be within a range of values. This is alien to the way traders think and hence harder to explain.
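To make Steven's point about VaR being an estimate within a range a little more concrete, here is a minimal Python sketch. It is entirely my own illustration rather than anything shown on the night: the returns are simulated, the function names are hypothetical, and a simple bootstrap (re-sampling the same history with replacement) is just one assumed way of putting a range around a historical VaR number.

```python
import random


def historical_var(returns, level=0.99):
    """Loss threshold exceeded in roughly (1 - level) of the observations."""
    losses = sorted(-r for r in returns)
    index = min(int(level * len(losses)), len(losses) - 1)
    return losses[index]


def bootstrap_var_range(returns, level=0.99, n_boot=500, seed=0):
    """5th and 95th percentile of VaR estimates from re-sampled histories."""
    rng = random.Random(seed)
    estimates = sorted(
        historical_var(rng.choices(returns, k=len(returns)), level)
        for _ in range(n_boot)
    )
    return estimates[int(0.05 * n_boot)], estimates[int(0.95 * n_boot) - 1]


# Simulated daily returns (assumption: normal, 1% daily volatility).
rng = random.Random(42)
sample_returns = [rng.gauss(0.0, 0.01) for _ in range(500)]

point = historical_var(sample_returns)
low, high = bootstrap_var_range(sample_returns)
print(f"99% VaR estimate {point:.4f}, bootstrap range {low:.4f} to {high:.4f}")
```

The single number is what tends to get quoted to the desk, whereas the spread between the low and high values is the uncertainty that Steven was saying is hard to communicate.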
Unanticipated Risk - An audience member asked whether risk management should focus mainly on unanticipated risks rather than "normal" risks. Elliot said that in his trading he was always thinking and checking whether the markets were changing or continuing with their recent near-term behaviour patterns. Steven said that history was useful to risk management when markets were "normal", but in times of regime shifts this was not the case and cited the example of the change in markets when Mario Draghi announced that the ECB would stand behind the Euro and its member nations.
Risky Achievements - Henrik closed the panel by asking each member what they thought was their own greatest achievement in risk management. Elliot cited a time when he identified that a particular hedge fund had a relatively inconspicuous position/trade that he identified as potentially extremely dangerous and was proved correct when the fund closed down due to this. Steven said he was proud of some good work he and his team did on stress testing involving Greek bonds and Eurozone. Santa said that some of the work he had done on portfolio "risk overlays" was good. Mark ended the panel by saying that he thought his biggest achievement was when the traders and portfolio managers started to come to the risk management department to ask opinions before placing key trades. Henrik and the audience thanked the panel for their input and time.
An Insured View - After the panel closed I spoke with an actuary who said that he had greatly enjoyed the panel discussions but was surprised that when talking of how best to support the risk management function in being independent and giving "bad" news to the business, the role of auditors was not mentioned. He said he felt that auditors were a key support to insurers in ensuring any issues were allowed to come to light. So food for thought there as to whether financial markets can learn from other industry sectors.
Summary - great evening of discussion, only downside being the absence of wine once the panel had closed!
Posted by Brian Sentance | 25 April 2013 | 9:27 pm
Good post from Jim Jockle over at Numerix - main theme is around having an "analytics" strategy in place in addition to (and probably as part of) a "Big Data" strategy. Fits strongly around Xenomorph's ideas on having both data management and analytics management in place (a few posts on this in the past, try this one from a few years back) - analytics generate the most valuable data of all, yet the data generated by analytics and the input data that supports analytics is largely ignored as being too business focussed for many data management vendors to deal with, and too low level for many of the risk management system vendors to deal with. Into this gap in functionality falls the risk manager (supported by many spreadsheets!), who has to spend too much time organizing and validating data, and too little time on risk management itself.
Within risk management, I think it comes down to having the appropriate technical layers in place of data management, analytics/pricing management and risk model management. Ok it is a greatly simplified representation of the architecture needed (apologies to any techies reading this), but the majority of financial institutions do not have these distinct layers in place, with each of these layers providing easy "business user" access to allow risk managers to get to the "detail" of the data when regulators, auditors and clients demand it. Regulators are finally waking up to the data issue (see Basel on data aggregation for instance) but more work is needed to pull analytics into the technical architecture/strategy conversation, and not just confine regulatory discussions of pricing analytics to model risk.
Posted by Brian Sentance | 14 February 2013 | 2:50 pm
A little late on these notes from this PRMIA Event on Big Data in Risk Management that I helped to organize last month at the Harmonie Club in New York. Big thank you to my PRMIA colleagues for taking the notes and for helping me pull this write-up together, plus thanks to Microsoft and all who helped out on the night.
Introduction: Navin Sharma (of Western Asset Management and Co-Regional Director of PRMIA NYC) introduced the event and began by thanking Microsoft for its support in sponsoring the evening. Navin outlined how he thought the advent of “Big Data” technologies was very exciting for risk management, opening up opportunities to address risk and regulatory problems that previously might have been considered out of reach.
Navin defined Big Data as structured or unstructured data received at high volumes and requiring very large data storage. Its characteristics include a high velocity of record creation, extreme volumes, a wide variety of data formats, variable latencies, and complexity of data types. Additionally, he noted that relative to other industries, financial services has in the past created perhaps the largest historical sets of data and continually creates enormous amounts of data on a daily or moment-by-moment basis. Examples include options data, high frequency trading, and unstructured data such as that from social media. Its usage provides potential competitive advantages in trading and investment management. Also, by using Big Data it is possible to have faster and more accurate recognition of potential risks via seemingly disparate data - leading to timelier and more complete risk management of investments and firms’ assets. Finally, the use of Big Data technologies is in part being driven by regulatory pressures from Dodd-Frank, Basel III, Solvency II, the Markets in Financial Instruments Directives (1 & 2) as well as the Markets in Financial Instruments Regulation.
Navin also noted that we will seek to answer questions such as:
- What is the impact of big data on asset management?
- How can Big Data’s impact enhance risk management?
- How is big data used to enhance operational risk?
Presentation 1: Big Data: What Is It and Where Did It Come From?: The first presentation was given by Michael Di Stefano (of Blinksis Technologies), and was titled “Big Data. What is it and where did it come from?”. You can find a copy of Michael’s presentation here. In summary, Michael started by saying that there are many definitions of Big Data, mainly defined as technology that deals with data problems that are either too large, too fast or too complex for conventional database technology. Michael briefly touched upon the many different technologies within Big Data such as Hadoop, MapReduce and databases such as Cassandra and MongoDB etc. He described some of the origins of Big Data technology in internet search, social networks and other fields. Michael described the “4 V’s” of Big Data: Volume, Velocity, Variety and, as a key point from Michael, “time to Value” in terms of what you are using Big Data for. Michael concluded his talk with some business examples around use of sentiment analysis in financial markets and the application of Big Data to real-time trading surveillance.
Presentation 2: Big Data Strategies for Risk Management: The second presentation “Big Data Strategies for Risk Management” was introduced by Colleen Healy of Microsoft (presentation here). Colleen started by saying expectations of risk management are rising, and that prior to 2008 not many institutions had a good handle on the risks they were taking. Risk analysis needs to be done across multiple asset types, more frequently and at ever greater granularity. Pressure is coming from everywhere including company boards, regulators, shareholders, customers, counterparties and society in general. Colleen used to head investor relations at Microsoft and put forward a number of points:
- A long line of sight of one risk factor does not mean that we have a line of sight on other risks around.
- Good risk management should be based on simple questions.
- Reliance on 3rd parties for understanding risk should be minimized.
- Understand not just the asset, but also at the correlated asset level.
- The world is full of fast markets driving even more need for risk control
- Intraday and real-time risk now becoming necessary for line of sight and dealing with the regulators
- Now need to look at risk management at a most granular level.
Colleen explained some of the reasons why good risk management remains a work in progress, and that data is a key foundation for better risk management. However, data has been hard to access, analyze, visualize and understand, and she used this to link to the next part of the presentation by Denny Yu of Numerix.
Denny explained that new regulations involving measures such as Potential Future Exposure (PFE) and Credit Value Adjustment (CVA) were moving the number of calculations needed in risk management to a level well above that required by methodologies such as Value at Risk (VaR). Denny illustrated how a typical VaR calculation on a reasonable sized portfolio might need 2,500,000 instrument valuations and how PFE might require as many as 2,000,000,000. He then explained more of the architecture he would see as optimal for such a process and illustrated some of the analysis he had done using Excel spreadsheets linked to Microsoft’s high performance computing technology.
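To give a feel for where numbers of that size can come from, here is a rough sketch in Python. The breakdown is purely my own assumption and not taken from Denny's slides: I assume a hypothetical portfolio of 5,000 instruments, 500 scenarios for VaR, and 4,000 simulated paths over 100 future time steps for PFE. Only the two totals match the figures quoted on the night.

```python
def var_valuations(instruments: int, scenarios: int) -> int:
    """Valuations for a scenario-based VaR: each instrument is revalued once per scenario."""
    return instruments * scenarios


def pfe_valuations(instruments: int, paths: int, time_steps: int) -> int:
    """Valuations for a simulated PFE: each instrument is revalued on every path at every future date."""
    return instruments * paths * time_steps


# Hypothetical settings (assumptions for illustration only, not from the presentation)
INSTRUMENTS = 5_000
VAR_SCENARIOS = 500
PFE_PATHS = 4_000
PFE_TIME_STEPS = 100

var_count = var_valuations(INSTRUMENTS, VAR_SCENARIOS)
pfe_count = pfe_valuations(INSTRUMENTS, PFE_PATHS, PFE_TIME_STEPS)
ratio = pfe_count // var_count

print(f"VaR valuations: {var_count:,}")
print(f"PFE valuations: {pfe_count:,}")
print(f"PFE needs {ratio}x the valuations of VaR")
```

The point is simply that adding a time dimension and a larger number of simulated paths multiplies the workload, which is why the jump from VaR to PFE is measured in orders of magnitude rather than percentages.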
Presentation 3: Big Data in Practice: Unintentional Portfolio Risk: Kevin Chen of Opera Solutions gave the third presentation, titled “Unintentional Risk via Large-Scale Risk Clustering”. You can find a copy of the presentation here. In summary, the presentation was quite visual and illustrated how large-scale empirical analysis of portfolio data could produce some interesting insights into portfolio risk and how risks become “clustered”. In many ways the analysis was reminiscent of an empirical form of principal component analysis i.e. where you can see and understand more about your portfolio’s risk without actually being able to relate the main factors directly to any traditional factor analysis.
Panel Discussion: Brian Sentance of Xenomorph and the PRMIA NYC Steering Committee then moderated a panel discussion. The first question was directed at Michael “Is the relational database dead?” – Michael replied that in his view relational databases were not dead and indeed for dealing with problems well-suited to relational representation were still and would continue to be very good. Michael said that NoSQL/Big Data technologies were complementary to relational databases, dealing with new types of data and new sizes of problem that relational databases are not well designed for. Brian asked Michael whether the advent of these new database technologies would drive the relational database vendors to extend the capabilities and performance of their offerings? Michael replied that he thought this was highly likely but only time would tell whether this approach will be successful given the innovation in the market at the moment. Colleen Healy added that the advent of Big Data did not mean the throwing out of established technology, but rather an integration of established technology with the new such as with Microsoft SQL Server working with the Hadoop framework.
Brian asked the panel whether they thought visualization would make a big impact within Big Data? Ken Akoundi said that the front end applications used to make the data/analysis more useful will evolve very quickly. Brian asked whether this would be reminiscent of the days when VaR first appeared, when a single number arguably became a false proxy for risk measurement and management? Ken replied that the size of the data problem had increased massively from when VaR was first used in 1994, and that visualization and other automated techniques were very much needed if the headache of capturing, cleansing and understanding data was to be addressed.
Brian asked whether Big Data would address the data integration issue of siloed trading systems? Colleen replied that Big Data needs to work across all the silos found in many financial organizations, or it isn’t “Big Data”. There was general consensus from the panel that legacy systems and people politics were also behind some of the issues found in addressing the data silo issue.
Brian asked if the panel thought the skills needed in risk management would change due to Big Data? Colleen replied that effective Big Data solutions require all kinds of people, with skills across a broad range of specific disciplines such as visualization. Generally the panel thought that data and data analysis would play an increasingly important part for risk management. Ken put forward his view that all Big Data problems should start with a business problem, and not just a technology focus. For example, are there any better ways to predict stock market movements based on the consumption of larger and more diverse sources of information? In terms of risk management skills, Denny said that risk management of 15 years ago was based on relatively simple econometrics. Fast forward to today, and risk calculations such as CVA are statistically and computationally very heavy, and trading is increasingly automated across all asset classes. As a result, Denny suggested that even the PRMIA PRM syllabus should change to focus more on data and data technology given the importance of data to risk management.
Asked how best Big Data should be applied, Denny echoed Ken in saying that understanding the business problem first was vital, but that obviously Big Data opened up the capability to aggregate and work with larger datasets than ever before. Brian then asked what advice would the panel give to risk managers faced with an IT department about to embark upon using Big Data technologies? Assuming that the business problem is well understood, then Michael said that the business needed some familiarity with the broad concepts of Big Data, what it can and cannot do and how it fits with more mainstream technologies. Colleen said that there are some problems that only Big Data can solve, so understanding the technical need is a first checkpoint. Obviously IT people like working with new technologies and this needs to be monitored, but so long as the business problem is defined and valid for Big Data, people should be encouraged to learn new technologies and new skills. Kevin also took a very positive view that IT departments should be encouraged to experiment with these new technologies and understand what is possible, but that projects should have well-defined assessment/cut-off points as with any good project management to decide if the project is progressing well. Ken put forward that many IT staff were new to the scale of the problems being addressed with Big Data, and that his own company Opera Solutions had an advantage in its deep expertise of large-scale data integration to deliver quicker on project timelines.
Audience Questions: There then followed a number of audience questions. The first few related to other ideas/kinds of problems that could be analyzed using the kind of modeling that Opera had demonstrated. Ken said that there were obvious extensions that Opera had not got around to doing just yet. One audience member asked how well could all the Big Data analysis be aggregated/presented to make it understandable and usable to humans? Denny suggested that it was vital that such analysis was made accessible to the user, and there was general consensus across the panel that man vs. machine was an interesting issue to develop in considering what is possible with Big Data. The next audience question was around whether all of this data analysis was affordable from a practical point of view. Brian pointed out that there was a lot of waste in current practices in the industry, with wasteful duplication of ticker plants and other data types across many financial institutions, large and small. This duplication is driven primarily by the perceived need to implement each institution’s proprietary analysis techniques, and this kind of customization was not yet available from the major data vendors, but will become more possible as cloud technology such as Microsoft’s Azure develops further. There was a lot of audience interest in whether Big Data could lead to better understanding of causal relationships in markets rather than simply correlations. The panel responded that causal relationships were harder to understand, particularly in a dynamic market with dynamic relationships, but that insight into correlation was at the very least useful and could lead to better understanding of the drivers as more datasets are analyzed.
Posted by Brian Sentance | 8 February 2013 | 3:14 pm
I got my first tour around the NYSE trading floor on Wednesday night, courtesy of an event by Rutgers University on Risk. Good event, mainly around a panel discussion moderated by Nicholas Dunbar (Editor of Bloomberg Risk newsletter), and involving David Belmont (Commonfund CRO), Adam Litke (Chief Risk Strategist for Bloomberg), Hilmar Schaumann (Fortress Investment CRO) and Sanjay Sharma (CRO of Global Arbitrage and Trading at RBC).
Nick first asked the panel how do you define and measure risk? Hilmar responded that risk measurement is based around two main activities: 1) understanding how a book/portfolio is positioned (the static view) and 2) understanding sensitivities to risks that impact P&L (the dynamic view). Hilmar mentioned the use of historical data as a guide to current risks that are difficult to measure, but emphasised the need for a qualitative approach when looking at the risks being taken.
David said that he looks at both risk and uncertainty - with risk being defined as those impacts you can measure/estimate. He said that historical analysis was useful but limited given it is based only on what has happened. He thought that scenario analysis was a stronger tool. (I guess with historical analysis you at least get some idea of the impact of things that could not be predicted even if it is based on one "simulation" path i.e. reality, whereas you have more flexibility with scenario management to cover all bases, but I guess limited to those bases you can imagine). David said that path-dependent risks such as those in the credit markets in the last crisis were some of the most difficult to deal with.
Adam said that you need to understand why you are measuring risk and understand what risks you are prepared to take. He said that at Wachovia they knew that a 25% house price fall in California would be a near death experience for the bank prior to the 2008 crisis, and in the event the losses were much greater than 25%. His point was really that you must decide what risks you want to survive and at what level. He said that sound common-sense judgement is needed to decide whether a scenario is really-real or not.
Sanjay said that risk managers need to maintain a lot of humility and not to over-trust risk measurements. He described a little of the risk approach used at RBC where he said they use over 80 different models and employ them as layers/different views on risk to be brought together. He said they start with VaR as a base analysis, but build on this with scenarios, greeks and then on to other more specific reports and analysis. He emphasised that communication is a vital skill for risk managers to get their views and ideas across.
Nicholas then moved on to ask how risk managers should make or reduce risks? - getting away from risk measurement to risk management. Adam said that risks should be delegated out to those that manage them but this needs to be combined with responsibility for the risks too. Keep people and departments within the bounds of their remit. Be prepared to talk a different business language to different stakeholders dependent upon their understanding and their motivations. David gave some examples of this in his case, where endowment funds want risk premiums over many years and risks are translated/quantified into practical things such as a new college building not going ahead etc.
Hilmar said that hedge funds are supposed to take risks, and that the key was not necessarily to avoid losses (although avoid them if you can) but rather to avoid surprises. Like the other speakers, Hilmar emphasised that communication of risks to key stakeholders was vital. He also added the key point that if you don't like a risk you have identified, then try first to take it off rather than hedging it, since hedging could potentially add basis risk and simply more complication.
Nicholas then asked Sanjay about how risk managers should deal with bringing difficult news to the business? Sanjay suggested that any bad news should be approached in the form of "actionable transparency" i.e. that not only do you communicate how bad the risk is to all stakeholders but you come along with actionable approaches to dealing with the risk. Despite the crisis, Sanjay's experience is that traders do not want to lose money and if you come with solutions they will listen. He concluded by saying that qualitative analysis should also be used, citing the hypothetical example that you should take notice of dogs (yes, the animal!) buying mortgages, whether or not the mortgages are AAA rated.
Nicholas asked the panel members in turn what risks are they concerned about currently? David said he believed that many risks were not priced into the market currently. He was concerned about policy impacts of action by the ECB and the Fed, and thought the current and forward levels of volatility are low. In Fixed Income markets he thought that Dodd-Frank may have detrimental effects, particularly with the current lack of clarity about what is proprietary trading and what is market-making. He thought that, should policies and interest rates change, risk managers should look carefully at what will happen as funds flow out of fixed income and into equities.
Hilmar talked about the postponement of the US debt ceiling limits and that US Government policy battles continue to be an obvious source of risk. In Europe, many countries had elections this year which would be interesting, and that the problems in the Euro-zone are less than they were, but problems in Cyprus could fan the flames of more problems and anxiety. Hilmar said that Japan's new policy of targeting 2% inflation may have effects on the willingness of domestic investors to buy JGBs.
Sanjay said he was worried. In the "Greenspan Years" prior to 2008 a quasi government guarantee on the banks was effectively put in place and that we continue to live with cheap money. When policy eventually changes and interest rates rise, Sanjay wondered whether the world was ready for the wholesale asset revaluation that would then be required.
Adam's concerns were mainly around identifying what will be the cause of the next panic in the market. Whilst he said he is in favour of central clearing for OTC derivatives, he thought that the changing market structure combined with implementing central clearing had not been fully thought through and this was a worry to him.
Nicholas asked what do the panelists think of the regulation being implemented? David said that regulators face the same difficulty that risk managers face, in that nobody notices when you took sensible action to protect against a risk that didn't occur. He thinks that regulation of the markets is justified and necessary.
Sanjay said that in the airline and pharmaceutical industries regulatory approval was on the whole very robust but that they were dealing with approving designs (aeroplanes and drugs) that are reproduced once approved. He said that such levels of regulation in financial services were not yet possible due to the constant innovation found in the markets, and he wanted regulation to be more dynamic and responsive to market developments. Sanjay also joined those in the industry that are critical of the sheer size of Dodd-Frank.
Nicholas said that Adam was obviously keen on operational issues and wondered what plumbing in the industry would he change? Adam said that he is a big fan of automation but operational risks are real and large. He thought that there were too many rules and regulations being applied, and the regulators were not paying attention to the type of markets they want in the future, nor to the effects of current regulation and how people were moving from one part of the industry to another. Adam said that in relation to Knight Capital he was still a strong advocate of standing by the wall socket, ready to pull the plug on the computer. Adam suggested that regulators should look at regulating/approving software releases (I assume here he means for key tasks such as automated trading or risk reporting, not all software).
Given the large number of students present, Nicholas closed the panel by asking what career advice the panelists had for future risk managers? Adam emphasised flexibility in role, taking us through his career background as an equity derivatives and then fixed income trader before coming into risk management. Adam said it was highly unlikely over your career that you would stay with one role or area of expertise.
Hilmar said that having risk managers independent of trading was vitally important for the industry. He thought there were many areas to work in, with operational risk being potentially the largest, but still with plenty more to do in market risk, compliance and risk modelling. He added that understanding the interdependencies between risks was key and an area for further development.
When asked by Nicholas, David said that risk managers should have a career path right through to CEO of an institution. He wanted to encourage risk management as a necessary level above risk measurement and control. He was excited about the potential of Big Data technologies to help in risk management. David gave some interesting background on his own career initially as an emerging markets debt trader. He said that it is important to know yourself, and that he regarded himself as a sceptic, needing all the information available before making a decision. As such his performance as a trader was consistent but not as high as some, and this became one of the reasons he moved into risk management.
Sanjay said many of the systems used in finance are 20 years old, in complete contrast with the advances in mobile and internet technologies. As such he thought this was a great opportunity to be involved in the replacement and upgrading of this older infrastructure. Apparently one analyst had estimated that $65B will be spent on risk management over the next 4-5 years.
Adam thought that there was a need for a code of ethics for quants (see old post for some ideas). Sanjay added that the industry needed to move away from being involved primarily in attempting to optimise activity around gaming regulation. When asked by Nicholas about Basel III, Adam thought that improved regulation was necessary but Basel III was not the right way to go about it and was way too complex.
Posted by Brian Sentance | 1 February 2013 | 2:41 pm
Posted by Brian Sentance | 22 January 2013 | 3:14 pm
In relation to the Microsoft/PRMIA event that Brian moderated last night in New York, I spotted this article recently that tries to map out all the different databases that are now commercially available in some form, from SQL to NoSQL and all the various incarnations and flavours in between:
As Brian suggested in his recent post, it's amazing to see how much the landscape has evolved from the domination (mantra?) that there was the relational way, or no way. Obviously times have moved on (er, I guess the Internet happened for one thing...) and people are now far more accepting of the need for different approaches to different types and sizes of business problems. That said, I agree with the article and comments that suggest there do seem to be far too many options available now - there has to be some consolidation coming otherwise it will become increasingly difficult to know where to start. Choice is a wonderful thing, but only in moderation!
Posted by Chris Budgen | 16 January 2013 | 9:30 pm
Quick thank you to all those who came along to Xenomorph's New York Holiday Party at the Classic Car Club. Below is an extract from the talk given by Paul Rowady of the Tabb Group at the event, followed by my effort and some photographs from the event.
There Is No Such Thing as Alpha Generation
The change in perspective caused by a subtle change in language can galvanize your approach to data, the tools you select, and even the organizational culture. That said, ‘alpha generation’ is a myth; there is only alpha discovery and capture.
By E. Paul Rowady, Jr.
We live in an age of superlatives: unprecedented market complexity and uncertainty caused, in part, by an unprecedented regulatory onslaught and unprecedented economic extremes. As a result, there is an unprecedented focus on risk analysis – and an unprecedented (and anxious) search for new sources of performance from all market demographics.
The big data era is here and will only become the bigger data era. What we need is a new perspective. But fostering such a new perspective may be as subtle as performing a little linguistic jujitsu.
Our business – trading and investment in capital and commodity markets around the globe – has a history of being cavalier or too casual about language; particularly how certain labels, terms or vernacular are used to describe the business and the markets. Some of this language is intentional – the use of certain terminology creates mystique, fosters mythology, manufactures a sense of complexity that only a select group of savants can tame -- particularly when it comes to activities around quantitative methods. And some of it is just plain laziness, stretching the use of labels far beyond their original meaning on the idea that these terms are close enough.
I have become increasingly sensitive to this phenomenon over the years. Call it an insatiable need to simplify complexity, bring order to chaos, to enhance a level of accuracy and precision in how we describe what we do and how we do it. I find that precision of language does impact how complex technical topics are communicated, understood and absorbed. It turns out, language impacts perspective – and perspective impacts strategy and tactics.
So let’s gain a little perspective on alpha generation and alpha creation...(full extract can be found on the TabbFORUM)
Paul in full speech mode at the Classic Car Club
Big thanks to Paul for the above talk. Here is my follow-up:
Thanks Paul for a great talk, certainly I agree that people, process, technology and data are key to the future success of financial markets. In particular, I think attitudes towards data must change if we are to meet the coming challenges over the next few years. For example, in my view data in financial markets is analogous to water:
- Everyone needs it
- Everyone knows where to get it
- Nobody likes to share it
- Nobody is 100% sure where it was really sourced from
- Nobody is quite sure where it goes to
- Nobody knows its true cost
- Nobody knows how much is wasted
- Everyone assumes it is of high quality
- And you only ever know it has gone bad after you have drunk it.
- (I should add, that if you own water you are also very wealthy, so wealthy your neighbor might even consider robbing you)
The problem of siloed data and data integration remains, but this is as much a political as a purely technical problem. People need to share data more, and I wonder (I hope) that as the “social network” generation come through that attitudes will improve, but I guess this will also add different pressures to data aggregators as people are less hung up about sharing information. The focus needs to be on the data that business folks need, and should be less about the type of the data or the technical means by which it is captured, stored and distributed – for sure these are important aspects, but we need to involve more people in realizing this cult of data.
And just as Paul has issues with the over-use of “Alpha”, I promise this will be the only time this evening I will mention “Big Data” but today I heard the best description so far of what big data is all about, which is “Big data is like watching the planet develop a nervous system”. Data is fundamental to all of our lives and we are living through some very interesting times in terms of how much data is becoming available and how we make sense of it.
So, a change of tack. When moving to the New York area a few years back, one of my fellow Brits said that you will find the Americans a lot friendlier than the English, but don’t talk to them about politics or religion. So rules are meant to be broken, and religion aside I thought I would briefly have to mention the recent election as one of the big differences between the UK and the USA.
Firstly, wow you guys know how to have long elections. I think the French get theirs done in two weeks but even the Brits do it in a month. A few things struck me from the election: I don’t know whether the Democratic Party is generally supportive of legalizing drugs, but I think we can be certain that President Obama spent some time in the states of Colorado and Washington prior to the first debate.
And I hear from the New Yorker that the Republicans are trying a radical new approach to broaden the demographic of the supporter base, apparently to make it inclusive of people who are strong believers in “maths and science”.
Moving on from a light-hearted look at elections but sticking with the government theme, regulation is obviously very high profile at the moment. To some degree this is understandable as financial markets have been doing a great job of keeping a low profile with:
- JPMorgan $7B London Whale
- Barclays and the Libor rigging
- Standard Chartered and Iranian money laundering
- Knight Capital with the biggest advertisement in history for automated trading
- ING feeling it was missing out on things with Cuba and Iranian money
- HSBC helping Mexican drug lords to move the money around
- Capital One deceiving its customers
- Peregrine Financial Group deceiving the regulators (generating alpha?)
All these occurred in 2012, when it seems that the dust had barely settled over MF Global and UBS. So it is possible to understand the reaction of people and politicians to what has gone on and the need for more stable capital markets, but my biggest concern is that there is simply too much regulation, and complex systems with complex rules are a great breeding ground for the law of unintended consequences. To illustrate how over time we humans, and in particular governments, seem to be regressing in terms of using more words to describe ever more complex behaviours I found the following list online:
- Pythagoras 24 words
- Lord's Prayer 66 words
- Archimedes' Principle 66 words
- 10 Commandments 179 words
- Gettysburg Address 286 words
- Declaration of Independence 1300 words
- US Govt sale of cabbage 26,991 words
Dodd-Frank is about 2,300 pages, which apparently is going to spawn some 30,000 pages of rules – that is enormous. Listening to a regulator speak last week, he said the regulators had about 10,000 pages done, 10,000 in progress and 10,000 not even started yet. Worse than this, he added that regulators were not trying to shape the financial markets of the future but rather dealing only with the current issues. Regulators should take their lead from quantum physics in my view: as soon as you observe something it is changed. Financial markets are complex, and making them even more complex through overlaying complex rules is not going to result in the stability that we all desire.
Anyway, thanks for coming along this evening and I hope you have a great time. Quick thank you to our clients and partners without whom we would not exist. Thanks to the hard work our staff put in over the year, but in particular thanks to Naj and Xenomorph's NYC team for organizing this evening's event.
Some photographs from the event below. Big thanks to NandoVision for some of the images:
Clients, partners and staff catch up over a drink or three
This waiter had a pleasant interruption in service prior to the fashion show by Hillary Flowers
Jim Beck talks with PRMIA NYC members: Qi Fu, Sol Steinberg and Don Wesnofske
Cass Almendral, Hillary Flowers and Brian later at the bar
Not sure how this ballet-themed dress works in a convertible?
Russ Glisker and Mark O'Donnell talk cars with Paul
A far more practical outfit for this Porsche
Some of the fashion models rush to discuss the finer points of Alpha Harvesting with Paul...
Thanks again to all involved in putting the party together and for everyone who came along on the night. If I don't get round to another post over the Holiday Season, then best wishes for a fantastic break and a great start to 2013.
Posted by Brian Sentance | 19 December 2012 | 12:48 am
Good breakfast event from SAP and A-Team last Thursday morning. SAP have been getting (and I guess paying for) a lot of good air-time for their SAP Hana in-memory database technology of late. Domenic Iannaccone of SAP started the briefing with an introduction to big data in finance and how their SAP/Sybase offerings knitted together. He started his presentation with a few quotes, one being "Intellectual property is the oil of the 21st century" by Mark Getty (he of Getty Images, but also of the Getty oil family) and "Data is the new oil" by both Clive Humby and Gerd Leonhard (not sure why two people were quoted saying the same thing but anyway).
For those of you with some familiarity with the Sybase IQ architecture of a year or two back, then in this architecture SAP Hana seems to have replaced the in-memory ASE database that worked in tandem with Sybase IQ for historical storage (I am yet to confirm this, but hope to find out more in the new year). When challenged on how Hana differs from other in-memory database products, Domenic seemed keen to emphasise its analytical capabilities and not just the database aspects. I guess the big data angle of bringing the "data closer to the calculations" was his main differentiator on this, but with more time I think a little bit more explanation would have been good.
Pete Harris of the A-Team walked us through some of the key findings of what I think is the best survey I have read so far on the usage of big data in financial markets (free sign-up needed I think, but you can get a copy of the report here). Some key findings from a survey of staff at ten major financial institutions included:
- Searching for meaning in unstructured data was a leading use-case thought of when thinking of big data (Twitter trading etc)
- Risk management was seen as a key beneficiary of what the technologies can offer
- Aggregation of data for risk was seen as a key application area concerning structured data.
- Both news feeds and (surprisingly?) text documents were key unstructured data sources being processed using big data.
- In trading, news sentiment and time series analysis were key areas for big data.
- Creation of a system wide trade database for surveillance and compliance was seen as a key area for enhancement by big data.
- Data security remains a big concern with technologists over the use of big data.
There were a few audience questions - Pete clarified that there was a more varied application of big data amongst sell-side firms, and that on the buy-side it was being applied more to KYC and related areas. One of the audience made the point that he thought a real challenge beyond the insight gained from big data analysis was how to translate it into value from an operational point of view. There seemed to be a fair amount of recognition that regulators and auditors are wanting a full audit trail of what has gone on across the whole firm, so audit was seen as a key area for big data. Another audience member suggested that the lack of a rigid data model in some big data technologies enabled greater flexibility in the scope of questions/analysis that could be undertaken.
Coming back to the key findings of the survey, then one question I asked Pete was whether or not big data is a silver bullet for data integration. My motivation was that the survey and much of the press you read talks about how big data can pull all the systems, data and calculations together for better risk management, but while I can understand how massively scalable data and calculation capabilities were extremely useful, I wondered how exactly all the data was pulled together from the current range of siloed systems and databases where it currently resides. Pete suggested that this was still a problematic area where Enterprise Application Integration (EAI) tools were needed. Another audience member added that politics within different departments was not making data integration any easier, regardless of the technologies used.
Overall a good event, with audience interaction unsurprisingly being the most interesting and useful part.
Posted by Brian Sentance | 3 December 2012 | 2:12 pm
Launch event for Interactive Data's new reference data service Apex on Wednesday night, hosted at Nasdaq Times Square and introduced by Mark Hepsworth. Apex looks like a good offering, combining multi-asset data access, batch file and on-demand API requests from the same data store, plus hosted data management services, and a flexible licensing/distribution/re-distribution model.
Some good speakers at the event. Larry Tabb ran through his opinions on the current market and on regulation. He painted a mixed picture of the market, starting with the continuing exit by investors from the equity mutual funds market, offset to some degree by rapid growth in ETF assets (54% growth over past 3 years to $1,200 billion). Obviously events such as the Flash Crash, Libor, the London Whale and Knight Capital have not increased investors' confidence in markets either.
On regulation he first cited the sheer amount of regulation being attempted at the moment going through systemic risk/too big to fail, Dodd-Frank, Volcker, derivatives regulation, Basel III etc. Of particular note he mentioned some concerns over whether there is simply enough collateral around in the market given increased capital requirements and derivative regulation (a thought currently shared by the FT apparently in this article).
Given the focus of the event, Larry unsurprisingly mentioned the foundational role of data in meeting the new regulatory requirements, which for the next few years he believes will be focussed on audit and the ability to explain and justify past decisions to regulators. Also given the focus of the event, Larry did not mention his recent article on the Tabb Forum on federated data management strategies which I would have been interested to hear Interactive's comments on, particularly given their new hosted data management offerings. (You can find some of our past thoughts here on the option of using federated data.)
Mike Atkin of the EDM Council was next up and described a framework for what he thought was going on in the market. In summary, he split the drivers for change into business and regulatory, and categorised the changes into:
- Systemic Risk
- Capital and Liquidity
- Clearing and Settlement
- Control and Enforcement
He then said that the fundamental challenge with data was to go through the chain of identifying things, describing them, classifying/aggregating them and then finally establishing linkages. He then ended this part of his presentation with the three aspects he thought necessary to sort this out, from industry data standards, to methods of best practice and on to having infrastructure in place to enable these changes.
Mike then went on to recount a conversation he had had with a hedge fund manager, who had defined the interesting concept of a "Data Risk Equation":
N x CC x S / (Q x V)
N: is the number of variables
CC: is a measure of calculation complexity
S: is the number of data sources needed
Q: is a measure of quality
V: is a measure of verifiability
I think the angle was that the hedge fund guy was simply using a form of the above to categorise and compare the complexity of some of the data issues his firm was dealing with.
Aram Flores of Deutsche Bank then talked briefly. Of note was his point that the new regulation was forcing DB to use more external rather than internal data, since regulation now restricted the use of internal data within regulatory reporting. Sounds like good news for Interactive and some of its competitors. Eric Reichenberg of SS&C GlobeOp then gave a quick talk on the importance of accurate data to his derivative valuation services. The talks ended with a well-prepped conversation between Marty Williams and one of their new Apex clients, who jokingly referred to one of the other well-known data vendors as the Evil Empire which raised a few smiles - fortunately the speaker didn't start to choke at this point so obviously Darth Vader wasn't spying on the proceedings...
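As a rough illustration of how such an equation might be used to rank data issues, here is a minimal Python sketch. The scales chosen for CC, Q and V, the class and function names, and the two example data sets are all my own assumptions for illustration; none of them come from Mike's talk or from the hedge fund manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataIssue:
    """Inputs to the 'Data Risk Equation' N x CC x S / (Q x V)."""
    name: str
    n_variables: int        # N: number of variables
    calc_complexity: float  # CC: assumed scale, e.g. 1 (simple) to 5 (complex)
    n_sources: int          # S: number of data sources needed
    quality: float          # Q: assumed scale in (0, 1], higher is better
    verifiability: float    # V: assumed scale in (0, 1], higher is better

    def risk_score(self) -> float:
        # Q and V sit in the denominator, so they must be strictly positive.
        if self.quality <= 0 or self.verifiability <= 0:
            raise ValueError("quality and verifiability must be positive")
        numerator = self.n_variables * self.calc_complexity * self.n_sources
        return numerator / (self.quality * self.verifiability)


def rank_by_risk(issues):
    """Return the issues ordered from highest to lowest data risk score."""
    return sorted(issues, key=lambda issue: issue.risk_score(), reverse=True)


if __name__ == "__main__":
    # Hypothetical examples, invented purely to show the ranking.
    issues = [
        DataIssue("equity closing prices", 5, 1.0, 2, 0.9, 0.9),
        DataIssue("OTC implied volatilities", 40, 4.0, 3, 0.5, 0.4),
    ]
    for issue in rank_by_risk(issues):
        print(f"{issue.name}: {issue.risk_score():.1f}")
```

The absolute numbers mean little on their own; the point is only the relative ordering, which is how I understood the equation to be used.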
So overall a good event, new product offering looks interesting, speakers were entertaining and the drinks/food/location were great.
Posted by Brian Sentance | 26 October 2012 | 3:22 pm
Getting to the heart of "Data Management for Risk", PRMIA held an event entitled "Missing Data for Risk Management Stress Testing" at Bloomberg's New York HQ last night. For those of you who are unfamiliar with the topic of "Data Management for Risk", then the following diagram may help to further explain how the topic is to do with all the data sets feeding the VaR and scenario engines.
I have a vested interest in saying this (and please forgive the product placement in the diagram above, but hey this is what we do...), but the topic of data management for risk seems to fall into a functionality gap between: i) the risk system vendors who typically seem to assume that the world of data is perfect and that the topic is too low level to concern them and ii) the traditional data management vendors who seem to regard things like correlations, curves, spreads, implied volatilities and model parameters as too business domain focussed (see previous post on this topic). As a result, the risk manager is typically left with ad-hoc tools like spreadsheets and other analytical packages to perform data validation and filling of any missing data found. These ad-hoc tools are fine until the data universe grows larger, leading to the regulators becoming concerned about just how much data is being managed "out of system" (see past post for some previous thoughts on spreadsheets).
The Crisis and Data Issues. Anyway enough background above and on to some of the issues raised at the event. Navin Sharma of Western Asset Management started the evening by saying that pre-crisis people had a false sense of security around Value at Risk, and that the crisis showed that data is not reliably smooth in nature. Post-crisis, then questions obviously arise around how much data to use, how far back and whether you include or exclude extreme periods like the crisis. Navin also suggested that the boards of many financial institutions were now much more open to reviewing scenarios put forward by the risk management function, whereas pre-crisis their attention span was much more limited.
Presentation. Don Wesnofske did a great presentation on the main issues around data and data governance in risk (which I am hoping to link to here shortly...)
Issues with Sourcing Data for Risk and Regulation. Adam Litke of Bloomberg asked the panel what new data sourcing challenges were resulting from the current raft of regulation being implemented. Barry Schachter cited a number of Basel-related examples. He said that the costs of rolling up loss data across all operations was prohibitative, and hence there were data truncation issues to be faced when assessing operational risk. Barry mentioned that liquidity calculations were new and presenting data challenges. Non centrally cleared OTC derivatives also presented data challenges, with initial margin calculations based on stressed VaR. Whilst on the subject of stressed VaR, Barry said that there were a number of missing data challenges including the challenge of obtaining past histories and of modelling current instruments that did not exist in past stress periods. He said that it was telling on this subject that the Fed had decided to exclude tier 2 banks from stressed VaR calculations on the basis that they did not think these institutions were in a position to be able to calculate these numbers given the data and systems that they had in place.
Barry also mentioned the challenges of Solvency II for insurers (and their asset managers) and said that this was a huge exercise in data collection. He said that there were obvious difficulties in modelling hedge fund and private equity investments, and that the regulation penalised the use of proxy instruments where there was limited "see-through" to the underlying investments. Moving on to UCITS IV, Barry said that the regulation required VaR calculations to be regularly reviewed on an ongoing basis, and he pointed out one issue with much of the current regulation in that it uses ambiguous terms such as models of "high accuracy" (I guess the point being that accuracy is always arguable/subjective for an illiquid security).
Sandhya Persad of Bloomberg said that there were many practical issues to consider such as exchanges that close at different times and the resultant misalignment of closing data, problems dealing with holiday data across different exchanges and countries, and sourcing of factor data for risk models from analysts. Navin expanded more on his theme of which periods of data to use. Don took a different tack, and emphasised the importance of getting the fundamental data of client-contract-product in place, and suggested that this was a big challenge still at many institutions. Adam closed the question by pointing out the data issues in everyday mortgage insurance as an example of how prevalent data problems are.
What Missing Data Techniques Are There? Sandhya explained a few of the issues she and her team face working at Bloomberg in making decisions about what data to fill. She mentioned the obvious issue of distance between missing data points and the preceding data used to fill it. Sandhya mentioned that one approach to missing data is to reduce factor weights down to zero for factors without data, but this gave rise to a data truncation issue. She said that there were a variety of statistical techniques that could be used. She mentioned adaptive learning techniques and then described some of the work that one of her colleagues had been doing on maximum-likelihood estimation, whereby in addition to achieving consistency with the covariance matrix of "near" neighbours, the estimation also had greater consistency with the historical behaviour of the factor or instrument over time.
Navin commented that fixed income markets were not as easy to deal with as equity markets in terms of data, and that at sub-investment grade there is very little data available. He said that heuristic models were often needed, and suggested that there was a need for "best practice" to be established for fixed income, particularly in light of guidelines from regulators that are at best ambiguous.
I think Barry then made some great comments about data and data quality in saying that risk managers need to understand more about the effects (or lack of) that input data has on the headline reports produced. The reason I say great is that I think there is often a disconnect or lack of knowledge around the effects that input data quality can have on the output numbers produced. Whilst regulators increasingly want data "drill-down" and justification on any data used to calculate risk, it is still worth understanding more about whether output results are greatly sensitive to the input numbers, or whether maybe related aspects such as data consistency ought to have more emphasis than say absolute price accuracy. For example, data quality was being discussed at a recent market data conference I attended and only about 25% of the audience said that they had ever investigated the quality of the data they use. Barry also suggested that you need to understand to what purpose the numbers are being used and what effect the numbers had on the decisions you take. I think here the distinction was around usage in risk where changes/deltas might be more important, whereas in calculating valuations or returns then price accuracy might receive more emphasis.
How Extensive is the Problem? General consensus from the panel was that the issue's importance needed to be understood more (I guess my experience is that the regulators can make data quality important for a bank if they say that input data issues are the main reason for blocking approval of an internal model for regulatory capital calculations). Don said that any risk manager needed to be able to justify why particular data points were used and there was further criticism from the panel around regulators asking for high quality without specifying what this means or what needs to be done.
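To make the "distance" issue concrete, here is a minimal Python sketch of the simplest fill technique I can think of: carrying the last observed value forward, but only across a limited number of consecutive missing points. This is not the maximum-likelihood approach described at the event, and the function name and the `max_gap` parameter are my own assumptions for illustration.

```python
def forward_fill(values, max_gap):
    """Fill None entries with the last observed value.

    A gap is only filled for its first `max_gap` consecutive missing points;
    anything further from the last real observation is left as None so that
    stale data is not silently carried forward.
    """
    filled = []
    last_seen = None
    gap_length = 0
    for value in values:
        if value is not None:
            last_seen = value
            gap_length = 0
            filled.append(value)
        else:
            gap_length += 1
            if last_seen is not None and gap_length <= max_gap:
                filled.append(last_seen)
            else:
                filled.append(None)
    return filled


if __name__ == "__main__":
    prices = [100.0, None, None, None, 103.0, None]
    print(forward_fill(prices, max_gap=2))
```

Choosing `max_gap` is exactly the kind of judgement call the panel was describing: too small and the series stays full of holes, too large and the risk numbers are driven by stale prices.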
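One simple way to explore Barry's point is to perturb a single input and see how far the headline number moves. The Python sketch below does this for a basic historical-simulation VaR; the price series, the 95% confidence level and the function names are all assumptions made up for illustration, and the quantile rule used (no interpolation) is just one of several possible conventions.

```python
import math


def simple_returns(prices):
    """Period-on-period simple returns from a list of prices."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def historical_var(returns, confidence=0.95):
    """Empirical loss quantile (positive number = loss), no interpolation."""
    losses = sorted(-r for r in returns)  # ascending: best day first, worst last
    index = min(len(losses) - 1, math.ceil(confidence * len(losses)) - 1)
    return losses[index]


def var_with_bad_tick(prices, position, bad_price, confidence=0.95):
    """Return (clean VaR, VaR after replacing one price with a bad value)."""
    clean = historical_var(simple_returns(prices), confidence)
    dirty_prices = list(prices)
    dirty_prices[position] = bad_price
    dirty = historical_var(simple_returns(dirty_prices), confidence)
    return clean, dirty


if __name__ == "__main__":
    prices = [100.0, 101.0, 100.0, 102.0, 101.0, 103.0]  # invented series
    clean, dirty = var_with_bad_tick(prices, position=3, bad_price=92.0)
    print(f"clean VaR: {clean:.4f}, VaR with one bad tick: {dirty:.4f}")
```

With a short history like this a single bad tick dominates the tail, which is the sort of sensitivity worth knowing about before deciding how much effort to spend on absolute price accuracy.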
Summary - My main conclusions:
- Risk managers should know more of how and in what ways input data quality affects output reports
- Be aware of how your approach to data can affect the decisions you take
- Be aware of the context of how the data is used
- Regulators set the "high quality" agenda for data but don't specify what "high quality" actually is
- Risk managers should not simply accept regulatory definitions of data quality and should join in the debate
Great drinks and food afterwards (thanks Bloomberg!) and a good evening was had by all, with a topic that needs further discussion and development.
Posted by Brian Sentance | 16 October 2012 | 3:21 pm
Bankenes Sikringsfond Selects Xenomorph's TimeScape for Faster Data Analysis and High-Quality Decision Support
Just a quick note to say that we have signed a new client, Bankenes Sikringsfond, the Norwegian Banks’ Guarantee Fund. They will be using TimeScape to fulfill requirements for a centralised analytics and data management platform. The press release is available here for those of you who are interested.
Posted by Sara Verri | 11 October 2012 | 10:50 am
Just back from a good vacation (London Olympics followed by a sunny week in Portugal - hope your summer has gone well too) and enjoyed a great evening at a Quafafew event on Tuesday evening, entitled "Reverse Stress Testing & Roundtable on Managing Hedge Fund Risk".
Reverse Stress Testing
The first part of the evening was a really good presentation by Daniel Satchkov of Rixtrema on reverse stress testing. Daniel started the evening by stating his opinion that risk managers should not consider their role as one of trying to predict the future, but rather one more reminiscent of "car crash testing", where the role of the tester is one of assessing, managing and improving the response of a car to various "impacts", without needing to understand the exact context of any specific crash such as "Who was driving?", "Where did the accident take place?" or "Whose fault was it?". (I guess the historic context is always interesting, but will be no guide to where, when and how the next accident takes place).
Daniel spent some of his presentation discussing the importance of paradigms (aka models) to risk management, which in many ways echoes many of the themes from the modeller's manifesto. Daniel emphasised the importance of imagination in risk management, and gave a quick story about a German professor of mathematics who when asked the whereabouts of one of his new students replied that "he didn't have enough imagination so he has gone off to become a poet".
In terms of paradigms and how to use them, he gave the example of Brownian motion and described how the probability of all the air in the room moving to just one corner was effectively zero (as evidenced by the lack of oxygen cylinders brought along by the audience). However such extremes were not unusual in market prices, so he noted how Black-Scholes was evidently the wrong model, but when combined with volatility surfaces the model was able to give the right results i.e. "the wrong number in the wrong formula to get the right price." His point here was that the wrong model is ok so long as you are aware of how it is wrong and what its limitations are (might be worth checking out this post containing some background by Dr Yuval Millo about the evolution of the options market).
Daniel said that he disagreed with the premise by Taleb that the range of outcomes was infinite and that as a result all risk managers should just give up and buy a lottery ticket, however he had some sympathies with Taleb over the use of stable correlations within risk management. His illustration was once again entertaining in quoting a story where a doctor asks a nurse what the temperature is of the patients at a Russian hospital, only to be told that they were all "normal, on average" which obviously is not the most useful medical information ever provided. Daniel emphasised that contrary to what you often read correlations do not always move to one in a crisis, but there are often similarities from one crisis to the next (maybe history not repeating itself but more rhyming instead). He said that accuracy was not really valid or possible in risk management, and that the focus should be on relative movements and relative importance of the different factors assessed in risk.
Coming back to the core theme of reverse stress testing, then Daniel presented a method by which from having categorised certain types of "impacts" a level of loss could be specified and the model would produce a set of scenarios that produce the loss level entered. Daniel said that he had designed his method with a view to producing sets of scenarios that were:
- not missing any key dangers
He showed some of the result sets from his work which illustrated that not all scenarios were "obvious". He was also critical of addressing key risk factors separately, since hedges against different factors would be likely to work against each other in times of crisis and hedging is always costly. I was impressed by his presentation (both in content and in style) and if the method he described provides a reliable framework for generating a useful range of possible scenarios for a given loss level, then it sounds to me like a very useful tool to add to those available to any risk manager.
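Daniel did not go into the mechanics of his method in enough detail for me to reproduce it, but the basic idea of going from a loss level back to a scenario can be sketched with a textbook simplification. Assuming a portfolio whose P&L is linear in a set of factor moves, and assuming those moves are jointly normal with zero mean, the single most likely scenario producing a given loss has a closed form. Everything in the Python below (function names, exposures, covariance numbers) is hypothetical, and it returns one scenario rather than the set of scenarios Daniel was describing.

```python
def most_likely_scenario(exposures, cov, target_loss):
    """Most likely factor moves x such that the portfolio P&L equals -target_loss.

    Assumes P&L = sum(exposures[i] * x[i]) and x ~ N(0, cov). Minimising the
    Mahalanobis distance x' inv(cov) x subject to the loss constraint gives
    x = -target_loss * (cov @ exposures) / (exposures' @ cov @ exposures).
    """
    n = len(exposures)
    cov_times_w = [sum(cov[i][j] * exposures[j] for j in range(n)) for i in range(n)]
    portfolio_variance = sum(exposures[i] * cov_times_w[i] for i in range(n))
    if portfolio_variance <= 0:
        raise ValueError("portfolio variance must be positive")
    scale = -target_loss / portfolio_variance
    return [scale * value for value in cov_times_w]


def pnl(exposures, scenario):
    """Linear P&L of the portfolio under a given scenario of factor moves."""
    return sum(e * s for e, s in zip(exposures, scenario))


if __name__ == "__main__":
    exposures = [100.0, 50.0]              # hypothetical factor exposures
    cov = [[0.04, 0.01], [0.01, 0.09]]     # hypothetical factor covariance
    scenario = most_likely_scenario(exposures, cov, target_loss=10.0)
    print("scenario:", [round(s, 4) for s in scenario])
    print("P&L:", round(pnl(exposures, scenario), 6))
```

The normality and stable-covariance assumptions here are exactly the kind of thing Daniel and Taleb would both question, so treat this as a way of seeing the shape of the problem rather than as a usable stress test.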
Managing Hedge Fund Risk
The second part of the evening involved Herb Blank of S-Network (and Quafafew) asking a few questions to Raphael Douady of Riskdata and Barry Schachter of Woodbine Capital. Raphael was an interesting and funny member of the audience at the Dragon Kings event, asking plenty of challenging questions and the entertainment continued yesterday evening. Herb asked how VaR should be used at hedge funds, to which Raphael said that if he calculated a VaR of 2 and we lost 2.5, he would have been doing his job. If the VaR was 2 and the loss was 10, he would say he was not doing his job. Barry said that he only uses VaR when he thinks it is useful, in particular when the assumptions underlying VaR are to some degree reflected in the stability of the market at the time it is used.
Raphael then took us off on an interesting digression based on human perceptions of probability and statistical distributions. He told the audience that yesterday was his eldest daughter's birthday and what he wanted was for the members of the audience to write down on paper what was a lower and upper bound of her age to encompass a 99th percentile. As background, Raphael looks like this. Raphael got the results and found that out of 28 entries, the range of ages provided by 16 members of the audience did not cover his daughter's age. Of the 12 successful entries (her age was 25) six entries had 25 as the upper bound. Some of the entries said that she was between 18 and 21, which Raphael took to mean that some members of the audience thought that they knew her if they assigned a 99th percentile probability to their guess (they didn't). His point was that even for Quafafewers (or maybe Quafafewtoomuchers given the results...) then guessing probabilities and appropriate ranges of distributions was not a strong point for many of the human race.
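To put a number on just how over-confident the room was, here is a small Python sketch. It computes the coverage of a set of intervals, and the probability of seeing 12 or fewer hits out of 28 if every interval really had contained the answer 99% of the time. The function names are mine, and treating each audience member's guess as an independent 99% interval is an assumption made purely for illustration.

```python
from math import comb


def coverage(intervals, truth):
    """Fraction of (low, high) intervals that contain the true value."""
    hits = sum(1 for low, high in intervals if low <= truth <= high)
    return hits / len(intervals)


def prob_at_most(hits, n, p):
    """Binomial probability of at most `hits` successes in n trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(hits + 1))


if __name__ == "__main__":
    # From the evening: 28 entries, of which 12 covered the true age.
    print(f"observed coverage: {12 / 28:.0%}")
    print(f"chance of <= 12 hits if truly 99% intervals: {prob_at_most(12, 28, 0.99):.1e}")
```

The second number is vanishingly small, which is another way of saying that the intervals people wrote down were nowhere near 99% intervals.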
Raphael then went on to illustrate his point above through saying that if you asked him whether he thought the Euro would collapse, then on balance he didn't think it was very likely that this will happen since he thinks that when forced Germany would ultimately come to the rescue. However if you were assessing the range of outcomes that might fit within the 99th percentile distribution of outcomes, then Raphael said that the collapse of the Euro should be included as a possible scenario but that this possibility was not currently being included in the scenarios used by the major financial institutions. Off on another (related) digression, Raphael said that he compared LTCM with having the best team of Formula 1 drivers in the world that given a F1 track would drive the fastest and win everything, but if forced to drive an F1 car on a very bumpy road this team would be crashing much more than most, regardless of their talent or the capabilities of their vehicle.
Barry concluded the evening by saying that he would speak first, otherwise he would not get the chance to given Raphael's performance so far. Again it was a digression from hedge fund risk management, but he said that many have suggested that risk managers need to do more of what they were already doing (more scenarios, more analysis, more transparency etc). Barry wondered whether, rather than just doing more, the paradigm was wrong and risk managers should be thinking differently rather than just doing more of the same. He gave one specific example of speaking to a structurer in a bank recently and asking, given the higher hurdle rates for capital, whether the structurer should consider investing in riskier products. The answer from the structurer was that the bank was planning to meet about this later that day, so once again it would seem that what the regulators want to happen is not necessarily what they are going to get...
Posted by Brian Sentance | 30 August 2012 | 1:44 pm
Seems like Thomson Reuters have finally caught up (been forced to catch up?) with Bloomberg on the more open usage of instrument codes with the lifting of restrictions on the usage of RICs (see Finextra article). They have not gone as far as open sourcing RIC codes as Bloomberg has with its Open Symbology initiative. Bloomberg are still going to push the virtues of going fully open source with their codes (see comment on the end of the Finextra article), but with RICs being usable outside of Thomson Reuters systems and customers, at least the industry seems to be making some pragmatic steps forward on instrument identifiers.
Posted by Brian Sentance | 29 June 2012 | 5:00 pm
Just a quick note to say that the video, presentations and supporting documents have now gone up for our recent Wilmott event with Numerix on OIS Curves and Libor in New York. Somewhat topical at the moment given the current bad press for Barclays.
Posted by Brian Sentance | 29 June 2012 | 2:20 pm
Some recent thoughts in Advanced Trading on turning data management on its head, and how to extend data management initiatives from the back office into both risk management and the front office.
Posted by Brian Sentance | 22 June 2012 | 2:17 pm
I attended the Financial Information Summit event on Tuesday, organized in Paris by Inside Market Data and Inside Reference Data.
Unsurprisingly, most of the topics discussed during the panels focused on reducing data costs, managing the vendor relationship strategically, LEI and building sound data management strategies.
Here is a (very) brief summary of the key points touched which generated a good debate from both panellists and audience:
Lowering data costs and cost containment panels
- Make end-users aware of how much they pay for that data so that they will have a different perspective when deciding if the data is really needed or a "nice to have"
- Build a strong relationship with the data vendor: you work for the same aim and share the same industry issues
- Evaluate niche data providers who are often more flexible and willing to assist while still providing high quality data
- Strategic vendor management is needed within financial institutions: this should be an on-going process aimed to improve contract mgmt for data licenses
- A centralized data management strategy and consolidation of processes and data feeds allow cost containment (something that Xenomorph have long been advocating)
- Accuracy and timeliness of data is essential: make sure your vendor understands your needs
- Negotiate redistribution costs to downstream systems
One good point was made by David Berry, IPUG-Cossiom, on the acquisition of data management software vendors by the same data providers (referring to the Markit-Cadis and PolarLake-Bloomberg deals) and stating that it will be tricky to see how the two business units will be managed "separately" (if kept separated...I know what you are thinking!).
There were also interesting case studies and examples supporting the points above. Many panellists pointed out how difficult it can be to obtain high quality data from vendors and that only regulation can actually improve the standards. Despite the concerns, I must recognize that many firms are now pro-actively approaching the issue and trying to deal with the problem in a strategic manner. For example, Hans Henrik Hovmand, Market Data Manager, Danske Bank, explained how Danske Bank are in the process of adopting a strategic vendor system made of 4 steps: assessing vendor, classifying vendor, deciding what to do with the vendor and creating a business plan. Vendors are classified as strategic, tactical, legacy or emerging. Based on this classification, then the "bad" vendors are evaluated to verify if they are enhancing data quality. This vendor landscape is used both internally and externally during negotiation and Hovmand was confident it will help Danske Bank to contain costs and get more for the same price.
I also enjoyed the panel on Building a sound management strategy where Alain Robert-Dauton, Sycomore Asset Management, was speaking. He highlighted how asset managers, in particular smaller firms, are now feeling the pressure of regulators but at the same time are less prepared to deal with compliance than larger investment banks. He recognized that asset managers need to invest in a sound risk data management strategy and supporting technology, with regulators demanding more details, reports and high quality data.
For a summary on what was said on LEI, it seems that most financial institutions are still unprepared on how it should be implemented, due to the uncertainty around it, but I refer you to an article from Nicholas Hamilton in Inside Reference Data for a clear picture of what was discussed during the panel.
Looking forward, the panellists agreed that the main challenge is and will be managing the increasing volume of data. Though, as Tom Dalglish affirmed, the market is still not ready for the cloud, given that not much has been done in terms of legislation. Watch out!
The full agenda of the event is available here.
Posted by Sara Verri | 14 June 2012 | 5:54 pm
Quick plug for Xenomorph's Wilmott Forum Event on OIS curves tomorrow in downtown Manhattan. The event is done in partnership with Numerix, and will be looking at the issue of OIS vs. Libor discounting from the point of view of a practitioner, financial engineer and systems developer. You can register for the event here, and so we hope to see you at 6pm for some great talks and some drinks/socialising afterwards.
Posted by Brian Sentance | 30 May 2012 | 2:07 pm
Good Quafafew event in NYC this week, with Michael Markov of MPI on "Hedge Fund Replication: Methods, Challenges and Benefits for Investors". To cut a relatively long but enjoyable presentation short, Michael presented some interesting empirical evidence about hedge fund performance.
Firstly, he showed how many (most) hedge fund styles were able to deliver performance that had better risk/return profile than many mainstream investment portfolios, obviously including the ubiquitous 60% in equity 40% in bonds strategy. Given this relative outperformance in terms of risk and return for many hedge fund styles, Michael put forward the idea that asset managers seeking to invest in hedge funds should take more interest in indices of hedge funds than is currently the case.
For a particular hedge fund style, to obtain a performance level that was better than 50% of the managers was actually quite good, particularly when he showed that the risk level was approximately better than 75% of the hedge funds within each class. Also, when you look at the performance over longer time periods (rolling 3 years say) an index outperformed many more of the funds in a particular investment style (sounds like a bit of the advantages of geometric vs. arithmetic averaging at work somewhere in this to me).
As an aside, he said that most hedge fund replication products do not mention tracking error and often instead talk about near perfect correlation with the hedge fund index being replicated. He was at pains to point out that it was possible to construct portfolios with near perfect correlation that have massive tracking errors, and so investors in these products should be aware of this marketing tactic (or failing, depending on your viewpoint).
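To make the correlation point concrete, here is a minimal sketch in Python. It is my own illustration rather than anything shown in the talk: the index returns are made-up numbers, the "replication product" is simply three times the index return, and tracking error is assumed to mean the standard deviation of the return differences against the index.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def stdev(xs):
    # Sample standard deviation
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def correlation(xs, ys):
    # Pearson correlation coefficient
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def tracking_error(portfolio, index):
    # Assumed definition: standard deviation of the return differences
    return stdev([p - i for p, i in zip(portfolio, index)])

# Hypothetical monthly index returns (illustrative numbers only)
index_returns = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020]

# Hypothetical "replication" product: three times the index return
leveraged = [3 * r for r in index_returns]

print("correlation:   ", correlation(leveraged, index_returns))
print("tracking error:", tracking_error(leveraged, index_returns))
print("index vol:     ", stdev(index_returns))
```

The leveraged series is perfectly correlated with the index, yet its tracking error is twice the volatility of the index itself, which is the gap between the two measures that Michael was warning about.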
Michael showed some good examples of how his system had replicated the performance of a particular hedge fund style index, and how this uncovered what kinds of investments were broadly being made by the hedge fund industry during each time period under consideration. He is already doing some work with some regulators on this, but most interestingly he showed how he took a few hedge funds that were later found to be involved in fraudulent activity, and worked backwards to find out what his system thought were the investments being made.
He then showed how, by taking the performance of the replicated fund away from the actual hedge fund results posted, the residual performance for these fraudulent funds was very large, and he implored investors in "stellar" performing hedge funds to do this analysis and really quiz the hedge fund manager for where this massive residual performance actually comes from before deciding to invest. In summary a good talk by an interesting speaker, which surprisingly for a New York Quafafew event was not interrupted too many times by questions from the hosts.
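For anyone wanting to try the residual check, the arithmetic is simple enough to sketch. This is my own simplified reading in Python, not MPI's method: the return figures are invented, the replicated returns are assumed to come from some replication model you already have, and returns are summed rather than compounded to keep things short.

```python
def residual_returns(actual, replicated):
    # Per-period return not explained by the replicating portfolio
    if len(actual) != len(replicated):
        raise ValueError("return series must have the same length")
    return [a - r for a, r in zip(actual, replicated)]

def share_unexplained(actual, replicated):
    # Fraction of the total (summed, not compounded) return left unexplained
    return sum(residual_returns(actual, replicated)) / sum(actual)

# Hypothetical quarterly returns: what the fund posted versus what a
# replicating portfolio of plausible investments would have earned
posted     = [0.030, 0.025, 0.035, 0.030]
replicated = [0.005, 0.000, 0.010, 0.005]

print(residual_returns(posted, replicated))
print(share_unexplained(posted, replicated))
```

In this made-up case most of the posted return is residual, which is exactly the situation where you would want the manager to explain where it comes from.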
Posted by Brian Sentance | 10 May 2012 | 7:44 pm
Xenomorph's analytics partner Numerix sponsored a PRMIA event at New York's Harvard Club this week on Credit Valuation Adjustment (CVA). The event also involved Microsoft, with a surprisingly relevant contribution to the evening on CVA and "Big Data" (I still don't feel comfortable losing the quotes yet, maybe soon...). Credit Valuation Adjustment seems to be the hot topic in risk management and pricing at the moment, with Numerix's competitor Quantifi having held another PRMIA event on CVA only a few months back.
The event started with an introduction to CVA from Aletta Ely of JP Morgan Chase. Aletta started by defining CVA as the market value of counterparty credit risk. I am new to CVA as a topic, and my own experience on any kind of adjustment in valuation for an instrument was back at JP Morgan in the mid-90s (those of you under 30 are allowed to start yawning at this point...). We used to maintain separate risk-free curves (what are they now?) and counterparty spread curves, which would be combined to discount the cashflows in the model.
Whilst such an adjustment could be calibrated to come up with an adjusted valuation which would be better than having no counterparty risk modelled at all, it seems one of the key aspects of how CVA differs is that a credit valuation adjustment needs to be done in the context of the whole portfolio of exposures to the counterparty, and not in isolation instrument by instrument. The fact that a trader in equity derivatives was long exposure to a counterparty cannot be looked at in isolation from a short exposure to a portfolio of swaps with the same counterparty on the fixed income desk.
Put another way, CVA only has context if we stand to lose money if our counterparty defaults, and so an aggregated approach is needed to calculate the size of the positive exposures to the counterparty over the lifetime of the portfolio. Also, given this one-sided payoff aspect of the CVA calculation, instrument types such as vanilla interest rate swaps suddenly move from being relatively simple instruments that can be priced off a single curve to instruments that need optionality to be modelled for the purposes of CVA.
So why has CVA become such a hot topic at the banks? Prior to the 2008/2009 crisis CVA was already around (credit risk has existed for a long time I guess, regardless of whether you regulate or report to it), but given that bank credit spreads were at that time consistently low and stable, CVA had minimal effects on valuations and P&L. Obviously with the advent of Lehmans this changed, and CVA has been pushed into prominence since it has directly affected P&L in a significant manner for many institutions (for example see these FT articles on Citi and JPMorgan).
A key and I think positive point for the whole industry is that CVA requires a completely multi-asset view, and given regulatory focus on CVA and capital adequacy, it will drive banks away from a siloed approach to data and valuation management. If capital is scarcer and more costly, then banks will invest in understanding both their aggregate CVA and the incremental contribution to CVA of a new trade in the context of all exposures to the counterparty. Looking at incremental CVA, you can see that this also drives investment into real or near-realtime CVA calculation, which brings me on to the next talks of the evening by Numerix on CVA calculation methods and a surprisingly good presentation on CVA and "Big Data" from David Cox of Microsoft.
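A tiny Python sketch may help with both points here. It is my own illustration with invented mark-to-market numbers, and it assumes that all trades with the counterparty can legally be netted against each other, which will not always be the case in practice.

```python
def netted_exposure(trade_values):
    # Exposure if the whole portfolio nets: we only lose if the net value is positive
    return max(sum(trade_values), 0.0)

def gross_exposure(trade_values):
    # Exposure if each trade is looked at in isolation
    return sum(max(v, 0.0) for v in trade_values)

def expected_positive_exposure(scenario_values):
    # Average of max(value, 0) across equally likely scenarios
    return sum(max(v, 0.0) for v in scenario_values) / len(scenario_values)

# Hypothetical mark-to-market values against one counterparty, by desk
trades = {"equity derivatives (long)": 5.0, "swap portfolio (short)": -3.0}

print(gross_exposure(trades.values()))   # trade-by-trade view
print(netted_exposure(trades.values()))  # whole-portfolio view

# Hypothetical future values of a single swap under five scenarios.
# The average value is zero, but the average positive exposure is not.
swap_scenarios = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(expected_positive_exposure(swap_scenarios))
```

The max(value, 0) term is the one-sided payoff: it has the same shape as an option payoff, which is why even a vanilla swap ends up needing its future value distribution modelled for CVA.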
Denny Yu of Numerix did a good job of explaining some of the methods of calculating CVA. In addition to being cross-asset, with all the implications that has for needing the ability to price anything, CVA is both data and computationally expensive. It requires both simulation of the scenarios for the default of counterparties through time and the valuation of cross-asset portfolios at different points in time. Denny mentioned techniques such as American Monte-Carlo to reduce the computation needed through using the same simulation paths for both default scenarios and valuation.
So on to Microsoft. I have seen some appalling presentations on "Big Data" recently, mainly from the larger software and hardware companies trying to jump on the marketing band wagon (main marketing premise: the data problems you have are "Big"...enough said I hope). Surprisingly, David Cox of Microsoft gave a very good presentation around the computation challenges of CVA, and how technologies such as Hadoop take the computational power closer to the data that needs acting on, bringing the analytics and data together. (As an aside, his presentation was notably "Metro" GUI in style, something that seems to work well for PowerPoint where the slide is very visual and it puts more emphasis on the speaker to overlay the information). David was obviously keen to talk up some of the cloud technology that Microsoft is currently pushing, but he knew the CVA business topic well and did a good job of telling a good story around CVA, "Big Data" and Cloud technologies. Fundamentally, his pitch was for banks and other institutions to become "Analytic Enterprises" with a common, scaleable and flexible infrastructure for data management and analysis.
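To give a feel for why this is expensive, here is a heavily simplified sketch in Python of the usual textbook discretisation of unilateral CVA. It is not Numerix's method and it is not American Monte-Carlo. The assumptions are mine: a flat hazard rate for the counterparty, a flat risk-free rate, a fixed recovery rate, default independent of exposure, and portfolio values per path that have already been simulated somewhere else.

```python
import math

def cva(exposure_paths, times, hazard_rate, recovery, risk_free_rate):
    """Simplified unilateral CVA.

    exposure_paths: list of paths, each a list of netted portfolio values
                    at the dates in `times` (assumed already simulated).
    Sums, over each time bucket:
        discount factor * expected positive exposure * default probability
    and scales by loss given default (1 - recovery).
    """
    n_paths = len(exposure_paths)
    total = 0.0
    prev_t = 0.0
    for i, t in enumerate(times):
        # Expected positive exposure at time t across all paths
        epe = sum(max(path[i], 0.0) for path in exposure_paths) / n_paths
        # Probability that the counterparty defaults between prev_t and t
        default_prob = math.exp(-hazard_rate * prev_t) - math.exp(-hazard_rate * t)
        discount = math.exp(-risk_free_rate * t)
        total += discount * epe * default_prob
        prev_t = t
    return (1.0 - recovery) * total

# Hypothetical inputs: three simulated paths valued at two future dates
paths = [[10.0, 12.0], [-5.0, 3.0], [2.0, -8.0]]
print(cva(paths, times=[0.5, 1.0], hazard_rate=0.02,
          recovery=0.4, risk_free_rate=0.01))
```

Even in this toy form you can see the cost: every path needs the whole counterparty portfolio revalued at every future date, and a real implementation has thousands of paths, many dates and many counterparties.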
In summary it was a great event - the Harvard Club is always worth a visit (bars and grandiose portraits as expected but also barber shop in the basement and squash courts in the loft!), the wine afterwards was tolerably good and the speakers were informative without over-selling their products or company. Quick thank you to Henry Hu of IBM for transportation on the night, and thanks also to Henry for sending through this link to a great introductory paper on CVA and credit risk from King's College London. Whilst the title of the King's paper is a bit long and scary, it takes the form of dialogue between a new employee and a CVA expert, and as such is very readable with lots of background links.
Posted by Brian Sentance | 13 April 2012 | 2:56 pm
NoSQL is an unfortunate name in my view for the loose family of non-relational database technologies associated with "Big Data". NotRelational might be a better description (catchy eh? thought not...), but either way I don't like the negatives in both of these titles, due to aesthetics and in this case because it could be taken to imply that these technologies are critical of SQL and relational technology that we have all been using for years. For those of you who are relatively new to NoSQL (which is most of us), then this link contains a great introduction. Also, if you can put up with a slightly annoying reporter, then the Cloudera CEO is worth a listen to on YouTube.
In my view NoSQL databases are complementary to relational technology, and as many have said relational tech and tabular data are not going away any time soon. Ironically, some of the NoSQL technologies need more standardised query languages to gain wider acceptance, and there will be no guessing which existing query language will be used for ideas in putting these new languages together (at this point as an example I will now say SPARQL, not that this should be taken to mean that I know a lot about this, but that has never stopped me before...)
Going back into the distant history of Xenomorph and our XDB database technology, then when we started in 1995 the fact that we then used a proprietary database technology was sometimes a mixed blessing on sales. The XDB database technology we had at the time was based around answering a specific question, which was "give me all of the history for this attribute of this instrument as quickly as possible".
The risk managers and traders loved the performance aspects of our object/time series database - I remember one client with a historical VaR calc that we got running in around 30 minutes on a laptop PC that was taking 12 hours in an RDBMS on a (then quite meaty) Sun Sparc box. It was a great example of how specific database technology designed for specific problems could offer performance that was not possible from more generic relational technology. The use of our database for these problems was never intended as a replacement for relational databases dealing with relational-type "set-based" problems though; it was complementary technology designed for very specific problem sets.
The technologists were much more reserved: some were more accepting and knew of products such as FAME around then, but some were sceptical over the use of non-standard DBMS tech. Looking back, I think this attitude was due in part to a desire to build their own vector/time series store, but also understandably (but incorrectly) they were concerned that our proprietary database would require specialist database admin skills. Not that the mainstream RDBMS systems were expensive or specialist to maintain then (Oracle DBA anyone?), but many proprietary database systems with proprietary languages can require expensive and on-going specialist consultant support even today.
The feedback from our clients and sales prospects was that our database performance was liked, but that the proprietary database admin aspects were sometimes a sales objection, and this caused us to take a look at hosting some of our vector database structures in Microsoft SQL Server. A long time back we had already implemented a layer within our analytics and data management system where we could replace our XDB database with other databases, most notably FAME. You can see a simple overview of the architecture in the diagram below, where other non-XDB databases (and datafeeds) can be "plugged in" to our TimeScape system without affecting the APIs or indeed the object data model being used by the client:
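For those who have not come across the idea, the sketch below shows in Python why a store organised around that one question can be so quick. To be clear, this is not how XDB is implemented. It is a toy illustration with hypothetical class and method names, where each (instrument, attribute) pair owns one contiguous, date-ordered series, so that asking for the full history is a single lookup rather than a filter and sort over a large table of rows.

```python
from bisect import insort
from collections import defaultdict
from datetime import date

class SeriesStore:
    """Toy time series store keyed by (instrument, attribute)."""

    def __init__(self):
        # Each key maps to a list of (date, value) pairs kept in date order
        self._series = defaultdict(list)

    def add(self, instrument, attribute, when, value):
        # Insert in sorted position so history never needs sorting on read
        insort(self._series[(instrument, attribute)], (when, value))

    def history(self, instrument, attribute):
        # One dictionary lookup returns the whole series, already ordered
        return list(self._series.get((instrument, attribute), []))

store = SeriesStore()
store.add("VOD.L", "close", date(2012, 4, 3), 172.5)
store.add("VOD.L", "close", date(2012, 4, 2), 173.1)
store.add("VOD.L", "volume", date(2012, 4, 2), 1000000)

for when, value in store.history("VOD.L", "close"):
    print(when, value)
```

The trade-off is the one described above: a layout like this is very good at one kind of question and is no substitute for a relational database on set-based problems.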
Data Unification Layer
Using this layer, we then worked with the Microsoft UK SQL team to implement/host some of our vector database structures inside of Microsoft SQL Server. As a result, we ended up with a database engine that maintained the performance aspects of our proprietary database, but offered clients a standards-based DBMS for maintaining and managing the database. This is going back a few years, but we tested this database at Microsoft with a 12TB database (since this was then the largest disk they had available), but still this contained 500 billion tick data records which even today could be considered "Big" (if indeed I fully understand "Big" these days?). So you can see some of the technical effort we put into getting non-mainstream database technology to be more acceptable to an audience adopting a "SQL is everything" mantra.
Fast forward to 2012, and the explosion of interest in "Big Data" (I guess I should drop the quotes soon?) and in NoSQL databases. It finally seems that due to the usage of these technologies on internet data problems that no relational database could address, the technology community seems to have much more willingness to accept non-RDBMS technology where the problem being addressed warrants it - I guess for me and Xenomorph it has been a long (and mostly enjoyable) journey from 1995 to 2012 and it is great to see a more open-minded approach being taken towards database technology and the recognition of the benefits of specific databases for (some) specific problems. Hopefully some good news on TimeScape and NoSQL technologies to follow in coming months - this is an exciting time to be involved in analytics and data management in financial markets and this tech couldn't come a moment too soon given the new reporting requirements being requested by regulators.
Posted by Brian Sentance | 4 April 2012 | 4:54 pm
Data visualisation has always been an interesting subject in financial markets, one that seems to always have been talked about as the next big thing in finance, but one that always seems to fail to meet expectations (of visualisation software vendors mostly...). I went along to an event put on by the FT today about what they term "infographics", set in the Vanderbilt Hall at Grand Central Station New York:
One of my first experiences of data visualisation was showing a partner company, Visual Numerix (VNI), around the Bankers Trust's London trading floor in 1995. The VNI folks were talking grandly about visualising a "golden corn field of trading opportunities, with the wind of market change forcing the blades of corn to change in size and orientation" - whilst maybe they had been under the influence of illegal substances when dreaming up this description, their disappointment was palpable at trading screen after trading screen full of spreadsheets containing "numbers". Sure there was some charting being used, but mostly and understandably the traders were very focussed on the numbers of the deal that they were about to do (or had just done).
I guess this theme ultimately continues today to a large extent, although given the (media hyped) "explosion of data", visualisation is a useful technique for filtering down a large (er, can I use the word "big"?) data problem to get at the data you really want to work with (quick plug - the next version of our TimeScape product includes graphical heatmaps for looking for data exceptions, statistical anomalies and trading opportunities, which confirms Xenomorph buys into at least this aspect of the "filtering" benefits of visualisation).
Coming back to the presentation, Gillian Tett of the FT said at the event today that "infographics" is cutting edge technology - not sure I would agree although given the location some of the images were very good, like this one representing the stock pile of cash that major corporations have been hoarding (i.e. not spending) over recent years:
There were also some "interactive" aspects to the display whereby stepping on part of the hall floor changed the graphic displayed. Biggest problem the FT had with this was persuading anyone to step into the middle of the floor to use it (more of an English reaction to such a request, so the reticence from New Yorkers surprised me):
Videos from the presentation can be found at http://ftgraphicworld.ft.com/ and the journalist involved, David McCandless, is worth a listen to for the different ways he looks at data both on the FT site and in a TED presentation.
Posted by Brian Sentance | 27 March 2012 | 4:54 pm
I went along to "Demystifying Financial Services Semantics" on Tuesday, a one day conference put together by the EDMCouncil and the Object Management Group. Firstly, what are semantics? Good question, to which the general answer is that semantics are the "study of meaning". Secondly, were semantics demystified during the day? - sadly for me I would say that they weren't, but ironically I would put that down mainly to poor presentations rather than a lack of substance, but more of that later.
Quoting from Euzenat (no expert me, just search for Semantics in Wikipedia), semantics "provides the rules for interpreting the syntax which do not provide the meaning directly but constrains the possible interpretations of what is declared." John Bottega (now of BofA) gave an illustration of this in his welcoming speech at the conference by introducing himself and the day in Pig Latin, where all of the information he wanted to convey was contained in what he said, but only a small minority of the audience who knew the rules of Pig Latin understood what he was saying. The rest of us were "upidstay"...
Putting this more in the context of financial markets technology and data management, the main use of semantics and semantic data models seems to be as a conceptual data model technique that abstracts away from any particular data model or database implementation. To humour the many disciples of the "Church of Semantics", such a conceptual data model would also be self-describing in nature, such that you would not need a separate meta data model to understand it. For example take a look at say the equity example from what Mike Aitkin and the EDM Council have put together so far with their "Semantics Repository".
Abstraction and self-description are not new techniques (OO/SOA design anyone?) but I guess even the semantic experts are not claiming that all is new with semantics. So what are they saying? The main themes from the day seem to be that Semantics:
- can bridge the gaps between business understanding and technology understanding
- can reduce the innumerable transformations of data that go on within large organisations
- is scaleable and adaptable to change and new business requirements
- facilitates greater and more granular analysis of data
- reduces the cost of data management
- enables more efficient business processes
Certainly the issue of business and technology not understanding each other (enough) has been a constant theme of most of my time working in financial services (and indeed is one of the gaps we bridge here at Xenomorph). For example, one project I heard of a few years back was where an IT department had just delivered a tick database project, only for the business users to find that it did not cope with stock splits and for their purposes was unusable for data analysis. The business people had assumed that IT would know about the need for stock split adjustments, and as such had never felt the need to explicitly specify the requirement. The IT people obviously did not know the business domain well enough to catch this lack of specification.
I think there is a need to involve business people in the design of systems, particularly at the data level (whilst not quite a "semantic" data model, the data model in TimeScape presents business objects and business data types to the end user, so both business people and technologists can use it without showing any detail of an underlying table or physical data structure). You can see a lot of this around with the likes of CADIS pushing its "you don't need a fixed data model" ETL/no datawarehouse type approach against the more rigid (and to some, more complete) data models/datawarehouses of the likes of Asset Control and GoldenSource. You also get the likes of Polarlake pushing its own semantic web and big data approach to data management as a next stage on from relational data models (however I get a bit worried when "semantic web" and "big data" are used together, sounds like we are heading into marketing hype overdrive, warp factor 11...)
So if Semantics is to become prevalent and deliver some of these benefits in bringing greater understanding between business staff and technologists, the first thing that has to be addressed is that Semantics is a techy topic at the moment, which would cause drooping eyelids on even the most technically enthused members of the business. Ontology, OWL, RDF, CLIF are all great if you are already in the know, but guaranteed to turn a non-technical audience off if trying to understand (demystify?) Semantics in financial markets technology.
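For the technologists wondering what the missing requirement looks like, here is a minimal Python sketch of backward split adjustment. The prices, dates and function name are my own invention for illustration, and a real system would also need to adjust volumes and handle dividends and other corporate actions.

```python
from datetime import date

def adjust_for_splits(prices, splits):
    """Back-adjust historical prices for stock splits.

    prices: list of (date, price) pairs
    splits: list of (ex_date, ratio) pairs, e.g. ratio 2.0 for a 2-for-1 split
    Prices before each ex-date are divided by the split ratio so that the
    whole history is expressed in terms of today's shares.
    """
    adjusted = []
    for day, price in prices:
        factor = 1.0
        for ex_date, ratio in splits:
            if day < ex_date:
                factor *= ratio
        adjusted.append((day, price / factor))
    return adjusted

# Hypothetical closing prices around a 2-for-1 split on 4 January
raw = [(date(2012, 1, 2), 100.0),
       (date(2012, 1, 3), 102.0),
       (date(2012, 1, 4), 51.5)]
splits = [(date(2012, 1, 4), 2.0)]

adjusted = adjust_for_splits(raw, splits)
raw_return = raw[2][1] / raw[1][1] - 1
adjusted_return = adjusted[2][1] / adjusted[1][1] - 1
print(raw_return, adjusted_return)
```

On the raw data the stock appears to halve in value overnight, whereas on the adjusted data it rises by about one percent, and any volatility or VaR calculation run on the unadjusted series would be badly distorted.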
Looking at the business benefits, many of the presenters (particularly vendors) put forward slides where "BAM! Look at what semantics delivered here!" was the mantra, whereas I was left with a huge gap in seeing how what they had explained had actually translated into the benefits they were shouting about. There needed to be a much more practical focus to these presentations, rather than semantic "magic" delivering a 50% reduction in cost with no supporting detail of just how this was achieved. Some of the "magic" seemed to be that there was no unravelling of any relational data model to effect new attributes and meanings in the semantic model, but I would suggest that abstracting away from relational representation has always been a good thing if you want to avoid collapsing under the weight of database upgrades, so nothing too new there I would suggest but maybe a new approach for some.
So in summary I was a little disappointed by the day, especially given the "Demystifying" title, although there were a few highlights with Mike Bennett's talk on FIBO (Financial Instruments Business Ontology) being interesting (sorry to use the "O" word). The discussion of the XBRL success story was also good, especially how regulators mandating this standard had enforced its adoption, but from its adoption many end consumers were now doing more with the data, enhancing its adoption further. In fact the XBRL story seemed to be a model for how regulators could improve the world of data in financial markets, through the provision and enforcement of the data semantics to be used with each new reporting requirement as they are mandated. In summary, a mixed day and one in which I learned that the technical fog that surrounds semantics in financial markets technology is only just beginning to clear.
Posted by Brian Sentance | 15 March 2012 | 2:58 pm