Market Data and Reference Data 101: A Brief Introduction
This is the first in a series of short blog posts that aim to clarify the basics of financial data management. In this article we discuss what market data is, what reference data is, and some of the differences between them. It is intended as a primer for anyone starting out in the financial information industry.
What is market data?
This may sound like a tautology, but market data is data that originates from a market. This can cover a very wide range of markets, given that there are many different asset classes (including equities, bonds, rates, currencies, commodities, emissions), instrument types (cash, derivatives, structured products), and market models (order-driven and dealer-led) being operated.
Historically, markets operated on an open outcry basis, which meant price formation took place on the exchange floor and was communicated to the broader market through various means (first by ticker tape and later through successive iterations of digital technology). In the present day, most exchange floors have been replaced by electronic matching engines. Order-driven markets typically operate a central limit order book, whose data tends to be licensed on the basis of how much ‘depth’ is offered: ‘Level 1’ data represents the best bid and offer, while ‘Level 2’ offers further visibility into the order book. Because data from a central limit order book is ‘executable’, it is typically latency sensitive: traders want to be the first to react to new information and seize potential arbitrage opportunities.
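To make the distinction between ‘Level 1’ and ‘Level 2’ more concrete, here is a minimal Python sketch. The prices, quantities, and function names (`level_2`, `level_1`) are illustrative assumptions and do not reflect any particular exchange's feed format; real feeds also define their own depth limits and message layouts.

```python
from typing import Dict, List, Tuple

PriceLevel = Tuple[float, int]  # (price, total resting quantity)

# Hypothetical order book, aggregated by price: price -> quantity.
bids: Dict[float, int] = {99.98: 500, 99.97: 1200, 99.95: 300}
asks: Dict[float, int] = {100.01: 400, 100.02: 900, 100.05: 250}


def level_2(
    bids: Dict[float, int], asks: Dict[float, int], depth: int = 3
) -> Tuple[List[PriceLevel], List[PriceLevel]]:
    """Return up to `depth` price levels per side, best price first."""
    # The best bid is the highest price; the best offer is the lowest.
    best_bids = sorted(bids.items(), key=lambda level: level[0], reverse=True)
    best_asks = sorted(asks.items(), key=lambda level: level[0])
    return best_bids[:depth], best_asks[:depth]


def level_1(
    bids: Dict[float, int], asks: Dict[float, int]
) -> Tuple[PriceLevel, PriceLevel]:
    """Return only the top of book: best bid and best offer.

    Assumes both sides of the book hold at least one order.
    """
    top_bids, top_asks = level_2(bids, asks, depth=1)
    return top_bids[0], top_asks[0]
```

In this sketch, ‘Level 1’ is simply the first row of the ‘Level 2’ view on each side, which is why deeper data is usually treated as the richer product.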
Dealer-led markets typically provide feeds that are indicative rather than ‘executable’. In order to receive an executable price, a liquidity seeker would typically need to issue a request for quote (RFQ), which would then prompt a dealer (or dealers) to respond with a quote, typically valid for a limited period of time. Alternatively, some markets may also operate a request for stream (RFS) service, in which a continuously updated quote is streamed to the client so that an executable price remains available for a longer period.
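The idea of a quote that is only valid for a limited period can be sketched in a few lines of Python. This is a simplified illustration, not a description of any real RFQ protocol: the `Quote` class, its fields, and the `best_offer` helper are all hypothetical names, and times are plain numbers of seconds.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Quote:
    """A dealer's response to an RFQ (hypothetical structure)."""
    dealer: str
    bid: float
    offer: float
    issued_at: float  # time the quote was issued, in seconds
    valid_for: float  # how long the quote stays executable, in seconds

    def is_executable(self, now: float) -> bool:
        # The quote can be traded on only inside its validity window.
        return self.issued_at <= now < self.issued_at + self.valid_for


def best_offer(quotes: Iterable[Quote], now: float) -> Optional[Quote]:
    """Return the lowest offer among quotes still valid at `now`, if any."""
    live = [q for q in quotes if q.is_executable(now)]
    return min(live, key=lambda q: q.offer, default=None)
```

A buyer calling `best_offer` shortly after sending the RFQ would see every dealer's response, whereas a later call would see only the quotes that have not yet expired. An RFS service could be thought of as the dealer repeatedly replacing an expiring quote with a fresh one.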
It is important to note that some markets also operate as dark pools, where pre-trade data is deliberately concealed in an attempt to match buyers and sellers covertly to minimize market impact.
What else can be included under market data?
In its narrowest sense, ‘market data’ is data that originates from financial markets, but a broader definition would also span information about markets. This can include any information used to support trading and investment decisions, and is typically found in professional terminals such as those provided by Bloomberg and Refinitiv. Under this broader definition, relevant information includes news, research & ratings, historical charts and analytics, company fundamentals, indexes and benchmarks, fund flows, share lending data, as well as a burgeoning variety of alternative data (for example, satellite images showing car park occupancy as a proxy for the performance of the retail sector).
What is reference data?
Reference data is typically used to support middle- and back-office functions, and includes information such as instrument identifiers (unique codes that are used to identify a security or derivative), market identifiers (where that instrument was traded), and counterparty identifiers (information relating to legal entities that are counterparties to a trade). Another important category of reference data is corporate actions (such as dividend payments in the case of equities, or coupon payments in the case of bonds), as these can be relatively cumbersome to process. Reference data can also include pricing information, although prices are typically provided on a snapshot basis to support specific functions (such as end-of-day portfolio valuations), rather than as a streaming real-time feed, as is the case with market data.
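As a rough illustration of how snapshot prices and instrument identifiers support an end-of-day valuation, consider the following Python sketch. The identifiers, instrument names, and prices are invented for the example, and the calculation is deliberately simplified (a single currency, price per unit, no corporate actions).

```python
from typing import Dict

# Hypothetical end-of-day reference snapshot, keyed by instrument identifier.
reference_data: Dict[str, dict] = {
    "ID-0001": {"name": "Example Equity A", "close": 25.0},
    "ID-0002": {"name": "Example Equity B", "close": 40.5},
}

# Hypothetical holdings: instrument identifier -> number of units held.
positions: Dict[str, int] = {"ID-0001": 100, "ID-0002": 20}


def end_of_day_value(
    positions: Dict[str, int], reference_data: Dict[str, dict]
) -> float:
    """Value a portfolio using snapshot closing prices."""
    # Every position must resolve to a reference record; an unknown
    # identifier is an error rather than something to guess at.
    missing = [ident for ident in positions if ident not in reference_data]
    if missing:
        raise KeyError(f"no reference data for: {missing}")
    return sum(
        quantity * reference_data[ident]["close"]
        for ident, quantity in positions.items()
    )
```

The point of the sketch is that the identifier is the join key: the valuation is only as reliable as the mapping between the positions and the reference snapshot.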
What is the difference between market data and reference data?
The principal difference between these data sets can be found in the way they are used. Market data is used primarily for decision support – providing information for traders and investors to evaluate whether to buy or sell an instrument. Reference data is used primarily for transaction processing or other middle- and back-office functions (such as end-of-day portfolio valuations, risk reporting, margin calculations etc.). Market data and reference data may also be distributed via different means (market data is typically available as a real-time / streaming feed, while reference data is provided as a batch update or snapshot) and are subject to different licensing terms and conditions.