BMLL brings historical derivatives data to Databricks

15th April, 2026

Zak Jakubowski

The London-based data and analytics company has made its historical market data available via Databricks, giving trading firms direct access to large-scale datasets within their existing analytics environments.

The launch on Wednesday allows users to work with BMLL’s equities, ETFs, futures and options data inside the Databricks platform, reducing integration complexity and accelerating analysis across use cases such as transaction cost analysis, backtesting and market surveillance.

The move reflects growing demand for flexible data delivery as derivatives desks expand quantitative strategies and deploy artificial intelligence across trading workflows.

Paul Humphrey, chief executive of BMLL, said in a release: "Both the sophistication of the market and the demand for high-quality historical market data continue to grow. This is why firms rely on our robust engineering and normalisation processes to make effective use of our granular data, and we are excited to make BMLL datasets available on the Databricks platform.

"We essentially meet our customers where they need us to be, within their existing workflows. They now have additional choice and flexibility in how they access our data to carry out faster and more efficient analysis, at scale."

Humphrey told FOW in an interview that the strategy is driven by client demand rather than a preferred distribution model. "We do not force delivery mechanisms on clients. However they want that data, we try to accommodate. It is on us to make our data available in as many places as we can, not force one method on the customer."

Volatility driving derivatives data demand

The launch comes as demand for derivatives data increases during periods of market stress, with recent geopolitical volatility reinforcing the need for scalable analytics.

Humphrey said activity linked to events such as the US-Iran conflict is directly reflected in data volumes. "When volatile market events like this happen, you can see it in the weight of the data and the amount of activity," he added. "The bigger the data gets, the more difficult it is for people to grapple with it.

"You might see volatility volumes at the sharp end in oil derivatives, but you could say the same about gold or US dollar futures. These markets are interconnected and we see increased demand across the board."

Shift towards deeper market analysis

The expansion also reflects changes in how derivatives desks use historical data, particularly in execution analytics and market structure analysis.

"Historically, people focused on Level 1 data, but more and more customers are looking deeper into the order book activity," Humphrey said. "That is where Level 3 data becomes critical."

Level 3 data is especially relevant for transaction cost analysis, where understanding order queuing and liquidity dynamics can influence execution quality.

"The limit order book is constantly changing. Unless you have Level 3 data, it is very difficult to understand what is happening beneath the surface and how that impacts execution outcomes," he said.

This shift aligns with broader trends in execution analytics. In previous FOW coverage of BMLL’s partnership with Tradefeedr, Humphrey said: "I think the days of TCA being a checkbox are gone. Customers are looking for execution alpha from it these days."

From delivery to workflow integration

The Databricks integration reflects a broader move towards embedding data within client workflows rather than delivering it as a standalone product.

Humphrey said firms increasingly want to combine internal and external datasets within their own environments. "There are firms that want to merge their private data with market data in an environment they host. Some do that in Snowflake, some in Databricks. It comes back to making sure we are available where our clients are working."

This approach supports more advanced analytics, including combining proprietary trading data with market data for benchmarking and strategy development. It also builds on BMLL’s broader push to deliver consistent, multi-asset datasets across trading desks.

AI accelerating infrastructure change

The move is also linked to growing adoption of artificial intelligence across trading and analytics functions.

Humphrey said removing the data engineering burden is becoming essential. "It is mission critical. The race is on for insight, and firms cannot afford to have their best people cleaning data instead of building models."

He added that the source of competitive advantage has shifted. "Owning the data used to be the advantage. It is not anymore. The edge now comes from what you do with it."

This is driving demand for solutions that provide clean, structured data within environments where models can be deployed quickly.

Cost, scale and efficiency

Handling derivatives data at scale remains a challenge as firms expand coverage across global markets. Humphrey acknowledged the cost involved. "There is no cheap way to do this properly. You need to buy large amounts of historical data, maintain it and process it at scale. It is an expensive business to get off the ground."

However, firms are increasingly focusing on total cost of ownership, including storage, compute and the cost of data engineering.

"Firms are looking at the cost of storing data, the cost of cleaning it and the cost of time. While they are doing that, the market is moving on," he said.

That shift is reinforcing demand for integrated data solutions that reduce operational overhead and accelerate time to insight.

Humphrey said this is changing how market data is consumed. "We are not selling market data. We are selling efficiency."

In February, BMLL and Features Analytics partnered to introduce a new category of surveillance benchmarking tools grounded in reconstructed order book data and AI-driven detection models.