Technologies derived from the cellular and embedded markets are trickling into trading systems as quants in pursuit of ever-greater performance eschew brute force and embrace crossover tools whose lightweight profiles and high efficiency are major selling points.
Some examples are ARM processors, Field Programmable Gate Arrays (FPGAs), and embedded in-memory database systems (IMDSs). These technologies might have roots in “small” systems (i.e. embedded applications with limited CPU, memory and OS resources), but they can help tackle the computationally intensive challenges of capital markets, and may provide the next big leap in financial system performance, determinism, and cost-efficiency.
Both buy- and sell-side firms have historically scaled up processing power or added high-performance networking to achieve incremental gains, but costs have risen and bottlenecks persist.
It's well established that there are diminishing returns on these brute-force methods, not to mention increasing data center costs as processors demand more energy and produce more heat. There are also latency penalties that result from moving data through networks and staging data from disk.
ARM processors are commonly found in embedded systems and in smartphones and other popular consumer electronics. The same characteristics that make ARM popular in the consumer market – low power consumption and high density – also make these processors well suited for financial services.
ARM processors promise to scale extremely well when deployed together in server-on-a-chip configurations with ultra-fast, high-bandwidth interconnects between processors. When used in trading systems, the architecture can cut data center costs by significantly reducing power and space requirements.
Major OEMs have begun selling server hardware equipped with ARM processors. HP and Calxeda, an ARM chipmaker, have announced a partnership whose servers could become an alternative to off-the-shelf x86 hardware in the datacenter.
FPGAs are programmable integrated circuits that are energy and space efficient, and accelerate trading by increasing parallelism and eliminating operating system overhead. Vendors typically supply the chips on boards complete with all surrounding circuitry.
Systems are configured through specialised hardware description languages that sit only slightly above assembly level. Each FPGA can be dedicated to solving a specific problem such as Monte Carlo analysis, and – speaking of "small" – they operate in megahertz rather than gigahertz, making them more energy efficient (quants have found that higher clock speed does not always guarantee better performance). Note that this efficiency depends on the application, the algorithms involved, and other factors.
FPGAs perform many calculations in parallel and eliminate the "jitter" that arises when many tasks running on a CPU compete for processing resources. The Linley Group has found that an FPGA can offer 40 times greater performance than a programmable digital signal processor, or a 30 times better price/performance ratio. Linley says that the FPGA sector grew 51% in 2010.
The same kind of specialisation is happening in the way data is sorted, stored and retrieved. Today, demanding financial applications are likely to rely on in-memory database systems, which store records directly in main memory to eliminate disk and file I/O, caching and other sources of latency.
Newer to the financial sphere are embedded IMDSs, in which database functions operate within the application process, eliminating separate client and server modules and their inter-process communication (IPC) overhead. The in-process design results in a shorter execution path: fewer CPU instructions are required to carry out a given operation. In isolation, of course, a single instruction takes almost no time.
But in aggregate – e.g., in data-intensive applications that loop as they sort and filter records – the savings in CPU operations can shave milliseconds (or more) from the trade cycle.
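To make the execution-path point concrete, here is a minimal sketch in C. This is not any real IMDS API; all names and structures are invented for illustration. The point is that when the "engine" lives in the application's own process, an insert or lookup is an ordinary function call into local memory, with no IPC, sockets or context switches on the path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative sketch only -- not a real IMDS API. It shows why an
   in-process engine's execution path is short: every operation is an
   ordinary function call into the same address space. */

typedef struct {
    long   order_id;
    char   symbol[8];
    double price;
} Order;

#define MAX_ORDERS 1024

static Order  g_table[MAX_ORDERS];   /* records live in main memory */
static size_t g_count = 0;

/* "Insert" is a direct write into process memory -- no marshalling,
   no round trip to a server process. */
int db_insert(long order_id, const char *symbol, double price)
{
    if (g_count >= MAX_ORDERS)
        return -1;
    g_table[g_count].order_id = order_id;
    strncpy(g_table[g_count].symbol, symbol,
            sizeof g_table[g_count].symbol - 1);
    g_table[g_count].symbol[sizeof g_table[g_count].symbol - 1] = '\0';
    g_table[g_count].price = price;
    g_count++;
    return 0;
}

/* "Lookup" is a function call plus a scan of local memory. A real IMDS
   would use an index rather than a linear scan, but the execution path
   would still be a handful of in-process instructions per operation. */
const Order *db_find(long order_id)
{
    for (size_t i = 0; i < g_count; i++)
        if (g_table[i].order_id == order_id)
            return &g_table[i];
    return NULL;
}
```

In a client/server design, each of those calls would instead be a message to another process, adding serialisation and scheduling costs that dwarf the work itself; it is exactly that overhead the in-process model removes from the loop.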
This type of solution is used in high-frequency trading systems, ticker plants, order books, matching engines, and more. To handle time series data such as trades and quotes, some IMDSs have integrated support for common statistical functions, as well as columnar data layout.
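The columnar idea can be illustrated with a small sketch (names invented; a real columnar IMDS adds much more). Storing each field of a quote stream in its own contiguous array means a statistical function such as a moving average scans only the price column, touching far fewer cache lines than a row-wise layout would.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative "struct of arrays" (columnar) layout for time-series
   data. Each field lives in its own contiguous array, so a scan over
   prices reads only price bytes -- the cache-friendly property that
   columnar IMDSs exploit for statistical functions. */

#define N_TICKS 5

typedef struct {
    long   times[N_TICKS];    /* one column per field */
    double prices[N_TICKS];
    long   sizes[N_TICKS];
} QuoteColumns;

/* Simple moving average over the last `window` ticks ending at `end`
   (exclusive), computed by a sequential pass over the price column. */
double moving_avg(const QuoteColumns *q, size_t end, size_t window)
{
    double sum = 0.0;
    size_t start = (end >= window) ? end - window : 0;
    for (size_t i = start; i < end; i++)
        sum += q->prices[i];          /* contiguous, sequential reads */
    return sum / (double)(end - start);
}
```

With a row-wise layout, the same scan would drag every field of every record through the cache; for wide tick records and long histories, that difference is substantial.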
Other features to look for include support for ACID properties, transaction logging, database replication, and a C/C++ language application programming interface, which further shortens the execution path by eliminating the parsing and cost-based optimization steps of the popular SQL API.
A word of caution: some products are disk-based DBMSs recast as in-memory databases simply by being deployed in RAM. They are not optimised for in-memory execution, and their internal data flows differ markedly from those of a database designed for memory from the ground up. For example, a "real" IMDS will optimize CPU usage by not wasting cycles caching what is already in memory.
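The difference in data flow can be sketched as follows (a toy illustration, not any vendor's implementation; all names are invented). A disk-based engine treats memory as a cache and copies each record through a buffer on every read; an engine designed for memory simply returns a reference to the record where it already resides.

```c
#include <assert.h>
#include <string.h>

/* Toy contrast between the two read paths. A disk-based engine assumes
   the record may not be resident, so every read goes through a buffer
   cache and copies the record out. An engine designed for memory knows
   the record is already there and hands back a direct reference: no
   cache lookup, no copy. */

typedef struct { long id; double price; } Rec;

#define N 4
Rec store[N] = { {1, 9.5}, {2, 8.0}, {3, 7.25}, {4, 6.0} };

/* Disk-DBMS-style read: locate, then copy through an intermediate
   buffer, as if the record had just been paged in. */
int cached_read(long id, Rec *out)
{
    Rec buffer;                       /* stands in for a buffer-cache page */
    for (int i = 0; i < N; i++) {
        if (store[i].id == id) {
            memcpy(&buffer, &store[i], sizeof buffer);  /* extra copy */
            memcpy(out, &buffer, sizeof *out);
            return 0;
        }
    }
    return -1;
}

/* IMDS-style read: the record is already in memory, so return it
   directly -- no buffering layer on the path. */
const Rec *direct_read(long id)
{
    for (int i = 0; i < N; i++)
        if (store[i].id == id)
            return &store[i];
    return 0;
}
```

Both paths return the same data; the first simply burns cycles (and cache bandwidth) maintaining an intermediate copy that serves no purpose when the data never leaves memory.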
It might sound counterintuitive, but sheer force alone cannot overcome every challenge; it's not just server size or network speed that counts. Technologies from the small-footprint, usually hidden-from-view and highly demanding world of embedded systems are making it possible to better handle the big number-crunching challenges of capital markets.
Steve Graves is president and CEO of McObject.