Open Source, Next Generation Data Encoding

Today is an exciting day for technology in high performance electronic trading. By the time you read this, the CME Group, Real Logic Ltd., and Informatica will have announced a new open source initiative. I’ve been collaborating on this work for a few months and I think it is great technology. I hope you will agree.

Simple Binary Encoding (SBE) is an encoding for FIX that is being developed by the FIX protocol community as part of their High Performance Working Group. The goal is to produce a binary encoding suitable for low-latency financial trading. The CME Group, Real Logic, and Informatica have sponsored the development of an open source implementation of an early version of the SBE specification, undertaken by Martin Thompson (of Real Logic, formerly of LMAX) and myself, Todd Montgomery (of Informatica). The result is a very high performance encoding/decoding mechanism whose data layout is tailored to the demands of low-latency trading applications, but which also has implications for all manner of serialization and marshaling, in use cases from Big Data analytics to device data capture.

Financial institutions, and other businesses, need to serialize data structures for transmission over networks as well as for storage. SBE is a developing standard for how to encode and decode FIX data structures in a binary format at high speed and with low latency. The SBE project is most similar to Google Protocol Buffers. However, looks are quite deceiving: SBE is an order of magnitude faster and immensely more efficient for encoding and decoding. This focus on performance means application developers can turn their attention to the application logic instead of the details of serialization. There are a number of advantages to SBE beyond speed, although speed is the primary concern.

  • SBE provides a strong typing mechanism in the form of schemas for data objects
  • SBE incurs the overhead of versioning only if the schema needs to handle versioning and, if so, only on decode
  • SBE uses an Intermediate Representation (IR) for decoupling schema specification, optimization, and code generation
  • SBE’s use of IR will allow it to provide various data layout optimizations in the near future
  • SBE initially provides Java, C++98, and C# code generators with more on the way
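
To make the versioning point a little more concrete, here is a minimal Java sketch of what "only on decode" can look like. It is not the code SBE generates: the `TradeCodec` class, its field names, offsets, and the null sentinel are all hypothetical and chosen only for illustration. The encoder always writes the current layout with no version branch, while the decoder checks the version of the message it is reading before touching a field that was added later.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical two-field message, NOT taken from the SBE specification:
//   offset 0: int quantity (present since schema version 0)
//   offset 4: int venueId  (added in schema version 1)
final class TradeCodec {
    static final int QUANTITY_OFFSET = 0;
    static final int VENUE_ID_OFFSET = 4;
    static final int VENUE_ID_SINCE_VERSION = 1;
    static final int VENUE_ID_NULL = Integer.MIN_VALUE; // assumed "absent" sentinel
    static final int BLOCK_LENGTH = 8;

    // Encoding always writes the current layout: no version check is needed here.
    // Note: this sets the byte order on the caller's buffer.
    static void encode(ByteBuffer buffer, int offset, int quantity, int venueId) {
        buffer.order(ByteOrder.LITTLE_ENDIAN);
        buffer.putInt(offset + QUANTITY_OFFSET, quantity);
        buffer.putInt(offset + VENUE_ID_OFFSET, venueId);
    }

    // A field present in every version is read unconditionally.
    static int quantity(ByteBuffer buffer, int offset) {
        return buffer.order(ByteOrder.LITTLE_ENDIAN).getInt(offset + QUANTITY_OFFSET);
    }

    // A field added later pays for one comparison on decode: if the message was
    // written by an older schema, the field is absent and the buffer is not read.
    static int venueId(ByteBuffer buffer, int offset, int actingVersion) {
        if (actingVersion < VENUE_ID_SINCE_VERSION) {
            return VENUE_ID_NULL;
        }
        return buffer.order(ByteOrder.LITTLE_ENDIAN).getInt(offset + VENUE_ID_OFFSET);
    }

    public static void main(String[] args) {
        ByteBuffer current = ByteBuffer.allocate(BLOCK_LENGTH);
        encode(current, 0, 250, 17);
        System.out.println("v1 venueId = " + venueId(current, 0, 1));

        // A message from an older writer carries only the quantity field.
        ByteBuffer old = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
        old.putInt(0, 100);
        System.out.println("v0 venueId absent = " + (venueId(old, 0, 0) == VENUE_ID_NULL));
    }
}
```

In this sketch a schema that never adds fields would have no version comparison anywhere, which is the sense in which the cost is only paid when versioning is actually needed.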

What breakthrough has led to SBE being so fast?

It isn’t new or a breakthrough. SBE has been designed and implemented with the concepts and tenets of Mechanical Sympathy. Most software is developed with abstractions to mask away the details of CPU architecture, disk access, OS concepts, etc. Not so for SBE. Martin and I designed it using everything we know about how CPUs, memory, compilers, managed runtimes, etc. work, making it very fast and making it work _with_ the hardware instead of against it.
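
As a rough illustration of what working with the hardware can mean for an encoder, the following Java sketch reads and writes fields at fixed offsets directly in a buffer, with no intermediate objects and no copying. It is a simplified sketch under stated assumptions, not the SBE implementation: the `OrderFlyweight` class, the message layout, and the choice of little-endian byte order are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical fixed-layout message, NOT taken from the SBE specification:
//   offset  0: long orderId   (8 bytes)
//   offset  8: long price     (8 bytes, fixed-point mantissa)
//   offset 16: int  quantity  (4 bytes)
final class OrderFlyweight {
    static final int ORDER_ID_OFFSET = 0;
    static final int PRICE_OFFSET = 8;
    static final int QUANTITY_OFFSET = 16;
    static final int BLOCK_LENGTH = 20;

    private ByteBuffer buffer;
    private int offset;

    // Point the flyweight at a message inside an existing buffer. The same
    // instance can be re-wrapped for every message, so steady-state use
    // allocates nothing. Note: this sets the byte order on the caller's buffer.
    OrderFlyweight wrap(ByteBuffer buffer, int offset) {
        if (offset < 0 || buffer.limit() - offset < BLOCK_LENGTH) {
            throw new IndexOutOfBoundsException("message does not fit at offset " + offset);
        }
        this.buffer = buffer.order(ByteOrder.LITTLE_ENDIAN);
        this.offset = offset;
        return this;
    }

    // Each accessor is a single absolute read or write at a constant offset.
    OrderFlyweight orderId(long value) { buffer.putLong(offset + ORDER_ID_OFFSET, value); return this; }
    long orderId() { return buffer.getLong(offset + ORDER_ID_OFFSET); }

    OrderFlyweight price(long mantissa) { buffer.putLong(offset + PRICE_OFFSET, mantissa); return this; }
    long price() { return buffer.getLong(offset + PRICE_OFFSET); }

    OrderFlyweight quantity(int value) { buffer.putInt(offset + QUANTITY_OFFSET, value); return this; }
    int quantity() { return buffer.getInt(offset + QUANTITY_OFFSET); }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(64);
        OrderFlyweight order = new OrderFlyweight().wrap(buffer, 0);

        // Encode in field order so memory is written sequentially.
        order.orderId(42L).price(1005000L).quantity(7);

        System.out.println("orderId=" + order.orderId()
            + " price=" + order.price()
            + " quantity=" + order.quantity());
    }
}
```

The design choice being illustrated is that the layout is known ahead of time, so encoding and decoding reduce to a handful of sequential memory accesses that caches and prefetchers handle well.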

Martin’s blog will have a more detail-oriented, technical discussion of SBE sometime later. But I encourage you to look at it and try it out. The work is open to the public under the Apache License, Version 2.0.

Find out more about the FIX/SBE specification and SBE on GitHub.

———————————————–

Todd Montgomery

Todd L. Montgomery is a Vice President of Architecture for Informatica and the chief designer and implementer of the 29West low latency messaging products. The Ultra Messaging product family (formerly known as LBM) has over 190 production deployments within electronic trading across many asset classes and pioneered the broker-less messaging paradigm. In the past, Todd has held architecture positions at TIBCO and Talarian as well as lecture positions at West Virginia University, contributed to the IETF, and performed research for NASA in various software fields. With a deep background in messaging systems, high performance systems, reliable multicast, network security, congestion control, and software assurance, Todd brings a unique perspective tempered by over 20 years of practical development experience.
