Nanex Research

Nanex ~ 30-Sep-2013 ~ HFT Front Running, All The Time

The High Frequency Trading (HFT) firm Virtu published a response to our analysis, Einstein and The Great Fed Robbery, in which we showed that the Fed FOMC news had to exist in New York and Chicago before it was released at 2pm in Washington D.C. We responded with Shredding Virtu's Response with Science, which showed that Virtu's own data agreed with our findings. This is our second response, because we found something even more disturbing than the fact that some HFT were secretly trading on Fed news earlier than physically possible.

Virtu's paper exposed an egregious practice concerning market data that we and others in the industry have been telling the regulators, and anyone who would listen, about for years. This is the first time an HFT firm has come clean and revealed, with evidence, that subscribers to direct feeds (primarily HFT) regularly receive quote and trade data faster than other market participants. Don't regulations prohibit this? You bet they do, which is why we were shocked by Virtu's admission. We hope the regulator is paying attention.

If people are upset that HFT traded on Fed news a few milliseconds before everyone else, we are sure they will be outraged to learn that HFT receives market data milliseconds earlier, practically all the time.

Before we begin, there are a few acronyms you need to know. The SIP (Securities Information Processor) is also called the Consolidated Tape, Consolidated Data, Consolidated Feed, the Network Processor, CQS, CTA, and UQDF. To keep things simple, we'll use the term SIP throughout (except when quoting external sources). There are over 2.5 million subscribers paying the exchanges about $500 million a year for SIP data. They pay this valuable consideration with the expectation of receiving comprehensive, accurate, real-time prices for stocks. Unfortunately, as Virtu's paper makes clear, they aren't getting any of that.

Virtu's paper uses a question-and-answer format. Virtu's first question and answer:

What market data should be used for this analysis?

First, when analyzing time stamps of events with microsecond granularity, you have to examine the accuracy of the source of the data and the time stamps appended to that data. For our source of data, we look at the equities exchanges in New York (technically operating out of data centers in New Jersey). It is important to observe the market data that is released directly from the exchanges. These feeds are the most direct and accurate feeds to replay actual market activity. The exchanges also pool all of their best prices together in a separate market data feed known as the “SIP” which is used for broad, public dissemination and regulatory purposes. The SIP is a consolidator of the direct market data feeds and as a result of this consolidation process it is not reliable for discerning millisecond differences in trades. Unfortunately, it appears that the Nanex study relies entirely on the inaccurate SIP market data time stamps.

For exchanges and HFT, Virtu's answer is full of landmines. According to Virtu, the SIP:

is not the most accurate data feed.
is used for regulatory purposes.
is unreliable for determining when, or in what order trades actually occurred.
has inaccurate timestamps.

First, let's read what the SEC, the government agency that regulates companies such as Virtu, wrote about the SIP in Reg NMS (page 30):

When Congress mandated a national market system (“NMS”) for trading securities in 1975, it emphasized that consolidated data “would form the heart of the national market system.” The Commission since has emphasized the importance of the consolidated data feeds on many occasions, including in its January 2010 Market Structure Concept Release: “As a result, the public has ready access to a comprehensive, accurate, and reliable source of information for the prices and volume of any NMS stock at any time during the trading day. This information serves an essential linkage function by helping assure that the public is aware of the best displayed prices for a stock, no matter where they may arise in the national market system.”

So why doesn't everyone just use direct data feeds? Because of the cost.

A direct feed to just one exchange can cost over $10,000 a month - in exchange fees alone. You need a direct feed from each (there are 14 lit markets, and dozens of dark pools), plus at least one good network engineer on staff, plus tens of thousands of dollars in equipment. Add in another $10,000 or more in monthly telco fees. A great real-world example of the costs involved is the SEC's recent purchase of the MIDAS system - it costs them millions of dollars a year in data processing costs alone. This is a prime reason why Congress mandated a SIP in the first place, and why it forms the heart of Reg NMS. This is also why Reg NMS clearly states that exchanges may not provide data to direct feeds faster than to the SIP: otherwise everyone would have to spend a fortune just to get accurate prices.

Straight from Reg NMS (page 278):

Rule 603(a)(2) requires that any SRO, broker, or dealer that distributes market information must do so on terms that are not unreasonably discriminatory. These requirements prohibit, for example, a market from making its "core data" (i.e., data that it is required to provide to a Network processor) available to vendors on a more timely basis than it makes available the core data to a Network processor.

Another question and answer from Virtu's paper:

What time did equity trading actually begin after the Fed announcement?

Relying on our two sources of market data timestamps, our records show that the first trade immediately following the Fed release at 2 PM in SPY occurred at 2:00:00.000397 on the BATS Exchange (100 shares at $170.82). The timestamp embedded in the message by BATS was 2:00:00.000 (meaning it could have been between 0 and 999us after 2 PM). BATS can likely confirm the exact time. The Nanex study indicates that the first trade in SPY occurred at 2:00:00.001 on Nasdaq BX (referred to in the Nanex charts as “Bost” because it was formerly the Boston Stock Exchange). This Nanex data claims the first BATS trade in SPY occurred 2:00:00.011. The Nanex data is clearly wrong because it is reflecting timestamps after the consolidation process has been completed by the SIP. With respect to GLD, our records show that the first trade immediately following the Fed release at 2 PM occurred at 2:00:00.000331 on the BATS Exchange (200 shares at $126.83). The timestamp embedded in the message by BATS was 2:00:00.000 (due to rounding). This compares to the Nanex data which reflects the first trade in GLD occurring at 2:00:00.001 on Nasdaq and NYSE Arca.

First, let us be clear: we always use SIP data and timestamps for analysis whenever possible (on purpose). Using SIP data allows anyone to reproduce our findings, including the regulator; after all, regulations are based on the SIP. When Virtu writes "the Nanex data is clearly wrong" they are really saying that the SIP - the data feed mandated by Congress to be accurate, reliable, and at the heart of stock market regulations - is wrong. There is no way for anyone to verify that Virtu's data is accurate, while anyone can verify SIP data. How can the regulator possibly enforce or judge whether best routing practices are employed when the SIP (the public reference) is admitted to be wrong and inaccurate?

How does a regulator create a speed limit that is unenforceable?

Second, if the consolidated data has latencies (is delayed) relative to direct feeds, it will only mean that trading happened even earlier than we first reported. We prefer to err on the conservative side in our publications, but are pleased that Virtu points out the Fed leak was even earlier than we suspected.

How Delayed was the SIP?

Now, let's find out just how much faster exchanges sent data to direct feeds than to the SIP. Keep in mind that these trades were at the very beginning of an explosion of trading activity, so the exchange networks at that time were free and clear of message traffic and should have performed optimally. If we compare data just 10 milliseconds later, we would expect to find significantly larger delays to the SIP. We think the regulator should use MIDAS and determine just how delayed SIP data was during this very important trading time.

Times are in milliseconds after 2pm. We include the official SIP sequence number of each trade to make it easier for the regulators to find.

Symbol (Seq#)	Direct Feed	Virtu	SIP	SIP Delay
SPY (135958)	0.000	0.397	11.000	10.603
GLD (284272)	0.000	0.331	22.000	21.669

According to Virtu (an HFT firm), the SIP was delayed a whopping 22 milliseconds versus the direct feed! From decades of experience working with market data, and from SIP latencies published by the exchanges, this number is 22 times higher than expected. Clearly, exchanges are sending core information to direct feeds way ahead of the SIP.
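The delay figures in the table follow from simple subtraction. As a minimal sketch, assuming the times are exactly as listed (the names `ROWS`, `sip_delay_ms` and `truncate_to_ms` are hypothetical and chosen only for illustration), the Python snippet below recomputes the SIP Delay column as the SIP time minus the time Virtu recorded, and shows how a timestamp with only millisecond granularity reduces Virtu's microsecond readings to the 0.000 shown in the Direct Feed column:

```python
from decimal import Decimal, ROUND_FLOOR

# Times in milliseconds after 2:00:00 PM, copied from the table above.
# Each row: (symbol, SIP sequence number, Virtu time, SIP time)
ROWS = [
    ("SPY", 135958, Decimal("0.397"), Decimal("11.000")),
    ("GLD", 284272, Decimal("0.331"), Decimal("22.000")),
]


def sip_delay_ms(virtu_ms, sip_ms):
    """SIP timestamp minus the direct-feed time recorded by Virtu."""
    return sip_ms - virtu_ms


def truncate_to_ms(t_ms):
    """Drop sub-millisecond digits, as a millisecond-granularity stamp would."""
    return t_ms.to_integral_value(rounding=ROUND_FLOOR)


for symbol, seq, virtu_ms, sip_ms in ROWS:
    delay = sip_delay_ms(virtu_ms, sip_ms)
    embedded = truncate_to_ms(virtu_ms)
    print(f"{symbol} ({seq}): embedded={embedded} ms, SIP delay={delay} ms")
```

Decimal arithmetic is used instead of floats so the three-decimal values subtract exactly, which reproduces the 10.603 and 21.669 millisecond delays listed in the table.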

Required Regulator Response

This discovery requires a response from the regulator, the SEC. What are they likely to say about this? Let's review what the SEC wrote after fining the NYSE $5 million for giving market data to direct feeds faster than to the SIP:

"Improper early access to market data, even measured in milliseconds, can in today's markets be a real and substantial advantage that disproportionately disadvantages retail and long-term investors," said Robert Khuzami, Director of the SEC's Division of Enforcement. "That is why SEC rules mandate that exchanges give the public fair access to basic market data. Compliance with these rules is especially important given exchanges' for-profit business interests."

Another quote emphasizes trade-through price protection over the need for faster trading. To ensure trade-through price protection, the SIP must provide accurate, real-time data. Reg NMS (page 410):

The Reproposing Release touched on this issue in the specific context of assessing the effect of the Order Protection Rule on the interests of professional traders in conducting extremely short-term trading strategies that can depend on millisecond differences in order response time from markets. Noting that any protection against trade-throughs could interfere to some extent with such short-term trading strategies, the release framed the Commission's policy choice as follows: "Should the overall efficiency of the NMS defer to the needs of professional traders, many of whom rarely intend to hold a position overnight? Or should the NMS serve the needs of longer-term investors, both large and small, that will benefit substantially from intermarket price protection?" The Reproposing Release emphasized that the NMS must meet the needs of longer-term investors, noting that any other outcome would be contrary to the Exchange Act and its objectives of promoting fair and efficient markets that serve the public interest.

Conclusion

Virtu's paper has inadvertently made it clear that serious regulatory violations infest our stock market. These violations undermine not only the regulator's authority but also the perception of fairness among investors and anyone else who believes that the SIP is giving them accurate and timely stock prices. While Virtu's paper expresses an attitude towards the SIP that is common among high frequency traders and exchanges, that doesn't change the fact that there are clear regulations protecting those that rely on the SIP. We think if you don't like the rules, then work to change them. The fact that the SEC recently quoted the relevant sections of Reg NMS when fining the NYSE tells us the regulator is not only aware of the rules, but is willing to enforce them. After all, the SIP forms the heart of Reg NMS, and when the heart dies, so will the patient.

Nanex Research

Inquiries: pr@nanex.net
