HFT Myths

Thanks for the answer.
Could you please tell me what you think the difference is between taking and making strategies? For example, why do you, as you wrote, run mostly making strategies?
Do they require a different type of price prediction, or do taking strategies require better speed? Why can't you run taking strategies on the same price prediction and do the same volume of taking as making?
Thank you for your answers.
I like making markets because that's what I got started in and that's what I'm used to. There's inherent money in making markets because we are providing a service and people are willing to cross the bid/ask spread to pay up for that service, which I've learned to provide efficiently.

In pursuing taking strategies, speed is perhaps more important, but it's also easier to be faster because you're managing fewer orders, so it's not the main issue for me. The big difference, at least for me, is that you need stronger alpha signals that can make up for crossing the bid/ask spread, and that your risk profile is completely different. The income stream becomes much more volatile on a daily basis, but you avoid the fat tails of taking on unwanted inventory when market-making. I'm used to the latter and know how to alleviate and/or deal with it, but I'm not comfortable with the former.
 
So if TOTAL_LATENCY = EXCHANGE_TO_YOU + PROCESSING_TIME + YOU_TO_EXCHANGE, you only bother measuring PROCESSING_TIME (if I'm understanding correctly). I'm sure much of what you do is aimed at reducing the processing figure, but without constant monitoring of the other two (especially EXCHANGE_TO_YOU), how do you know if you're having network problems or the like? Perhaps a better question - how do you know that a signal is still valid and not stale? For example, say I trade a simple model where after Event1 on the exchange I have x micros to get my order on the book - after this point, positive expectancy is gone. If you don't know how long it's been since Event1 occurred, how can you determine whether sending the order makes sense?

I realize that you can get TOTAL_LATENCY figures by receiving a timestamped exchange event, sending an order in response, and then taking the difference in exchange timestamps for your order hitting the book and the original event. However, that is still after the fact and doesn't isolate each element of non-processing time.

EXCHANGE_TO_YOU <your recv time stamp - market data time stamp>
You could use this as a relative latency and if you see the number spike or deviate from the rolling median time you could have logic to not act on the quote, but personally I trigger an alert so you can troubleshoot. I would only do this if you have a VERY reliable feed handler and network connection.

PROCESSING_TIME <order send time stamp - recv time stamp>
This is probably the most important latency since you control everything in here. If your average processing time is 50 microseconds to see a quote, make a decision and send an order but after you made a code change your average went to 20 milliseconds, there is a problem.

YOU_TO_EXCHANGE <exchange/broker ack - order send time>
While you can control your network latency to the exchange, there is only so much you can do. Even if you have the budget, already have a 10G/40G cross connect in their colocation data center, and maybe paid extra for their fastest market data feed, you have almost no control over exchange matching engine latency. If you detect gateway latency at the exchange side due to your own high order flow, you could always order additional gateway sessions at a cost and split your load. YMMV

At a few exchanges you can calculate
> your order/trade to quote latency using their market data feeds
> matching engine to exchange market data publisher
> exchange market data publisher to you

You also have to put it into perspective that if you have 500 microsecond jitter on your exchange ack time but your total latency is over 50 milliseconds, you have bigger fish to fry.
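
The three measurements above, plus the rolling-median check on EXCHANGE_TO_YOU, could be sketched roughly as follows. This is only an illustration under stated assumptions: all timestamps are in microseconds, and the names `latency_components`, `SpikeDetector`, `threshold_us` and `min_samples` are made up for this example rather than taken from any feed handler or exchange API.

```python
from collections import deque
from statistics import median


def latency_components(md_ts, recv_ts, send_ts, ack_ts):
    """Split one quote -> order -> ack cycle into the three figures.

    md_ts is stamped by the exchange clock; the other three are local.
    EXCHANGE_TO_YOU therefore includes any clock offset, so treat it as
    a relative number, not an absolute one.
    """
    return {
        "exchange_to_you": recv_ts - md_ts,
        "processing": send_ts - recv_ts,
        # ack - send, as defined in the post; this is a round trip that
        # also includes gateway/broker processing time.
        "you_to_exchange": ack_ts - send_ts,
    }


class SpikeDetector:
    """Flag samples that sit far above the rolling median."""

    def __init__(self, window=1000, threshold_us=500, min_samples=20):
        self.samples = deque(maxlen=window)  # most recent `window` samples
        self.threshold_us = threshold_us     # hypothetical alert threshold
        self.min_samples = min_samples       # warm-up before flagging

    def update(self, value):
        """Return True if value exceeds the rolling median by threshold_us."""
        spike = False
        if len(self.samples) >= self.min_samples:
            spike = value - median(self.samples) > self.threshold_us
        self.samples.append(value)
        return spike
```

Whether a flagged sample should suppress the order or just raise an alert is the judgment call described above; the detector only reports the deviation.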
 
Yep. Prices, sizes at prices, trades, and time. It bewilders me how many ways you can amalgamate those individual variables into alphas.

well,
that's a big surprise to me. does it mean the behaviour on a short time scale (let's say less than 30 sec) is only governed
by the stock itself, and therefore the rest of the market has little or no impact on the short term behaviour?
So, would you say that orderbook data of a single stock is enough to develop a stable market making activity on this stock ?

because i thought that stocks were highly driven by index futures even on a tick by tick basis.
And therefore futures had a great importance in the model ... and extending the universe of assets impacting your stock,
you end up following a large universe of assets in order to quote a single stock ... am i that wrong ?


That varies a lot, both in terms of what's analyzed and its stability. Some settings change on every tick, others haven't changed in years. Most adjustments to models are made somewhere between the tick and daily timeframe, though some parameters admittedly get lost in the shuffle and don't change for years.

But, by stable over years, you mean the parameters are constant-ish over years or the way you calculate them is stable over years,
To be more precise, for example : you calculate a parameter every day using a rolling 2 months of data, but maybe a year ago you were using, let's say, 3 months.
in the end the way is stable, but the value isn't. again, am i wrong here ?


cheers
 
So if, TOTAL_LATENCY = EXCHANGE_TO_YOU + PROCESSING_TIME + YOU_TO_EXCHANGE, you only bother measuring PROCESSING_TIME (if I'm understanding correctly). [] ... ha.

In the end, all I can trade off are the prices I get when I get them. To estimate latency, you have to get the official tick-stream and compare to when you receive quotes in realtime. Just pick a good symbol, such as SPY or ES, and log incoming ticks for a few minutes.

Secondly, you need to determine your order latency to your broker. The minimum latency is 2*the ping time (best lower bound I've found). Just measure the time from sending an order to receiving confirmation back, and you'll know your broker's overhead relative to the raw transmission speed of your network connection.

You have to know these things, if you expect to run fast.
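
As a rough sketch of that order-latency measurement: the snippet below takes the 2*ping lower bound from the post at face value and reports whatever is left of the send-to-confirmation time as broker overhead. The function names and the microsecond units are assumptions made for illustration only.

```python
from statistics import median


def broker_overhead(order_sent_ts, ack_recv_ts, ping_time):
    """Send-to-confirmation time minus the 2*ping floor from the post.

    All arguments share one unit (microseconds here). Both timestamps
    come from the local clock, so no clock sync with the broker is needed.
    """
    round_trip = ack_recv_ts - order_sent_ts
    floor = 2 * ping_time  # lower bound as stated in the post
    return round_trip - floor


def median_overhead(samples, ping_time):
    """Median overhead over (order_sent_ts, ack_recv_ts) pairs."""
    return median(broker_overhead(s, a, ping_time) for s, a in samples)
```

Using the median rather than the mean keeps one slow confirmation from dominating the estimate.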
 
I'll break this up into 3 separate issues we're addressing:

1) Am I only concerned with processing time (internal latency) and disregarding external latency (me to exchange and exchange to me)?
No, I measure external latency as well. But instead of breaking it up into 2 separate components (meToExchange and exchangeToMe), I lump it into one measurement (basically OrderSent and OrderAckReceived), that's measured at the same source (me). This way, there are measures that I know I can take to reduce both internal latency (writing more efficient code) and external latency (optimizing interaction with the exchange).

2) How do I know/guess at when events happened in 'real' time at the exchange? Or in other words, how do I know the staleness of data I'm receiving and processing?
One close guess, which might be good enough for you, is to look at the lag between the time you receive a fill and the time it's reflected in the pricefeed. Add half a round-trip time to that and that gives you a pretty decent approximation of how long ago events happened that you're just seeing now. You're making a few assumptions (namely that the lag is consistent, and FillNotification-->PublicLastTrade receipt lag is indicative of the lags in other market data as well), but it's certainly a start.

3) How do I decide if a signal is good any more if I can't judge its true age?
Simply put, I take the time I receive the signal as the starting time for my signal analysis. So if I think I have a good signal that tells me to hit the market whenever I receive signal X, and I can effectively model whether my order will get filled, I analyze the PNL/predictivity of signal X that way, without caring about the actual time at which X occurred on the exchange.
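
The guess in point 2 can be written down as a small calculation. This is a minimal sketch with hypothetical names, assuming all four timestamps come from the same local clock and that the round-trip time is measured as OrderSent to OrderAckReceived, as in point 1.

```python
def estimated_data_age(fill_recv_ts, last_trade_recv_ts,
                       order_sent_ts, ack_recv_ts):
    """Approximate how old market data is by the time it is seen.

    feed_lag: delay between receiving your own fill notification and
              seeing that trade appear in the public price feed.
    half_rtt: half of the order round trip, standing in for the one-way
              trip from the exchange back to you.
    Assumes the lag is consistent and that the Fill --> LastTrade lag is
    representative of other market data, as noted in the post.
    """
    feed_lag = last_trade_recv_ts - fill_recv_ts
    half_rtt = (ack_recv_ts - order_sent_ts) / 2
    return feed_lag + half_rtt
```

For instance, a fill received at 1000 and seen in the feed at 1300, with a 400 round trip, gives an estimated age of 300 + 200 = 500 in the same units.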


In your example, if you know you have X micros after Event1 happened on the exchange where it's predictive, can't you change your 'happened on the exchange' to 'received the information'? If your basis is that the time from the event happening to market data receipt is inconsistent and you need to get a good model of that inconsistency, then I'd understand where you're coming from, and there are exchanges where that applies. If that's the case, I'd try to model the lags of the signals you're interested in by trying to simulate them yourself. For example, if your signal is based on LastTrades, look at the Fill-->LastTrade lag that I mentioned earlier. If your signal is based on some big size entering the book, look at the lag between the time you send an order and when it's reflected in the pricefeed. If you give me a specific example we can throw some ideas around; it could be a learning experience for me too.

Before addressing any points specifically, I should make it clear that we are operating in two wildly different worlds when it comes to infrastructure (I should have led with this in my original post, but honestly I didn't fully appreciate the significance of those differences until you (and baglunch) got into your answers). If I'm not mistaken, you are receiving raw exchange feeds and writing directly to the exchange with your orders. I, on the other hand, get my equities data from a 3rd party feed handler that consolidates the raw feeds for me...then I send orders through a broker (complete with credit check process). None of this prevents me from measuring my external latency as [OrderMsgInFeedReceived - OrderSent - InternalLatency] as you do, but I know that the latency in my feed can vary wildly.

The difference in the consistency of our feeds is where our realities really diverge. The assumption of consistent data lag seems to underlie your various latency measurements. That affords you the ability to not question the staleness of messages until your [OrderMsgInFeedReceived - OrderSent] measurements go out of spec and indicate that something might be wrong. With the std dev of my latency, I can't do that, and for every message I need to answer "how old is this Event1 message?". There are two ways of addressing this that I can think of - each with its own problems:

1. Get a better (more consistent) feed

-The obvious solution is to process the raw feeds myself and simply mimic what you're doing. The issue with that is I'm a one man operation who lacks the time, technical expertise, and likely budget for that. Plus, my modest aim here is to have ~10 millisecond external latency. So it probably makes sense to step down a rung and cross connect to someone's 3rd party feed handler (Exegy, ITG, Active, RealTick, etc.). In order to test if one of these solutions is consistent and gets me under 10ms, I need to set up the cross connect, implement the API, and then send orders and calculate [OrderMsgInFeedReceived - OrderSent] on a large sample. This is doable but will eat up a lot of time (and maybe even money depending on the number of feeds that would need to be tested before finding something suitable).

2. Attempt to use official time

-Given that my goal is a relatively modest 10 milliseconds of external latency, can I get my clock accurate enough to be useful? Are the exchange clocks accurate enough to serve my purposes? If each of our clocks differs from official time by only a millisecond or two, then that's fine for my purposes and I can continue with my [MyTime_MsgReceived - ExchangeTimestamp] measurements. The problem here is that I have no idea how far off the exchanges' clocks are (microseconds? milliseconds?) or what getting my clock to within a millisecond or two of official time would entail.
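
On the "how far off is my clock" part, one common way to estimate it is the four-timestamp exchange that NTP-style time sync uses. The sketch below shows only that arithmetic; it assumes a time server you can query, and the function name and units are made up for illustration. It says nothing about how accurate any exchange's own clock is.

```python
def clock_offset_and_delay(t0, t1, t2, t3):
    """Four-timestamp offset estimate (the calculation NTP is built on).

    t0: request sent     (local clock)
    t1: request received (server clock)
    t2: reply sent       (server clock)
    t3: reply received   (local clock)

    offset is how far the server clock is ahead of the local clock;
    delay is the network round trip excluding server processing time.
    The offset is exact only if the two network legs are symmetric.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay
```

The error in the offset is bounded by half the measured delay, so a server reachable within a millisecond or two is needed before millisecond-level [MyTime_MsgReceived - ExchangeTimestamp] figures mean much.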

Anyhow, apologies for turning this into my own little trading workshop but I needed to put my earlier comments into the proper perspective. As always, appreciate any input/advice.
 
EXCHANGE_TO_YOU <your recv time stamp - market data time stamp>
You could use this as a relative latency and if you see the number spike or deviate from the rolling median time you could have logic to not act on the quote, but personally I trigger an alert so you can troubleshoot. I would only do this if you have a VERY reliable feed handler and network connection.

PROCESSING_TIME <order send time stamp - recv time stamp>
This is probably the most important latency since you control everything in here. If your average processing time is 50 microseconds to see a quote, make a decision and send an order but after you made a code change your average went to 20 milliseconds, there is a problem.

YOU_TO_EXCHANGE <exchange/broker ack - order send time>
While you can control your network latency to the exchange, there is only so much you can do. Even if you have the budget, already have a 10G/40G cross connect in their colocation data center, and maybe paid extra for their fastest market data feed, you have almost no control over exchange matching engine latency. If you detect gateway latency at the exchange side due to your own high order flow, you could always order additional gateway sessions at a cost and split your load. YMMV

At a few exchanges you can calculate
> your order/trade to quote latency using their market data feeds
> matching engine to exchange market data publisher
> exchange market data publisher to you

You also have to put it into perspective that if you have 500 microsecond jitter on your exchange ack time but your total latency is over 50 milliseconds, you have bigger fish to fry.

Thanks for the input. As I replied above, the lack of consistency in my feed makes much of this difficult.
 
well,
that's a big surprise to me. does it mean the behaviour on a short time scale (let's say less than 30 sec) is only governed
by the stock itself, and therefore the rest of the market has little or no impact on the short term behaviour?
So, would you say that orderbook data of a single stock is enough to develop a stable market making activity on this stock ?

because i thought that stocks were highly driven by index futures even on a tick by tick basis.
And therefore futures had a great importance in the model ... and extending the universe of assets impacting your stock,
you end up following a large universe of assets in order to quote a single stock ... am i that wrong ?


But, by stable over years, you mean the parameters are constant-ish over years or the way you calculate them is stable over years,
To be more precise, for example : you calculate a parameter every day using a rolling 2 months of data, but maybe a year ago you were using, let's say, 3 months.
in the end the way is stable, but the value isn't. again, am i wrong here ?


cheers
Data from a single stock is enough to make a market in that stock. A stock trades at a dozen exchanges though, so that's already a lot of data to consume. Futures do drive stocks, but in a pure single symbol market-making strategy, the signal that I can use from the futures contract is dwarfed by the alphas from that stock's orderbook data across its traded exchanges. Remember that in market-making I usually gross negative and net positive from rebates, so it's probably a very different ballgame than what you're doing. Also I need to reiterate that equities is not my strong point; baglunch seems to have more experience in this arena than I do based on his previous comments.

About parameters' stability over years, that was meant as more of a dig on people forgetting to reanalyze their parameters. As you sort of alluded to, the parameters I'm talking about being that stable are something like EMA_WINDOW=1 minute. That EMA window should really be re-analyzed and adjusted at some regular interval but there are many strategies where that falls by the wayside and is kept constant for years. I think what you were getting at is whether the value is constant, which in most cases is not the case.
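
To make the distinction concrete, here is a minimal sketch of a time-decayed EMA in which the window is the fixed parameter and the output value moves with every tick. The class name `TimeEma` and the exponential-decay weighting are assumptions chosen for illustration; the post does not say how its EMA is computed.

```python
import math

EMA_WINDOW_S = 60.0  # the "EMA_WINDOW=1 minute" style of parameter


class TimeEma:
    """EMA over irregularly spaced ticks with a time-based window."""

    def __init__(self, window_s=EMA_WINDOW_S):
        self.window_s = window_s  # the parameter that should be re-analyzed
        self.value = None         # the output, which changes on every tick
        self.last_ts = None

    def update(self, ts, x):
        """Fold in observation x seen at time ts (seconds)."""
        if self.value is None:
            self.value = x
        else:
            # Weight on the new observation grows with the elapsed time.
            alpha = 1.0 - math.exp(-(ts - self.last_ts) / self.window_s)
            self.value += alpha * (x - self.value)
        self.last_ts = ts
        return self.value
```

Re-analyzing the parameter then means refitting `window_s` on recent data at some regular interval instead of leaving the constant untouched for years.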
 
HFT, do you arb SPY on pre-open?
No. If a quote is untradeable against, it is garbage. I rest orders in pre-open in many markets, but that is to establish queue position, not trade against what's showing.
 
Remember that in market-making I usually gross negative and net positive from rebates, so it's probably a very different ballgame than what you're doing. Also I need to reiterate that equities is not my strong point; baglunch seems to have more experience in this arena than I do based on his previous comments.

What about in the case of futures, where you're not getting rebates?
 