Pairs Trading in Practice

It is perhaps a little premature for a deep dive into the Gemini Pairs Trading strategy which trades on our Systematic Algotrading platform. At this stage all one can say for sure is that the strategy has made a pretty decent start - up around 17% from October 2018. The strategy does trade multiple times intraday, so the record in terms of completed trades - numbering over 580 - is appreciable (the web site gives a complete list of live trades). And despite the turmoil through the end of last year the Sharpe Ratio has ranged consistently around 2.5.
Gemini.png

One of the theoretical advantages of pairs trading is, of course, that the coupling of long and short positions in a relative value trade is supposed to provide a hedge against market downdrafts, such as we saw in Q4 2018. In that sense pairs trading is the quintessential hedge fund strategy, embodying the central concept on which the entire edifice of hedge fund strategies is premised.
In practice, however, things often don't work out as they should. In this thread I want to spend a little time reviewing why that is and to offer some thoughts based on my own experience of working with statistical arbitrage strategies over many years.

Methodology
There is no "secret recipe" for pairs trading: the standard methodologies are as well known as the strategy concept. But there are some important practical considerations that I would like to delve into in this post. Before doing that, let me quickly review the tried and tested approaches used by statistical arbitrageurs.

The Ratio Model is one of the standard pairs trading models described in the literature. It is based on the ratio of instrument prices, its moving average and its standard deviation. In other words, it is based on the Bollinger Bands indicator.

  • We trade a pair of stocks A and B, with price series A(t) and B(t)
  • We calculate the ratio time series R(t) = A(t) / B(t)
  • We apply a moving average of type T with period Pm to R(t) to get the time series M(t)
  • Next we apply the standard deviation with period Ps to R(t) to get the time series S(t)
  • Now we can create the Z-score series Z(t) = (R(t) - M(t)) / S(t), which can be used directly to signal trading decisions. (In reality we have two Z-scores, Z-score_ask and Z-score_bid, because they are calculated from different prices; for the sake of simplicity, let's pretend we don't pay the bid-ask spread and have just one Z-score.)
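As a rough illustration of the steps above, here is a minimal Python sketch. It assumes a simple moving average for the "type T" average and a sample standard deviation; the function name `ratio_zscore` and the default periods are hypothetical choices for this example, not part of the strategy described in the post.

```python
import numpy as np

def ratio_zscore(a, b, pm=20, ps=20):
    """Return R(t), M(t), S(t) and Z(t) for price series a and b.

    Assumptions (not from the original post): M(t) is a simple moving
    average over pm bars, S(t) is a sample standard deviation over ps bars.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    r = a / b                       # ratio series R(t) = A(t) / B(t)
    n = len(r)
    m = np.full(n, np.nan)          # moving average M(t)
    s = np.full(n, np.nan)          # moving standard deviation S(t)
    for t in range(n):
        if t + 1 >= pm:
            m[t] = r[t + 1 - pm : t + 1].mean()
        if t + 1 >= ps:
            s[t] = r[t + 1 - ps : t + 1].std(ddof=1)
    z = (r - m) / s                 # Z(t); NaN until both windows are full
    return r, m, s, z
```

The first bars are left as NaN until both windows are full, so that no signal is produced from an incomplete window.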
Another common way to visualize this approach is to think in terms of bands around the moving average M(t):

  • upper entry band Un(t) = M(t) + S(t) * En
  • lower entry band Ln(t) = M(t) - S(t) * En
  • upper exit band Ux(t) = M(t) + S(t) * Ex
  • lower exit band Lx(t) = M(t) - S(t) * Ex
These are the same bands as in the Bollinger Bands indicator, and we can use crossings of R(t) through the bands as trade signals.

  • We open a short pair position if the Z-score Z(t) >= En (equivalent to R(t) >= Un(t))
  • We open a long pair position if the Z-score Z(t) <= -En (equivalent to R(t) <= Ln(t))
PairsTrade.png
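The entry rules above can be sketched as a small state machine over the Z-score series. The post defines exit bands but does not spell out the exit rule, so the one used here (close a short when Z(t) falls to Ex or below, close a long when Z(t) rises to -Ex or above) is an assumption, as are the function name and the default thresholds.

```python
import math

def pair_positions(z, en=2.0, ex=0.5):
    """Map a Z-score series to pair positions: -1 short, 0 flat, +1 long.

    Entry follows the rules in the post. The exit rule is an assumption:
    a short pair is closed when z <= ex, a long pair when z >= -ex.
    """
    pos = 0
    out = []
    for zt in z:
        if math.isnan(zt):
            out.append(pos)         # no signal while the Z-score is undefined
            continue
        if pos == 0:
            if zt >= en:
                pos = -1            # ratio is rich: short the pair
            elif zt <= -en:
                pos = 1             # ratio is cheap: long the pair
        elif pos == -1 and zt <= ex:
            pos = 0                 # short pair has reverted: close
        elif pos == 1 and zt >= -ex:
            pos = 0                 # long pair has reverted: close
        out.append(pos)
    return out
```

Keeping En above Ex gives the rule some hysteresis, so a position is not opened and closed on the same small move.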


In the Regression, Residual or Cointegration approach we fit a linear regression between A(t) and B(t) using OLS, where A(t) = β * B(t) + α + R(t).

Because we use a moving window of period P (we calculate a new regression each day), we actually get new series β(t), α(t), R(t), where β(t) and α(t) are the series of regression coefficients and R(t) are the residuals (prediction errors).

  • We look at the residuals series R(t) = A(t) - (β(t) * B(t) + α(t))
  • We next calculate the standard deviation of the residuals R(t), which we designate S(t)
  • Now we can create Z-score series Z(t) as Z(t) = R(t) / S(t) - the time series that is used to generate trade signals, just as in the Ratio model.
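A minimal sketch of the rolling-regression version follows. The post does not say exactly how S(t) is computed, so this example assumes it is the standard deviation of the in-window residuals of the current regression; the function name and window length are likewise hypothetical.

```python
import numpy as np

def rolling_ols_zscore(a, b, p=60):
    """Rolling OLS of A on B over a window of p bars.

    Returns beta(t), alpha(t), the latest residual R(t) and Z(t) = R(t) / S(t).
    Assumption: S(t) is the standard deviation of the residuals inside the
    current window (two degrees of freedom used by the fit).
    """
    if p < 3:
        raise ValueError("window must contain at least 3 observations")
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)
    beta = np.full(n, np.nan)
    alpha = np.full(n, np.nan)
    resid = np.full(n, np.nan)
    z = np.full(n, np.nan)
    for t in range(p - 1, n):
        wa = a[t - p + 1 : t + 1]
        wb = b[t - p + 1 : t + 1]
        slope, intercept = np.polyfit(wb, wa, 1)   # A = slope * B + intercept
        window_resid = wa - (slope * wb + intercept)
        s = window_resid.std(ddof=2)
        beta[t], alpha[t] = slope, intercept
        resid[t] = window_resid[-1]                # residual of the latest bar
        z[t] = resid[t] / s if s > 0 else np.nan
    return beta, alpha, resid, z
```

The resulting Z-score series can be passed to the same entry and exit logic as in the Ratio model.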
The Kalman Filter model provides superior estimates of the current hedge ratio compared to the Regression method. For a detailed explanation of the techniques, see the following posts (the second one contains complete Matlab code).

Kalman1.png

https://bit.ly/2IiwQLT

ETF Pairs.png

https://bit.ly/2Na2eLu
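For readers who want a feel for the idea before following the links, here is a generic textbook-style sketch in Python. It is not the code from the linked posts. It assumes the hedge ratio β(t) and intercept α(t) follow a random walk and that A(t) = β(t) * B(t) + α(t) + noise; the parameters `delta` and `obs_var` are hypothetical tuning values.

```python
import numpy as np

def kalman_hedge_ratio(a, b, delta=1e-4, obs_var=1e-3):
    """Kalman filter estimate of a time-varying hedge ratio.

    State: theta = [beta, alpha], assumed to follow a random walk.
    Observation: a[t] = beta * b[t] + alpha + noise with variance obs_var.
    Returns beta(t), alpha(t), the one-step forecast error e(t) and its
    standard deviation, so e(t) / std(t) can serve as a Z-score.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)
    theta = np.zeros(2)                       # initial guess for [beta, alpha]
    P = np.eye(2)                             # initial state covariance
    Q = delta / (1.0 - delta) * np.eye(2)     # random-walk (process) noise
    betas = np.empty(n)
    alphas = np.empty(n)
    errors = np.empty(n)
    err_std = np.empty(n)
    for t in range(n):
        H = np.array([b[t], 1.0])             # observation vector
        P = P + Q                             # predict step
        e = a[t] - H @ theta                  # forecast error (innovation)
        S = H @ P @ H + obs_var               # innovation variance
        K = P @ H / S                         # Kalman gain
        theta = theta + K * e                 # update state estimate
        P = P - np.outer(K, H) @ P            # update state covariance
        betas[t], alphas[t] = theta
        errors[t], err_std[t] = e, np.sqrt(S)
    return betas, alphas, errors, err_std
```

Unlike the rolling regression, the estimate is updated one bar at a time and needs no window length, but the results depend heavily on the two noise parameters.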

Finally, the rather complex Copula methodology models the joint and marginal distributions of the returns process in each stock, as described in the following post:

Copulas.png

https://bit.ly/2DLBahk
 
I know this strategy well, and the idea is good. However, it is hard to make a profit with it because of excess co-movement among stocks, commodities and forex. The strategy is built on the concept that prices follow a random walk (so that prices are efficient and will converge back to their fundamental values). That is not the case in the real world, where prices are mostly driven by liquidity premia. I am afraid you may see a margin call before you even realize it.
 
Where can I read up more on applying Kalman Filter to estimation and prediction for options and correlation trading? I just completed an introductory online class on MATLAB programming and am looking for some way to try it out.

Thanks in advance.
 
Ideally something less technical and easier for a non-techie to understand.
 
You're probably going to have to put this together yourself. Start with the Wiki:

https://en.wikipedia.org/wiki/Kalman_filter

"...an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe."
So it's an algorithm that can deal with noise and more accurately estimate a multi-variable (joint) probability distribution than other algorithms.

You can apply this algorithm to price series to try and smooth out the noise. Theoretically then, once enough input data has been collected, you can get a more accurate prediction for the next time step. Practically though, based on my 30 seconds of investigation, I doubt it works at that granular of a level, only over the long run.
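To make the smoothing idea concrete, here is a minimal one-dimensional sketch in Python. It assumes the simplest possible model (a slowly drifting "true" level observed with noise); the function name and the noise variances `q` and `r` are hypothetical values chosen for illustration.

```python
def kalman_smooth_level(measurements, q=1e-5, r=1.0):
    """Scalar Kalman filter for a slowly drifting level observed with noise.

    q is the assumed variance of the level's drift per step, r the assumed
    variance of the measurement noise. Returns the filtered estimates.
    """
    x = measurements[0]        # start from the first observation
    p = r                      # with the uncertainty of one measurement
    estimates = [x]
    for y in measurements[1:]:
        p = p + q              # predict: level may have drifted a little
        k = p / (p + r)        # gain: how much to trust the new measurement
        x = x + k * (y - x)    # update the estimate toward the measurement
        p = (1.0 - k) * p      # uncertainty shrinks after the update
        estimates.append(x)
    return estimates
```

With a small `q` the filter behaves much like a running average; a larger `q` makes it track the latest measurements more closely.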

I have no idea whether any of this is true, but it's helpful to be able to quickly learn technical concepts. The quants (hi @sle) will know intimate details but you don't need to know this kind of thing to evaluate the usefulness of a concept.
 
Do the returns quoted take costs into account? If so, what commissions are assumed?
 
Returns are net of commissions at $0.005 per share. In our account we pay less than this ($0.001 per share), but we average across several live accounts in which the strategy currently operates.
 
Is this day trading or position trading? What is the average hold time?
 
Hi Robert,

So this version of the strategy trades intraday. Hold times vary from minutes to days, depending on the pair, so there is overnight risk. But margin utilization is very efficient due to the dollar neutrality of the portfolio.

For purely intraday trading our focus is more on HFT and market making.
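As a small illustration of what dollar neutrality means for a single pair, the sketch below sizes both legs to the same notional. The function name and the whole-share rounding are assumptions made for this example; the $0.005 per-share commission is the figure quoted earlier in the thread.

```python
def dollar_neutral_shares(price_a, price_b, notional_per_leg,
                          commission_per_share=0.005):
    """Size a long/short pair so both legs have roughly equal dollar value.

    Shares are rounded down to whole shares (an assumption for this
    example), so a small residual net exposure can remain.
    """
    shares_a = int(notional_per_leg // price_a)
    shares_b = int(notional_per_leg // price_b)
    net_exposure = shares_a * price_a - shares_b * price_b
    commission = commission_per_share * (shares_a + shares_b)  # one-way cost
    return shares_a, shares_b, net_exposure, commission
```

For example, with a $10,000 notional per leg, a $50 stock against a $25 stock gives 200 shares against 400 shares.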
 