Real-world Pairs Trading?
Hi all,
Happy holidays!
I have a question about pairs trading in the real world: how is it actually done in practice?
My understanding of the procedure:
1. Test stock A using ADF to decide whether A is I(0) or I(1).
2. Test stock B using ADF to decide whether B is I(0) or I(1).
3. Use the Johansen test or the Engle-Granger two-step method to decide whether A and B are cointegrated.
4. Trade the pair if the tests indicate that they are cointegrated.
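To make steps 3 and 4 concrete, here is a minimal sketch of the two-step (Engle-Granger style) route on simulated data. Several things in it are assumptions for illustration only: the simulated prices, the plain Dickey-Fuller regression without lag augmentation (a real ADF test would add lagged differences), and the critical value of about -3.34, which is the approximate asymptotic 5% value for two series with a constant in the first-stage regression.

```python
import math
import random


def ols_slope_intercept(y, x):
    """OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def df_tstat_no_const(e):
    """Dickey-Fuller t-statistic from  d e_t = rho * e_{t-1} + u_t  (no lags)."""
    lagged = e[:-1]
    diff = [e[t] - e[t - 1] for t in range(1, len(e))]
    sxx = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diff)) / sxx
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lagged, diff))
    se = math.sqrt(ssr / (len(diff) - 1) / sxx)
    return rho / se


def engle_granger(a, b, crit=-3.34):
    """Step 1: regress A on B. Step 2: unit-root test on the residual spread.

    crit is an assumed, approximate 5% critical value (two series, constant).
    """
    alpha, beta = ols_slope_intercept(a, b)
    spread = [ai - alpha - beta * bi for ai, bi in zip(a, b)]
    t = df_tstat_no_const(spread)
    return {"beta": beta, "t_stat": t, "cointegrated": t < crit}


# Simulated example (not market data): B is a random walk,
# A = 2 * B + stationary AR(1) noise, so the pair is cointegrated by design.
rng = random.Random(0)
b, noise = [100.0], [0.0]
for _ in range(1999):
    b.append(b[-1] + rng.gauss(0, 1))
    noise.append(0.5 * noise[-1] + rng.gauss(0, 1))
a = [2.0 * bi + ni for bi, ni in zip(b, noise)]

result = engle_granger(a, b)
print(result)
```

The residual-based test uses its own critical values rather than the ordinary ADF ones, because the spread is built from an estimated beta.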
My questions are:
1. Are steps 1 and 2 necessary?
In the real world, most price series are either I(0) or I(1). If two stock price series are both I(0), we can still do pairs trading on them, no?
Do we really need both series to be I(1) to do pairs trading?
2. In steps 1 and 2, there are different versions of the ADF test: for example, in R, the function ur.df has "none", "const", "trend", "both" options.
How can we select the best ADF option programmatically and automatically, both cross-sectionally and along the time axis?
For example, if we have a large universe of stocks and need to run rolling-window ADF tests, choosing the ADF option "automatically" and "programmatically" becomes tricky...
3. In steps 1 and 2, does the choice among "none", "const", "trend" and "both" matter?
For example, if setting "trend" leads to the conclusion that the series is stationary while setting "const" leads to the conclusion that it is non-stationary, should the series be declared "stationary" or "non-stationary" for trading purposes?
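The situation in question 3 is easy to reproduce. Below is a minimal sketch, assuming a simulated trend-stationary series and a plain Dickey-Fuller regression with no lagged differences. The labels "none", "const", and "trend" follow the wording above (as far as I know, urca's ur.df names them type = "none", "drift", and "trend"); "both" is not sketched because its meaning is not specified here. The critical values are the approximate asymptotic 5% Dickey-Fuller values and are an assumption of this sketch.

```python
import math
import random

# Approximate asymptotic 5% Dickey-Fuller critical values (assumed here).
CRIT = {"none": -1.95, "const": -2.86, "trend": -3.41}


def solve(m, rhs):
    """Solve m x = rhs by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    aug = [row[:] + [r] for row, r in zip(m, rhs)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            f = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * k
    for r in reversed(range(k)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, k))
        x[r] = (aug[r][k] - tail) / aug[r][r]
    return x


def df_tstat(y, spec):
    """t-statistic on y_{t-1} in  dy_t = [deterministics] + rho * y_{t-1} + u_t."""
    n = len(y) - 1
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    rows = []
    for t in range(n):
        row = [y[t]]                    # lagged level, always column 0
        if spec in ("const", "trend"):
            row.append(1.0)             # intercept
        if spec == "trend":
            row.append((t + 1) / n)     # linear time trend, scaled to (0, 1]
        rows.append(row)
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * d for r, d in zip(rows, dy)) for i in range(k)]
    coef = solve(xtx, xty)
    ssr = sum((d - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, d in zip(rows, dy))
    # First diagonal element of (X'X)^-1 scales the variance of rho.
    inv00 = solve(xtx, [1.0] + [0.0] * (k - 1))[0]
    return coef[0] / math.sqrt(ssr / (n - k) * inv00)


# Simulated trend-stationary series: linear trend plus AR(1) noise.
rng = random.Random(1)
noise, y = 0.0, []
for t in range(500):
    noise = 0.5 * noise + rng.gauss(0, 1)
    y.append(0.2 * t + noise)

t_stats = {spec: df_tstat(y, spec) for spec in CRIT}
verdicts = {spec: t_stats[spec] < CRIT[spec] for spec in CRIT}  # True = reject unit root
print(t_stats)
print(verdicts)
```

On a series like this, the specification with a trend term should reject a unit root while the constant-only one should not, which is exactly the conflict described above: the verdict depends on which deterministic terms are assumed to be present.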
4. I read that cointegration is more or less a long-term concept. How "long" is long term here?
Suppose I am running these tests on 15-minute bars: how many data points should I use in my rolling-window tests along the time axis? I am thinking of 500 data points, but maybe that's too many; markets change and we have to be somewhat adaptive... Any thoughts on this?
5. Johansen has the advantage of being symmetric in A and B. But Johansen is not stable at all if we look at the rolling-window cointegrating vectors (the eigenvectors).
It seems that to get the hedge ratio one still needs to use linear regression, since it is more stable.
But then should you regress A on B or B on A? The two give different results in my experiments...
Or does it not matter?
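The asymmetry in question 5 can be shown with a few lines. This is a minimal sketch on simulated prices (an assumption, not market data). It compares the slope from regressing A on B, the inverted slope from regressing B on A, and an orthogonal (total least squares) slope, which is one symmetric alternative and is included only as an illustration.

```python
import math
import random


def moments(x, y):
    """Centered second moments (s_xx, s_yy, s_xy)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    syy = sum((v - my) ** 2 for v in y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    return sxx, syy, sxy


def hedge_ratios(a, b):
    """Units of B per unit of A implied by three different fits."""
    sbb, saa, sab = moments(b, a)
    a_on_b = sab / sbb                  # slope from regressing A on B
    b_on_a = sab / saa                  # slope from regressing B on A
    # Orthogonal (total least squares) slope of A against B.
    # Swapping A and B gives exactly the reciprocal, so it is symmetric.
    tls = ((saa - sbb) + math.sqrt((saa - sbb) ** 2 + 4 * sab ** 2)) / (2 * sab)
    return {"a_on_b": a_on_b, "inv_b_on_a": 1.0 / b_on_a, "tls": tls}


# Simulated prices: B is a random walk, A = 2 * B + noise.
rng = random.Random(2)
b = [50.0]
for _ in range(499):
    b.append(b[-1] + rng.gauss(0, 1))
a = [2.0 * bi + rng.gauss(0, 3) for bi in b]

ratios = hedge_ratios(a, b)
print(ratios)
```

The product of the two OLS slopes equals the squared correlation, so the A-on-B slope and the inverted B-on-A slope agree only when the correlation is perfect. For positively related series the orthogonal slope lies between the two.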
6. Some of the literature also mentions using returns for all of this.
My understanding is that returns are used to find the hedge ratios approximately.
Ultimately we are still trading prices; those are the tradables. We are not trading the returns.
So after we run all the tests and obtain hedge ratios using returns or other series, we still come back to prices to form the pair and trade the price levels...
The hedge ratio obtained from regressing the returns of A on the returns of B is an approximation to the hedge ratio obtained from regressing the prices of A on the prices of B.
When the prices of A and B are I(1), regressing prices on prices leads to a spurious regression, but the estimate of beta (the hedge ratio) itself shouldn't be a problem.
It's the inference that is messed up.
Am I understanding this correctly?
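To check the claim in question 6 on a toy case, here is a minimal sketch that estimates the hedge ratio three ways on simulated prices: from price levels, from price differences, and from percentage returns. The data-generating process is an assumption (the noise is independent of B's moves, which favours agreement between the estimates), so real data may well show larger gaps. One detail worth noting: a slope estimated on percentage returns is a ratio of returns, not of shares, so the sketch rescales it by the latest price ratio before comparing.

```python
import random


def ols_slope(y, x):
    """Slope from an OLS regression of y on x with an intercept."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sum((xi - mx) ** 2 for xi in x)


def diffs(p):
    """First differences of a price series."""
    return [p[t] - p[t - 1] for t in range(1, len(p))]


def pct_returns(p):
    """Simple percentage returns of a price series."""
    return [p[t] / p[t - 1] - 1.0 for t in range(1, len(p))]


# Simulated prices: B is a random walk, A = 2 * B + stationary AR(1) noise.
rng = random.Random(3)
b, noise = [100.0], [0.0]
for _ in range(999):
    b.append(b[-1] + rng.gauss(0, 0.5))
    noise.append(0.5 * noise[-1] + rng.gauss(0, 1))
a = [2.0 * bi + ni for bi, ni in zip(b, noise)]

beta_levels = ols_slope(a, b)                      # prices on prices
beta_diffs = ols_slope(diffs(a), diffs(b))         # price changes on price changes
beta_returns = ols_slope(pct_returns(a), pct_returns(b))
# Convert the return-based slope into shares of B per share of A.
beta_returns_in_shares = beta_returns * a[-1] / b[-1]

print(beta_levels, beta_diffs, beta_returns, beta_returns_in_shares)
```

In this setup all three routes should land near the true ratio of 2 once the return-based slope is rescaled, with the levels estimate the tightest and the difference-based ones noisier.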
Thanks a lot!
[Cross-posted on Wilmott et al.]