Fully automated futures trading

My heart says equal weights. My head says the new method is correct. My gut, which is very pragmatic, says it won't matter very much either way. What are the two assets, out of curiosity?

GAT

They are not assets per se, but rule variations. If I recall correctly, they are ewmac 2x8, normalised momentum 2x8 and breakout 10.

BTW I have added a "PS" to the previous post about the method's sensitivity to the clustering algorithm.
 
This is what I am looking at. If I run a clustering algorithm on pooled returns for rules, I get the following 9 clusters (3x3, I named the top 3):

-- fast rule cluster --
1 momentum2
1 normmom2
1 breakout10

2 momentum4
2 normmom4

3 breakout20

-- medium speed rule cluster ---
1 momentum8
1 normmom8
1 breakout40

2 momentum16
2 normmom16

3 breakout80

-- slow rule cluster --
1 momentum32
1 normmom32
1 breakout160

2 normmom64
2 momentum64

3 breakout320


But if I did this manually, I'd choose the following clustering - the 3 top clusters would be by rule type and each rule cluster would have 2 clusters, 1 for faster and 1 for slower variations:
--- momentum rule ---
1 momentum2
1 momentum4
1 momentum8

2 momentum16
2 momentum32
2 momentum64

--- normalised momentum rule ---
1 normmom2
1 normmom4
1 normmom8

2 normmom16
2 normmom32
2 normmom64

--- breakout rule ---
1 breakout10
1 breakout20
1 breakout40

2 breakout80
2 breakout160
2 breakout320


If I ran correlation algorithms on both clustering styles, I'd get very different weights.
Or am I overthinking this?
 

So, the data is telling you that faster momentum rules have more in common with each other than they do with their variations. Is the right response to ignore this, or to listen to it? (I don't actually know if there is a right response, but it's interesting information nonetheless.)

Would you really get very different weights? IDM does some of the heavy lifting in balancing out different clustering results. More to the point, even if the weights really were very different, I'd bet serious money that the backtested SR came out pretty similar.
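To make the IDM remark concrete, here is a minimal sketch, assuming the usual definition of the diversification multiplier as 1 / sqrt(w'Cw) for weights w and correlation matrix C. The function name and the inputs are illustrative only (a flat 0.9 correlation matrix and two candidate weight vectors), and this is not the code used in this thread.

```python
import math


def diversification_multiplier(weights, corr):
    """Hypothetical helper: 1 / sqrt(w' C w) for weights w and correlations C."""
    n = len(weights)
    portfolio_variance = sum(
        weights[i] * weights[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return 1.0 / math.sqrt(portfolio_variance)


# Assumed inputs: three assets with all pairwise correlations equal to 0.9
corr = [
    [1.0, 0.9, 0.9],
    [0.9, 1.0, 0.9],
    [0.9, 0.9, 1.0],
]

equal_weights = [1 / 3, 1 / 3, 1 / 3]
tilted_weights = [0.25, 0.25, 0.5]

idm_equal = diversification_multiplier(equal_weights, corr)
idm_tilted = diversification_multiplier(tilted_weights, corr)

print(f"IDM with equal weights:  {idm_equal:.4f}")
print(f"IDM with tilted weights: {idm_tilted:.4f}")
```

With correlations this high, the two weight vectors should produce multipliers that are close to each other, which is one way of seeing why different weightings may end up with similar risk-adjusted results.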

I'm not saying you're overthinking, as I think it's important to understand what's going on under the hood and you probably (like me!) find this intellectually interesting. Just to warn that it's unlikely anyone will get any serious alpha from tinkering with optimisation. Get a set of reasonably diversified rules. Get a set of diversified instruments (most important of all). Get weights that are roughly correct. And that is as good as it gets.

GAT
 

Yes, it's 90% intellectual curiosity and 10% need to construct a portfolio I could use in live trading.

I'll do weightings of both clustering hierarchies and compare the weights. Perhaps you are right and there won't be much difference. I will use the new method with the preferred parameters of 4 x stdev of correlation estimation, 500 data points (~10 years of weekly data) and min weight adjustment.

I also have an additional idea I wish to test. The new method is not limited to a 3-asset problem. I'm thinking of expanding it to 6 assets and running the weighting for clusters of all 6 variations per rule style. It does increase computing time, but not to astronomical levels. That would be interesting to compare against the above clustering schemes.

P.S. I do agree that it will very likely have no discernible difference in the actual long term backtest results.
 
Hello Robert,

I've been looking at one particular example of correlations between 3 assets:
0.9976, 0.9417, 0.9333 (AB, AC, BC)

A and B are almost 100% correlated. It makes sense that the largest weight should go to C. Additionally, the correlations are quite high and based on a lot of data, so the uncertainty is relatively low.

According to the original handcrafting method, the correlations are rounded to [0.9 0.9 0.9] and the optimal weights are [0.333 0.333 0.333].
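As a cross-check on why equal correlations lead to equal weights, here is a minimal sketch that computes unconstrained minimum-variance weights for three assets, assuming equal volatilities. This is plain textbook optimisation (weights proportional to the row sums of the inverse correlation matrix), not the handcrafting method itself, and the function name is made up for illustration.

```python
def min_variance_weights_3(ab, ac, bc):
    """Hypothetical helper: minimum-variance weights for 3 equal-vol assets.

    Weights are proportional to C^-1 * 1. For a 3x3 matrix the inverse is
    the adjugate divided by the determinant, and the determinant cancels
    when the weights are normalised to sum to one.
    """
    # Cofactors of the symmetric correlation matrix
    # [[1, ab, ac], [ab, 1, bc], [ac, bc, 1]]
    c11 = 1.0 - bc * bc
    c22 = 1.0 - ac * ac
    c33 = 1.0 - ab * ab
    c12 = ac * bc - ab
    c13 = ab * bc - ac
    c23 = ab * ac - bc

    row_sums = [c11 + c12 + c13, c12 + c22 + c23, c13 + c23 + c33]
    total = sum(row_sums)
    return [r / total for r in row_sums]


# Rounded correlations, as in the original handcrafting method
rounded = min_variance_weights_3(0.9, 0.9, 0.9)
print("rounded correlations:", rounded)

# Raw correlations from this post (AB, AC, BC), with no rounding or shrinkage
raw = min_variance_weights_3(0.9976, 0.9417, 0.9333)
print("raw correlations:    ", raw)
```

With near-identical assets the unconstrained solution on the raw correlations can be extreme, including negative weights, which is the usual motivation for rounding or for accounting for estimation uncertainty.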

However with the new method we get the following (500 data points ~10 years):

>>> apply_min_weight(optimised_weights_given_correlation_uncertainty(three_asset_corr_matrix(labelledCorrelations(0.9976, 0.9417, 0.9333)), 500))
array([0.21472084, 0.28415071, 0.50112846])


This returns weights [0.21 0.28 0.50] that are quite different from the original equal-weights result.

Which one would you prefer and why?

P.S.
If we cluster A, B and C hierarchically into two groups, we'd get one group [A B] and the other [C]. Then we'd get the weights [0.25 0.25 0.5]. This shows how sensitive the method is to the clustering outcome.
(I have encountered many more and better examples in my research, where a slight difference in clustering causes not-so-slight differences in weights.)
Alternatively you could say "why bother?" if the correlation between A, B and C is so high. Just choosing one of them could be a simple strategy.
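The two-group example in the P.S. can be sketched as follows. This is a minimal illustration, assuming complete-linkage agglomerative clustering on distance = 1 - correlation and equal weighting across and within clusters. The function names are hypothetical, and the sketch stops at a target number of clusters rather than at the maximum cluster size of 3 used elsewhere in this thread.

```python
def complete_linkage_clusters(labels, corr, n_clusters):
    """Merge the closest clusters until only n_clusters remain."""
    clusters = [[name] for name in labels]

    def distance(x, y):
        # Correlations are stored once per pair, in either order
        key = (x, y) if (x, y) in corr else (y, x)
        return 1.0 - corr[key]

    def linkage(c1, c2):
        # Complete linkage: distance between the two furthest members
        return max(distance(x, y) for x in c1 for y in c2)

    while len(clusters) > n_clusters:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


def equal_weights_by_cluster(clusters):
    """Split weight equally across clusters, then equally within each one."""
    weights = {}
    for cluster in clusters:
        for name in cluster:
            weights[name] = 1.0 / len(clusters) / len(cluster)
    return weights


corr = {("A", "B"): 0.9976, ("A", "C"): 0.9417, ("B", "C"): 0.9333}
clusters = complete_linkage_clusters(["A", "B", "C"], corr, n_clusters=2)
weights = equal_weights_by_cluster(clusters)
print(clusters)
print(weights)
```

Because the weights depend only on cluster membership here, moving a single asset to a different cluster changes its weight in one jump, which is one source of the sensitivity described above.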
 

Would some sort of dimension reduction like PCA be useful to get further insights? (Or did you do that for the clustering part?)
 
Just did the first comparison: calculating weights (with diversification multipliers applied) for clustering generated by the algorithm (hierarchical complete-linkage clustering with max cluster size = 3) VS manual clustering using human judgement and warm feelings towards neat and clean categories:

clustering_weights_comparison.png


It appears that the final weighting is very sensitive to the clustering output.

In the next post I will check whether there is any significant difference in SR between the two outputs.
 
Do you mean applying PCA to the correlation matrix?
I did not do it. The clustering uses a hierarchical complete-linkage algorithm with a max cluster size of 3 (to accommodate the handcrafting "candidates").
 
Auto VS Manual clustering weightings turned out quite different. However when checking basic stats, I'm getting:

auto clustering:
mean = 5.6671
stdev = 168.80
mean / stdev = 0.033573


manual clustering:
mean = 5.6206
stdev = 172.09
mean / stdev = 0.032661


equal weights:
mean = 4.3331
stdev = 135.27
mean / stdev = 0.032034


Based on 47793 data points - weekly returns for 30 years for 37 futures markets. No trading costs or SR adjustment done.

It does look like auto clustering gives the best result (in the backtest...), but whether it is significantly better is another question.
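A rough way to put the "significantly better" question in numbers is sketched below. It assumes i.i.d. returns and uses the standard large-sample approximation for the standard error of a Sharpe ratio, sqrt((1 + 0.5 * SR^2) / N). The means, standard deviations and N are the ones quoted above; the function name is made up for illustration.

```python
import math


def sharpe_standard_error(sr, n_obs):
    """Approximate standard error of a per-period Sharpe ratio (i.i.d. returns)."""
    return math.sqrt((1.0 + 0.5 * sr ** 2) / n_obs)


# (mean, stdev) per weighting scheme, as quoted in this post
results = {
    "auto": (5.6671, 168.80),
    "manual": (5.6206, 172.09),
    "equal": (4.3331, 135.27),
}
n_obs = 47793  # pooled weekly observations across markets

srs = {}
ses = {}
for name, (mean, stdev) in results.items():
    srs[name] = mean / stdev
    ses[name] = sharpe_standard_error(srs[name], n_obs)
    print(f"{name:>6}: mean/stdev = {srs[name]:.6f}, approx. s.e. = {ses[name]:.6f}")

gap = srs["auto"] - srs["manual"]
print(f"auto minus manual: {gap:.6f}")
```

Under these assumptions the gap between auto and manual clustering is smaller than a single standard error of either estimate. The pooled observations are not independent across markets, and the three return streams are highly correlated with each other, so a paired test on the return differences would be a more appropriate check than this back-of-the-envelope comparison.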
 