How can I find a subset of instruments to trade by minimising instrument correlation, without using a brute-force optimization technique? Say there are 45 instruments in the database and I want to pick N.
Right now I'm thinking of a clustering algo based on correlation.
Interesting question. As mentioned by other posters, it's a tough one.
There are some papers out there suggesting that reducing risk (volatility target) in response to increased correlation between markets produces better results -- I believe that is in one of the Baltas papers. I've not verified this specifically in my system yet.
What I've done so far is conceptually simple, but it was a pain in the butt to implement because of the way my backtesting framework is built. Essentially, I started with static weights calculated using bootstrapping as outlined by GAT. Then, on every trading day, I adjusted (raised or lowered) the weight of each market based on its "relative correlation" with the system equity curve.
To do that, I calculated the correlation (using a relatively short look-back) of each individual market's equity curve (trading that single market using the system signals for that market) with the system equity curve (for all markets in my security universe). This is why it was a pain in the butt: I had to maintain a separate equity curve for each market in addition to the equity curve for the complete system trading all markets with the unadjusted weights.
Once I had the correlation between each market's equity curve and the system equity curve, I calculated the weight adjustment for each market from the distance between that market's correlation and the average market correlation. Markets whose correlation was further from the average got a bigger adjustment (you can get into all sorts of non-linear scaling algos here -- I believe I did something based on the normal distribution). Markets with lower-than-average correlation ended up with a positive adjustment, and markets with higher-than-average correlation ended up with lower weights.
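To make the adjustment step concrete, here is a minimal sketch of the idea. It is not the original implementation: the function names, the default `lookback` and `strength` values, the linear tilt (the description above mentions a normal-distribution-based scaling instead) and the renormalisation to a sum of 1 are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat series carries no correlation information
    return cov / math.sqrt(vx * vy)

def adjust_weights(base_weights, market_returns, system_returns,
                   lookback=60, strength=0.5):
    """Tilt static weights by each market's correlation with the system.

    base_weights:   dict market -> static weight (e.g. from bootstrapping)
    market_returns: dict market -> daily returns of that market's own
                    equity curve (system signals, that market only)
    system_returns: daily returns of the full system equity curve
    lookback, strength: hypothetical tuning parameters
    """
    sys_window = system_returns[-lookback:]
    corr = {m: pearson(r[-lookback:], sys_window)
            for m, r in market_returns.items()}
    avg = sum(corr.values()) / len(corr)

    # Linear tilt (an assumption): below-average correlation raises the
    # weight, above-average correlation lowers it, floored at zero.
    raw = {m: base_weights[m] * max(0.0, 1.0 + strength * (avg - corr[m]))
           for m in corr}

    total = sum(raw.values())
    return {m: w / total for m, w in raw.items()}  # renormalise to sum to 1
```

Calling `adjust_weights` once per trading day with the latest return histories gives that day's adjusted weights. The painful part described above is producing `market_returns`, since it needs a separate single-market equity curve for every market.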
Anyhow, the above did reduce the maximum drawdown by about 10% in relative terms -- from 25% to 22.5% if I remember correctly. I'm traveling and don't have my notebook with all the notes and results from that experiment, so I'm going off memory. I don't think the Sharpe ratio was affected much at all, which I suppose makes sense since you're boosting the weight of losing markets during times when your overall system is making money.
As mentioned, while the above approach is simple conceptually, the implementation complexity meant that I never put it into practice in live trading.
I have used hierarchical clustering in a GTAA system in the past to aid in portfolio construction, but have not had a chance to try it out on the diversified futures trend and carry system.
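For the original question (picking N out of 45 without brute force), a rough sketch of the clustering route might look like the following. It assumes you already have a correlation matrix as a list of lists; the function names, the choice of average linkage on distance `1 - correlation`, and the rule for picking one representative per cluster are all illustrative assumptions, not something I have tested on the futures system. In practice a library routine such as SciPy's `scipy.cluster.hierarchy` would normally replace the hand-rolled merge loop.

```python
def cluster_by_correlation(corr, n_clusters):
    """Average-linkage agglomerative clustering on distance 1 - correlation.

    corr: square correlation matrix as a list of lists.
    Returns a list of clusters, each a list of row/column indices.
    """
    clusters = [[i] for i in range(len(corr))]

    def avg_dist(a, b):
        # Mean pairwise distance between members of two clusters.
        return sum(1.0 - corr[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = avg_dist(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]  # merge the closest pair
        del clusters[q]
    return clusters

def pick_representatives(corr, names, n_clusters):
    """Pick one instrument per cluster (hypothetical rule): the member
    with the lowest average correlation to all other instruments."""
    def mean_corr(i):
        others = [corr[i][j] for j in range(len(names)) if j != i]
        return sum(others) / len(others) if others else 0.0

    return [names[min(members, key=mean_corr)]
            for members in cluster_by_correlation(corr, n_clusters)]
```

Setting `n_clusters` to N yields N instruments, one from each group of similar markets. The nested loops are slow in general but perfectly fine for 45 instruments, and other representative rules (most liquid, cheapest to trade) could be swapped in.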
I suppose the whole question of whether some sort of dynamic portfolio construction based on changing correlations would produce "better" results comes down to whether changes in correlations have any staying power. For example, using historical instrument volatility to adjust position sizes going forward works because of volatility clustering -- we know that once we get to higher volatility, that higher volatility will stick around for a while on average. Does the same happen with correlations? We know that correlations change in the long run, but do they exhibit clustering in the short run?
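One crude way to check for short-run persistence is to measure the correlation of two return series over consecutive windows and then look at the lag-1 autocorrelation of those window correlations. The sketch below is only an illustration: the function names and the 20-day default window are assumptions. It deliberately uses non-overlapping windows, because overlapping rolling windows share most of their data and would look persistent for purely mechanical reasons.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def window_correlations(a, b, window=20):
    """Correlation of a and b over consecutive non-overlapping windows."""
    return [pearson(a[s:s + window], b[s:s + window])
            for s in range(0, len(a) - window + 1, window)]

def correlation_persistence(a, b, window=20):
    """Lag-1 autocorrelation of the window correlations.

    Clearly positive values suggest that this window's correlation says
    something about the next window's; values near zero suggest it
    does not.
    """
    c = window_correlations(a, b, window)
    return pearson(c[:-1], c[1:])
```

Running `correlation_persistence` over many instrument pairs and several window lengths would give a first impression of whether recent correlations are worth acting on. With only a handful of windows per pair the estimate is noisy, so it is a sanity check rather than a proper statistical test.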
There is a paper -- "Momentum and Markowitz: a Golden Combination" -- that I think is very much on point here. The authors use a much shorter "formation" period than the years and years of returns and correlation data that most academic papers use. The paper seems to suggest that using correlations from much shorter time frames (less than a year) for portfolio construction does produce some impressive results and avoids many of the downsides of mean-variance (MV) optimization. Alas, I have not had a chance to test this approach in my system yet, since it requires a very efficient optimizer (the paper suggests the Critical Line algo) and I've not had the time to implement one myself (yes, I know, I could probably use one off the shelf ... I may get there someday).
Anyhow, that's my $0.02.
--Maciej