I wish I could post here a bit more often, but I haven’t had enough material to put together something meaningful since the last post.
I’m trying to stay on course and write about things others haven’t already covered a great deal. Today’s topic is
Outcomes Distribution and Outliers.
There is quite a lot of material out there on Monte Carlo simulations and probability of ruin. While they do make sense sometimes, in my own research I noticed that there are far more significant reasons why a system won’t work well in live trading, and those should be looked at first.
One of them is the possibility of results deviating because too many entry opportunities are present at the same time. It won’t be a problem if you trade a single instrument, as you’re either in or out. Long or short. Pretty simple. For many multi-instrument systems, though, live results might deviate from the backtest due to
- Risk management
- Insufficient data resolution
- Data quality
- Very high volatility
- Limitations of your real-time data
- Broker screwing up your order
- You screwing up your order
- Broker outage
- Unable to borrow for shorting
- Unknown unknowns
Let’s say we’re trading a basket of stocks like the NASDAQ 100 using a long mean reversion system with LMT order entry, which is very common. Every day we scan them for setups and might get 10-15 candidates for the next trading day, then enter if the price goes a bit lower the next day. But a few times a year volatility spikes and most stocks move in the same direction, presenting you with 70 setups, with 20 or more actually getting filled the next day. For most people that would be way more exposure than they would be comfortable having, so we limit it to, let’s say, the first 5.
Normally there would be some sort of ranking mechanism to handle that in a backtest. One example: prefer the ones with the largest distance from the previous close. If we don’t have intraday data, there is no way to know which entries will happen first. Even with intraday data, its resolution might not be enough to say for sure. And no matter how much data we have, there is still no way to know, since your broker might not have accepted your order for whatever reason, or your software went down with an “out of memory” error and didn’t send it.
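As a rough illustration of that kind of backtest ranking, here is a minimal Python sketch. The `Setup` class, the `rank_by_distance` function, and the sample numbers are all made up for this example and are not taken from any particular backtesting package.

```python
from dataclasses import dataclass


@dataclass
class Setup:
    symbol: str         # hypothetical ticker
    prev_close: float   # previous day's close
    limit_price: float  # LMT entry price below the previous close


def rank_by_distance(setups, max_new_positions=5):
    """Keep the setups whose limit price is furthest below the previous close."""
    def distance(s):
        # Distance from previous close as a fraction of that close
        return (s.prev_close - s.limit_price) / s.prev_close

    ranked = sorted(setups, key=distance, reverse=True)
    return ranked[:max_new_positions]


# Made-up candidates for illustration only
candidates = [
    Setup("AAA", 100.0, 97.0),
    Setup("BBB", 50.0, 49.5),
    Setup("CCC", 200.0, 190.0),
]
selected = rank_by_distance(candidates, max_new_positions=2)
print([s.symbol for s in selected])
```

Note that this ranking only decides what the backtest takes. It says nothing about which orders would really have been filled first in live trading, which is exactly the uncertainty discussed here.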
While it’s probably impractical to mitigate all possible causes, one thing we can do is test system robustness in those cases, look at the distribution of possible outcomes, and then mitigate the causes with the biggest impact and the lowest cost of mitigation.
This year's volatility gives plenty of opportunity to do that on recent data.
My friend came up with a randomized entry test which I love. What we would do is:
- Look at peaks of potential uncertainty using backtesting software. Plot how many setups / next-day fills you had and find large spikes way above your risk parameters. Depending on the software, you might need to change your strategy script to temporarily disable the limit on the number of new/max positions and set available margin to something very high, so your software doesn’t drop trades due to insufficient capital
- Depending on the system you might get a few spikes per year or more, or maybe one in 10 years. Modify your entry score / ranking to temporarily use a random score. Limit your test dates to a 1-2 month period around each of those spikes and run 100 iterations
- Then plot the distribution of results. Normally you will be looking at return % / drawdown % over the period
It is pretty simple and yet extremely powerful. Could be very eye opening.
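The steps above can be sketched in plain Python, outside of any backtesting software. This is only a toy model under loud assumptions: `fills_by_day` is a hypothetical list holding, for each day, the returns of every trade that would have been filled; each trade is booked in full on its entry day; position size is a fixed fraction of starting equity; and the sample data is randomly generated, not real results.

```python
import random
import statistics


def run_once(fills_by_day, max_new, position_size, rng):
    """One pass: each day take a random subset of the fills, track equity and drawdown."""
    equity = 1.0
    peak = 1.0
    max_dd = 0.0
    for day_fills in fills_by_day:
        k = min(max_new, len(day_fills))
        taken = rng.sample(day_fills, k)  # random score = random choice of entries
        # Simplification: each trade's full return is booked on its entry day
        equity += position_size * sum(taken)
        peak = max(peak, equity)
        max_dd = max(max_dd, (peak - equity) / peak)
    return equity - 1.0, max_dd


def randomized_entry_test(fills_by_day, max_new=5, position_size=0.02,
                          iterations=100, seed=0):
    """Repeat the random selection and collect (return, max drawdown) pairs."""
    rng = random.Random(seed)
    return [run_once(fills_by_day, max_new, position_size, rng)
            for _ in range(iterations)]


# Synthetic data for illustration: 40 days with 3-25 fills each
data_rng = random.Random(42)
fills_by_day = [
    [data_rng.gauss(0.0, 0.05) for _ in range(data_rng.randint(3, 25))]
    for _ in range(40)
]
results = randomized_entry_test(fills_by_day)
returns = sorted(r for r, _ in results)
print(f"return: min {returns[0]:.2%}, median {statistics.median(returns):.2%}, "
      f"max {returns[-1]:.2%}")
print(f"worst drawdown: {max(dd for _, dd in results):.2%}")
```

In a real test the selection would happen inside your backtesting software by swapping the ranking score for a random one. The point of the sketch is just the shape of the loop: same period, same fills, 100 different random picks, and then a look at the spread between the best and worst outcome.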
Before looking at an example, here are a few more ideas
- To discover very rare outliers. Some people call this left tail risk. Most of the time when developing a system, people run it with a “reasonable” number of new/max entries. A rule of thumb is to run not without limits, but with limits higher than you would be trading in real life, so that it’s enough to see most bad trades. But that still might not show you the very rare ones, when there were 100 potential entries on one day and one of them lost 100%, or maybe 600% in case it’s a short.
- To address this problem I’d run lots of the iterations described above, as well as a few versions with very liberal limits, like 20 new entries, 50, unlimited, and see what happens. Normally such outliers show up during periods like 1987, 2008/2009, 2011. If you’re planning to run your systems long term, it’s important to find as many “black swans” like that as possible. There are no guarantees something worse won’t happen, but you’ll be way more prepared than most professionals out there
Here is an example of what I would consider an excellent MR Long system from 1 Feb - 1 May of 2020. Remember that the SP500 went down about 35% and is still at a loss YTD.
Here is a backtest for one of the most volatile periods in market history, Feb 1 - May 1 2020:
Might not look like much to inexperienced traders, but note that
- This is a ~1:1 return to risk ratio with very small position size and no use of leverage. +4% return against a ~5% drawdown
- One of the worst markets to be long
Either way, looks pretty good to my taste.
But what happens if we run the 100 random entries test using EOD data:
Trader A and trader B are mentioned for a reason… This was actually me and my friend, who ran very similar versions of the system. While some of the difference can be attributed to the very different execution algos we use, this particular system had a pretty wide outcome distribution to begin with.
So where anyone trading it would end up for this period is also partly a matter of luck.
We went different ways about handling this, and I think both were very viable.
While this time I was on the better side of the distribution, I decided that I’m not comfortable with such deviations in the future, so I eventually came up with a system version that has a more acceptable range, sometimes giving up a bit of ARR in a backtest. I also checked all my other systems for the same problem. That has proved to be a very good decision so far.
Best predictor of future volatility is current volatility
Those volatility-related deviations have continued in the old system version since I made the changes, while my live results were very much in line with the backtest or better. Which was luck. Or execution. Or both. There is always luck involved.
Takeaways
- You probably wasted your time reading this if you trade a single instrument at any given time
- Consider adding an outcomes distribution test to your toolbox and at least stress-testing all your strategies with it
- Consider running your backtest on a few very aggressive max daily/total positions settings, including “unlimited”
- If your system is susceptible to a wide range of outcomes, consider changing it in a way that will reduce the range till it’s acceptable to you. Normally you can come up with a stricter setup selection on the prior day / consider a different ranking algo / increase your data resolution and do more testing to reduce uncertainty / take more but smaller positions
- Study your outlier trades. Especially big losers. Don’t expect you will ever get rid of them by tweaking a system. Most of the time this is just inevitable and it’s better to expect them regardless of what the current backtest shows. Testing with an unlimited number of entries and looking at the outcomes distribution helps to discover them
- If you don’t have really bad outliers in your backtest, there is probably something wrong with it or your backtest just got lucky. Can’t speak for all instruments, but stocks do gap down 50% overnight and do go to 0. If you have a system with an overnight hold and test over a sufficient time period, you should see those.
- Consider removing the best trades from your backtest and seeing what happens
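The last takeaway is easy to try on an exported trade list. Here is a minimal sketch, assuming you have per-trade returns as plain fractions; the `total_without_best` helper and the numbers are made up for illustration, and simply summing returns ignores compounding and position sizing.

```python
def total_without_best(trade_returns, n_removed):
    """Sum of per-trade returns after dropping the n_removed largest winners."""
    keep = max(len(trade_returns) - n_removed, 0)
    kept = sorted(trade_returns)[:keep]  # ascending, so the winners sit at the end
    return sum(kept)


# Made-up per-trade returns for illustration only
trades = [0.04, -0.01, 0.25, -0.03, 0.02, -0.02, 0.01]
for n in (0, 1, 2):
    print(n, round(total_without_best(trades, n), 4))
```

If the total collapses after removing one or two trades, the backtest result depends heavily on a few lucky winners.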
Val