Why is my backtest overly optimistic?

Laissez Faire · Jun 23, 2020

Every back-test is overly optimistic.

stepandfetchit · Jun 25, 2020

Laissez Faire said:
Every back-test is overly optimistic.

"Every" is a strong word. "Every one you have observed, and every one you are likely to observe," however, is probably accurate.

Same Lazy Element · Jun 25, 2020

Laissez Faire said:
Every back-test is overly optimistic.

I have seen and run a fair number of strategies that have done better in real life than in the backtest (for various reasons, such as conservative execution assumptions or market improvements).

murray t turtle · Jun 26, 2020

Laissez Faire said:
Every back-test is overly optimistic.

That's strange, but the error runs in that direction so often. Why? Two things: slippage and emotions are not in a backtest, which makes it different from the real world. LOL

Nakesha · Jun 27, 2020

I've seen many new traders backtest day trading with minute-level data. It simply doesn't work. You need tick-level data and a backend that fills orders at bid/ask prices.

And it's still not 100% accurate, because order filling is simulated and can never be the same as in live trading.

In my experience, if you do <50 trades/day with a total amount of <$50K, then backtesting at tick level can be 98% accurate. As soon as you increase the amount (how much it matters depends on stock volume and liquidity), you drift farther from that number, but I'd still say >90% accuracy.

I usually replay all my trading days in the backtester to check whether live trading was the same as, or very close to, the simulation, and that's how I came up with that 98%.
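
A minimal sketch of the replay check described above, under stated assumptions: the quote and fill tuples, the names `simulated_fill` and `replay_match_rate`, and the one-cent tolerance are all hypothetical, and "accuracy" is taken here to mean the share of live fills whose price matches the simulated fill. The post does not say how the 98% was actually measured.

```python
from bisect import bisect_right

# Hypothetical quote ticks: (timestamp_seconds, bid, ask), sorted by time.
QUOTES = [
    (0.0, 99.98, 100.02),
    (0.4, 99.99, 100.03),
    (1.1, 100.01, 100.05),
    (2.5, 100.00, 100.04),
]
QUOTE_TIMES = [q[0] for q in QUOTES]


def simulated_fill(ts, side):
    """Fill a market order at the quote prevailing at ts: buys pay the ask, sells hit the bid."""
    i = bisect_right(QUOTE_TIMES, ts) - 1
    if i < 0:
        raise ValueError("no quote available at or before the order time")
    _, bid, ask = QUOTES[i]
    return ask if side == "buy" else bid


def replay_match_rate(live_fills, tolerance=0.01):
    """Fraction of live fills whose price is within `tolerance` of the simulated fill."""
    hits = sum(
        abs(simulated_fill(ts, side) - price) <= tolerance
        for ts, side, price in live_fills
    )
    return hits / len(live_fills)


# Hypothetical live fills: (timestamp_seconds, side, fill_price).
live = [(0.5, "buy", 100.03), (1.2, "sell", 100.01), (2.6, "buy", 100.07)]
print(f"match rate: {replay_match_rate(live):.1%}")
```

The third fill in this toy data is three cents worse than the prevailing ask, so it counts as a mismatch; that is the kind of slippage a bid/ask fill model cannot see.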

Same Lazy Element · Jun 27, 2020

Nakesha said:
I've seen many new traders backtest day trading with minute-level data. It simply doesn't work. You need tick-level data and a backend that fills orders at bid/ask prices.

Unless you wanna play order book games or hold positions for periods comparable to the order book refresh cycles, you don’t need tick-level data. One-minute or even 5-minute TAQ bars would be plenty for most strategies.

My advice would be to backtest everything assuming some reasonable delay but executing at mid (you can use micro-mid if you really want). This way, given a viable effect, you can assess PnL/tradeval (PnL per unit of traded value) as part of your metrics. If PnL/tradeval is not good enough, you can then vary parameters such as signal thresholds to improve it as a stand-alone metric. Since the expectation is that your quality metrics (e.g. Sharpe and total PnL) will get worse as you increase PnL/tradeval, you’re not really curve-fitting your alpha but are still making the strategy more viable.
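
A rough sketch of that workflow, not anyone's production code: fill at the mid prevailing after a fixed delay, then report total PnL alongside PnL/tradeval for two signal thresholds. The quotes, the signals, the fixed two-second holding period, and the names `mid_at` and `backtest` are all made up for illustration.

```python
from bisect import bisect_right

# Hypothetical quote ticks: (timestamp_seconds, bid, ask), sorted by time.
QUOTES = [
    (0.0, 99.99, 100.01),
    (1.0, 99.99, 100.01),
    (2.0, 100.01, 100.03),
    (3.0, 100.04, 100.06),
    (4.0, 100.04, 100.06),
    (5.0, 100.05, 100.07),
    (6.0, 100.02, 100.04),
    (7.0, 100.00, 100.02),
]
QUOTE_TIMES = [q[0] for q in QUOTES]


def mid_at(ts):
    """Mid of the last quote at or before ts."""
    i = bisect_right(QUOTE_TIMES, ts) - 1
    if i < 0:
        raise ValueError("no quote available at or before ts")
    _, bid, ask = QUOTES[i]
    return (bid + ask) / 2.0


def backtest(signals, threshold, delay=1.0, hold=2.0, qty=1.0):
    """Trade when |signal| > threshold; enter and exit at the delayed mid.

    Returns (total PnL, total traded value counting both legs).
    """
    pnl = 0.0
    tradeval = 0.0
    for ts, value in signals:
        if abs(value) <= threshold:
            continue
        side = 1.0 if value > 0 else -1.0
        entry = mid_at(ts + delay)          # price assumed for the entry
        exit_ = mid_at(ts + hold + delay)   # price assumed for the exit
        pnl += side * qty * (exit_ - entry)
        tradeval += qty * (entry + exit_)
    return pnl, tradeval


# Hypothetical signals: (timestamp_seconds, signal_value).
SIGNALS = [(0.0, 0.8), (2.0, 0.3), (4.0, -0.6)]

for threshold in (0.2, 0.5):
    pnl, tradeval = backtest(SIGNALS, threshold)
    bps = 1e4 * pnl / tradeval if tradeval else 0.0
    print(f"threshold={threshold}: PnL={pnl:.2f}, PnL/tradeval={bps:.2f} bps")
```

On this toy data the higher threshold drops the weakest trade, so total PnL falls slightly while PnL per unit of traded value rises, which is the trade-off described above.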

guowei58 · Jun 27, 2020

nijshar28 said:
UPDATE:

The unrealistic performance I saw was due to a form of look-ahead bias that arose from subtle differences between my backtesting and forward-testing procedures. After fixing the error, my signal dropped to near noise levels. Now I actually have to develop a working strategy.

Thank you so much to everyone who contributed to the thread.

Whenever your backtest shows a Sharpe ratio above 5, you have a fundamental error such as look-ahead bias. Overfitting usually doesn't produce a result that ridiculous.
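
To make the point concrete, here is a toy illustration, assuming daily returns that are pure Gaussian noise and a sign-following rule. It is one generic form of look-ahead (the position for day t peeks at day t's own return) and is not the specific bug from the quoted update, which the thread does not describe. The helper names are hypothetical.

```python
import random
import statistics


def annualized_sharpe(pnl, periods_per_year=252):
    """Mean over sample standard deviation of per-period PnL, annualized."""
    return statistics.mean(pnl) / statistics.stdev(pnl) * periods_per_year ** 0.5


def sign(x):
    return (x > 0) - (x < 0)


random.seed(0)
# Ten "years" of daily returns with no real edge at all.
returns = [random.gauss(0.0, 0.01) for _ in range(2520)]

# Leaky: the position for day t is chosen using day t's own return.
leaky_pnl = [sign(r) * r for r in returns]

# Honest: the position for day t uses only day t-1's return.
honest_pnl = [sign(prev) * cur for prev, cur in zip(returns, returns[1:])]

print(f"leaky Sharpe:  {annualized_sharpe(leaky_pnl):.1f}")
print(f"honest Sharpe: {annualized_sharpe(honest_pnl):.1f}")
```

The leaky version earns the absolute value of every return, so its Sharpe lands far above 5 even though the data is noise; shifting the signal by one day brings it back to roughly zero.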

Nakesha · Jun 27, 2020

Same Lazy Element said:
Unless you wanna play order book games or hold positions for periods comparable to the order book refresh cycles, you don’t need tick-level data. One-minute or even 5-minute TAQ bars would be plenty for most strategies.

There is no way you can backtest a scalp-like strategy using minute bars. There are literally thousands of price events in just a minute and, depending on the stock, a single minute bar can span a move of more than 1%, so deciding at what millisecond to enter and exit the trade is absolutely important.

I usually test my strategies with minute bars at first, just because it's much faster, roughly 1000x faster than tick level. And every single time I move to tick-level testing, all my amazing profits become losses :-)

I'm talking about day trading here; for swing trades or long-term investing, minute-level data is absolutely fine.

But really, nobody should go live with an algorithm without having backtested it at tick level!

notagain · Jun 27, 2020

Backtest to find the market condition that destroys your strategy. Survival is the first goal. Big losing streaks are not chance; the market has to stop you.

Same Lazy Element · Jun 27, 2020

Nakesha said:
There is no way you can backtest a scalp-like strategy using minute bars. There are literally thousands of price events in just a minute and, depending on the stock, a single minute bar can span a move of more than 1%, so deciding at what millisecond to enter and exit the trade is absolutely important.

Define "scalp"? If you are trying to find a very short-term mean-reversion strategy where your expected alpha is comparable to the bid/ask spread, then yes, you have to do your research on tick data. However, unless you are colocated and have made serious technology investments, you are most probably not gonna make money anyway, even if you find a viable effect.

If you are looking for a strategy that holds positions for hours, you are going to be totally OK with one-minute TAQ data if you assume a reasonable delay (e.g. use a 1-second delay between your observed and executed prices).

Nakesha said:
But really, nobody should go live with an algorithm without having backtested it at tick level!

Actually, I literally never backtest on tick-level data despite running a fair number of latency-sensitive strategies. It's very hard to simulate your queue priority, market impact, and realized latency, so you end up either too conservative or too optimistic. For medium-frequency trading I use 5-minute TAQ bars with a 1-second delay between the observed prices and the assumed execution. For intraday holding periods I use 1-second TAQ bars with a 100 ms delay. I build both from the actual TAQ tick data, so I can vary things like the delays or the fields I look at. Finally, for execution-level alphas, I do research on the tick data but test them in live trading on a limited scale.
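
One way such delayed bars could be built from quote ticks, as a sketch only: each bar records the mid observed at the bar close and the mid prevailing one delay later, which is the price a backtest would assume for execution. The tick layout, the function names, and the dictionary fields are hypothetical; real TAQ files have their own format and would need parsing first.

```python
from bisect import bisect_right


def prevailing_mid(times, quotes, ts):
    """Mid of the last quote at or before ts, or None if there is none yet."""
    i = bisect_right(times, ts) - 1
    if i < 0:
        return None
    _, bid, ask = quotes[i]
    return (bid + ask) / 2.0


def build_bars(quotes, start, end, bar_seconds, delay_seconds):
    """Sample quotes on a fixed grid, pairing each observed mid with a delayed mid."""
    times = [q[0] for q in quotes]
    n_bars = int((end - start) // bar_seconds)
    bars = []
    for k in range(1, n_bars + 1):
        close_time = start + k * bar_seconds
        bars.append({
            "close_time": close_time,
            # What the strategy sees when the bar closes.
            "observed_mid": prevailing_mid(times, quotes, close_time),
            # What the backtest assumes it trades at, one delay later.
            "exec_mid": prevailing_mid(times, quotes, close_time + delay_seconds),
        })
    return bars


# Hypothetical quote ticks: (timestamp_seconds, bid, ask), sorted by time.
ticks = [
    (0.00, 100.00, 100.02),
    (0.95, 100.01, 100.03),
    (1.05, 100.03, 100.05),
    (1.80, 100.02, 100.04),
    (2.08, 100.00, 100.02),
]

# 1-second bars with a 100 ms delay, as in the intraday setup above.
for bar in build_bars(ticks, start=0.0, end=2.0, bar_seconds=1.0, delay_seconds=0.1):
    print(bar)
```

Calling the same function with `bar_seconds=300` and `delay_seconds=1.0` would give the 5-minute bars with a 1-second delay; because the bars are rebuilt from ticks, the delay is just a parameter to vary.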