Why is my backtest overly optimistic?

murray t turtle · Jun 21, 2020

stepandfetchit said:
If you provide some details, it is possible that flaws can be identified; without specific information, however, we can only speculate. First speculation: what product are you trading, and are you using a reasonable slippage assumption?

%%
It's a common pattern. Why?
Emotions interfere, more or less, once real money is involved.
Most people forget dividends, which is a mistake.
Also, the market does not exactly repeat, and live traders generally see more slippage and commissions, so less profit.
You can get positive slippage selling into a super-strong uptrend, but most days are not like that, and most day traders are afraid of an uptrend.

murray t turtle · Jun 21, 2020

southall said:
The main cause of overfitting is too many filter parameters.

It's always tempting to add filters that reduce losses in the past and make the system look great on past data.

%%
True, unless you have so many that they turn your trading signal into a bull-market investment, LOL. But I would not blame that on a multitude of filters; it's simply that the trend is your friend in an uptrend, and less trading means less slippage.
I could use a goofy, odd MA like 65 or 62 and make it work with discretion. But a 65 DMA is too little, too late for me.
I think RSI is the most useless indicator on the face of the earth in a trending market. But an above-average Jack Schwager hedge fund guy likes it for bottoms, LOL, so good for him. The good thing about being over 62 years old is that you forget useless stuff like RSI.

Same Lazy Element · Jun 21, 2020

ionone said:
Overfitting doesn't depend on the number of parameters; it depends on whether those parameters were chosen to be optimal over a selected period.
You can overfit an MA with only one period parameter.

Not sure how you came up with this idea. The more parameters there are, the easier it is to overfit the strategy. There are even theoretical ways to prove it, if you assume that your noise process follows some set distribution.
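Both views can be checked with a small experiment. The sketch below is not anyone's actual strategy from this thread: it assumes a pure-noise random walk, a hypothetical long-or-flat moving-average rule, and a per-period (not annualized) Sharpe. It picks the MA window with the best in-sample Sharpe from a small and a large candidate set, then scores the chosen window out of sample.

```python
import random
import statistics

def sharpe(returns):
    """Per-period Sharpe ratio (mean / stdev), not annualized."""
    sd = statistics.pstdev(returns)
    return statistics.fmean(returns) / sd if sd > 0 else 0.0

def ma_returns(prices, window, start, end):
    """Long over day t -> t+1 if close[t] is above its moving average, else flat.
    Only prices up to and including day t are used to decide, so no lookahead."""
    out = []
    for t in range(start, end):
        ma = statistics.fmean(prices[t - window + 1 : t + 1])
        ret = prices[t + 1] / prices[t] - 1.0
        out.append(ret if prices[t] > ma else 0.0)
    return out

def best_window(prices, windows, start, end):
    """Pick the window with the highest Sharpe on days [start, end)."""
    return max(windows, key=lambda w: sharpe(ma_returns(prices, w, start, end)))

# Pure-noise price series: a random walk with no exploitable structure.
rng = random.Random(0)
prices = [100.0]
for _ in range(1000):
    prices.append(prices[-1] * (1.0 + rng.gauss(0.0, 0.01)))

max_w = 100
split = 550  # in-sample days: [max_w, split); out-of-sample: [split, len - 1)
few = [20, 50]
many = list(range(2, max_w + 1))

for name, windows in (("few", few), ("many", many)):
    w = best_window(prices, windows, max_w, split)
    is_sr = sharpe(ma_returns(prices, w, max_w, split))
    oos_sr = sharpe(ma_returns(prices, w, split, len(prices) - 1))
    print(f"{name}: window={w} in-sample={is_sr:.3f} out-of-sample={oos_sr:.3f}")
```

Even with a single parameter, the selected window looks better in sample than an arbitrary one would, and searching the larger candidate set can only raise the best in-sample figure. Adding parameters enlarges the search space in the same way, which is one reason the in-sample number becomes less trustworthy.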

In any case, double digit Sharpe on a medium frequency strategy definitely screams "forward information" and not much else.

userque · Jun 21, 2020

nijshar28 said:
Hey guys. I have been backtesting and paper-trading my strategy for a few months now. I can't help but notice that my backtest looks more optimistic than my paper trading results.

Even before I started paper trading, I suspected that something is off about my backtest as it exhibits double digit Sharpe ratios, virtually no downside risk, and returns that are frankly unrealistic.

I tried running some tests, like introducing programming assertions that look for lookahead bias and interchanging blocks of code between my backtesting and forward testing software. Even though I flushed out some minor issues this way, unfortunately, I still cannot pinpoint the main problem.

Has someone encountered this before? I feel that any advice would be helpful at this point. Thank you.

Does your strategy have any parameters? If so, how many, and how did you determine the values of the parameters ... via "back-tests"?

If so, it sounds like you have a case of over-fitting.

nijshar28 · Jun 22, 2020

userque said:
Does your strategy have any parameters? If so, how many, and how did you determine the values of the parameters ... via "back-tests"?

If so, it sounds like you have a case of over-fitting.

It certainly seems so. I don't get how the system can be overfitted so badly though, given that I am backtesting in a walk-forward fashion. I don't think the number of parameters is the problem. I tried reducing the number of variables and although performance decreases slightly, I still get a very strong signal.

At this point, I think I am either committing some very basic error that I am somehow unaware of (e.g. survivorship bias, walk-forward analysis set up incorrectly, etc.), or there are some additional trading costs I experience during my forward test (slippage, commissions, stock borrowing) that I am not modeling correctly in my backtest.
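One way to rule out an incorrectly set up walk-forward analysis is to generate the windows in one place and assert that they never overlap. This is a minimal sketch with hypothetical names (`walk_forward_splits`, `train_size`, `test_size`); it is not the code used in this thread.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train_range, test_range) index pairs that roll forward in time."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        # Nothing in the training window may come at or after the first
        # test observation.
        assert train[-1] < test[0], "training window overlaps the test window"
        yield train, test
        start += test_size  # roll forward by one test block

splits = list(walk_forward_splits(n_obs=10, train_size=4, test_size=2))
for train, test in splits:
    print(list(train), "->", list(test))
```

Clean index windows are necessary but not sufficient: a feature computed over the full history (for example, a normalization that uses every observation) can still carry future information into a training window.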

userque · Jun 22, 2020

nijshar28 said:
It certainly seems so. I don't get how the system can be overfitted so badly though, given that I am backtesting in a walk-forward fashion.

A neural network can 'walk forward' / backtest with lottery data, and overfit very badly.

I don't think you've revealed enough to receive a reliable response, or maybe I missed it. What type of algorithm are you using (NN, SVM, decision trees, Markov chains, kNN, etc.)? I'm not talking about revealing your secret sauce.

How are you walking forward? How are you optimizing? Etc.

nijshar28 said:
I don't think the number of parameters is the problem. I tried reducing the number of variables and although performance decreases slightly, I still get a very strong signal.

That can also happen with the lottery NN I mentioned above. Reducing can mean going from 1000 to 500. 500 may still be a lot. We'd ... er ... I'd have to know more before I could possibly respond confidently.

nijshar28 said:
At this point, I think I am either committing some very basic error that I am somehow unaware of (e.g. survivorship bias, walk-forward analysis set up incorrectly, etc.), or there are some additional trading costs I experience during my forward test (slippage, commissions, stock borrowing) that I am not modeling correctly in my backtest.

Based on what I know about what you're doing, and, more importantly, on what I don't know about what you're doing,

my first guess is: overfitting;
my second guess is a data leak.

Keep us posted! and Good Luck!

nijshar28 · Jun 22, 2020

Hey. So I was just going over my forward / backtesting results from today.

I noticed that at least today the execution played a big role in the discrepancy between the two.

In my backtest, I assume that I get filled on the open. The way I rationalize this expectation is that I can submit orders into the opening auction when live trading. Feel free to dash my hopes on this one.

Anyway, in my forward test today, I tried to send simulated MKT orders on the open, which resulted in a 0.0006 price degradation per transaction on average (relative to the daily opening prices I get from Quandl and use in my backtest).

That price degradation alone was sufficient to put my daily P&L in the red, plus the commissions piled on top of that.
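The arithmetic behind that can be sketched as follows. The post does not state the unit of the 0.0006 figure, so this assumes it is a fraction of price (about 6 basis points); the gross edge and the transaction count are hypothetical numbers chosen only for illustration.

```python
def net_return(gross_return, n_transactions, slippage_per_tx, commission_per_tx=0.0):
    """Gross return minus per-transaction costs, all as fractions of notional."""
    return gross_return - n_transactions * (slippage_per_tx + commission_per_tx)

gross = 0.0010     # hypothetical gross edge per round trip (10 bps)
slippage = 0.0006  # figure from the post, assumed to be a fraction of price
round_trip = 2     # one entry plus one exit

net = net_return(gross, round_trip, slippage)
print(f"gross={gross:.4f} net={net:.4f}")
```

Under these assumptions the two fills cost more than the edge itself, so the trade is negative before commissions are counted. Feeding the same per-transaction cost into the backtest is a quick way to see how much of the gap it explains.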

Should I try a different kind of order when trying to get filled on the open, such as a limit order pegged to the opening price?

Thank you.

nijshar28 · Jun 23, 2020

UPDATE:

The unrealistic performance I saw was due to a form of look-ahead bias that arose from subtle differences between my backtesting and forward-testing procedures. After fixing the error, my signal dropped to near noise levels. Now I actually have to develop a working strategy.
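The exact bug was not described, so the following is only a hypothetical illustration of how a one-bar timing slip can produce this kind of result. It assumes pure-noise daily returns and a made-up sign-following rule; `lag=0` lets the signal see the very return it trades, while `lag=1` uses only the previous day.

```python
import random
import statistics

def sharpe(returns):
    """Per-period Sharpe ratio (mean / stdev), not annualized."""
    sd = statistics.pstdev(returns)
    return statistics.fmean(returns) / sd if sd > 0 else 0.0

rng = random.Random(1)
# Hypothetical data: daily returns of an instrument with no predictability.
day_returns = [rng.gauss(0.0, 0.01) for _ in range(2000)]

def backtest(day_returns, lag):
    """Go long on day t if the return on day t - lag was positive, else short.
    lag=0 peeks at the return being traded (look-ahead); lag=1 uses only
    information available before day t."""
    pnl = []
    for t in range(1, len(day_returns)):
        signal = 1.0 if day_returns[t - lag] > 0 else -1.0
        pnl.append(signal * day_returns[t])
    return pnl

leaky = sharpe(backtest(day_returns, lag=0))
clean = sharpe(backtest(day_returns, lag=1))
print(f"leaky daily Sharpe={leaky:.2f}, clean daily Sharpe={clean:.2f}")
```

The leaky version never has a losing day and shows a daily Sharpe above 1, which is a double-digit annualized figure under the usual sqrt(252) scaling. The clean version sits near zero, as it should on noise.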

Thank you so much to everyone who contributed to the thread.

Same Lazy Element · Jun 23, 2020

Told ya

userque · Jun 23, 2020

A.K.A. a data leak.

Glad you found it! And thanks for the update!