backtest for 3 years, blow up in 3 days,

heech · Sep 11, 2009

Quote from intradaybill:

It is bad if the trading system designer did not intend that but the software did it behind his back. That is the point. I think it is explained in that paper very well.

I don't think the paper explained that very well at all, but I certainly agree with you on this point.

You'd better be clear about the behavior you're backtesting, so that you implement the strategy correctly in real time. Since we have no idea what software he's testing, we have no idea how the strategy was implemented or tested.

intradaybill · Sep 11, 2009

Quote from heech:

I don't think the paper explained that very well at all, but I certainly agree with you on this point.

I disagree. I think the author explains that very well. You should read again the conclusion:

"Discussion of the results

...The backtesting flaws inherent in the specific versions of the two programs may have caused many technical traders to quickly discard certain systems that actually were promising and use others that appeared to satisfy certain performance characteristics but in reality that was not the case. This type of flaws can also have a severe impact on certain trend-following systems, as it was shown with an example. Traders should have been made aware of this sort of backtesting flaws resulting essentially in hidden conditions becoming part of the analysis. I cannot speculate about the reasons for these flaws since any speculation of this sort is beyond the nature and reach of the empirical study of this paper.

In the presence of backtesting flaws, the following are possible:
(1) Systems that show satisfactory performance during backtest may be adopted for actual trading where in reality their performance may be significantly degraded after corrections for skipped signals and wrong exit dates are made. In this case a trader may select a system believing it has satisfactory hypothetical historical performance when it actually does not.
(2) Systems that show an unsatisfactory performance during a backtest and are discarded may turn out to have acceptable actual performance if corrections to the backtest results are made to compensate for the impact of the flaws. In this case, a trader may discard a system that actually has some true potential.

Under actual trading conditions the situation may develop as follows. A trader, unaware of the flaws, backtests her idea and finds out that it produced acceptable performance according to her criteria. Then she programs the system in an API of a brokerage firm for sending the trades automatically. The API program will not share the same reservations about placing orders as the backtesting program, and it will generate orders at the close of a bar where an order in the same direction (long or short) was previously closed. The trader believes that the actual performance her system will statistically match the results of the backtest. However, one reason that may not happen and the system may turn out to lose money in actual trading is that the backtested system and the API system are two different systems, although the trader believes they are the same. This again may happen due to the hidden conditions that were introduced during the backtest, like the ones identified in this paper. Of course, it may also happen due to other factors that are not a subject of this study, such as slippage, partial order fills and other random effects.
I do not know whether any previous versions of the programs considered in this paper had the same backtesting flaws and when those were corrected."

ref: http://www.tradingpatterns.com/About_Us/articles/backtesting/Backtesting.pdf
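To make the "hidden condition" from the quoted conclusion concrete, here is a minimal sketch in Python. It is not the code of either program in the paper (the author does not name them); the fixed holding period, the prices, the signals, and every function and parameter name below are made up for illustration. The only point is that a tester which silently drops an entry signal on the bar where a same-direction position was just closed is running a different system from an order-routing program that takes every signal.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    entry_bar: int
    exit_bar: int
    pnl: float


def run_long_system(closes, entry_signals, hold_bars, skip_reentry_on_exit_bar):
    """Long-only toy system: enter at the close of a signal bar,
    exit at the close hold_bars later.

    skip_reentry_on_exit_bar=True mimics the hidden condition: an entry
    signal on the same bar where a position was just closed is dropped.
    """
    trades = []
    entry_bar = None  # None means flat
    for i, price in enumerate(closes):
        exited_here = False
        # Exit first, if the holding period has elapsed.
        if entry_bar is not None and i - entry_bar >= hold_bars:
            trades.append(Trade(entry_bar, i, price - closes[entry_bar]))
            entry_bar = None
            exited_here = True
        # Then look at the entry signal for this bar.
        if entry_bar is None and entry_signals[i]:
            if exited_here and skip_reentry_on_exit_bar:
                continue  # signal silently skipped (the hidden condition)
            entry_bar = i
    return trades


# Hypothetical data: the second signal falls on the bar where trade 1 exits.
closes = [100, 101, 102, 101, 100, 99, 98]
signals = [True, False, True, False, False, False, False]

as_designed = run_long_system(closes, signals, hold_bars=2,
                              skip_reentry_on_exit_bar=False)
with_hidden_rule = run_long_system(closes, signals, hold_bars=2,
                                   skip_reentry_on_exit_bar=True)

print("as designed:     ", as_designed)
print("with hidden rule:", with_hidden_rule)
```

With these made-up numbers the run with the hidden rule reports one winning trade, while the system as designed also takes the second entry and gives the gain back. That is case (1) from the quoted conclusion in miniature.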

heech · Sep 11, 2009

I just re-read it. I don't see any mention of the possibility that the strategy might intentionally be delaying the stop/target. For all we know, the backtesting tool he's playing with has a specific option for "same bar stop/target" that he hasn't enabled.

There are a thousand different possible backtesting decisions that have to be understood. I think it's excessive to blow this up as a 'flaw'. What about handling of slippage? Should that be a constant value, or should it be dependent on the "velocity" of price movements at the time of the stop? There are a lot of considerations out there.
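As a rough illustration of the slippage question, the two choices could look something like this in Python. Both models are assumptions for the sake of the example: the tick size, the base value, the coefficient k, and the use of the bar's high-low range as a stand-in for "velocity" are all hypothetical, not taken from any particular backtester.

```python
def constant_slippage(ticks=1, tick_size=0.25):
    """Same slippage on every stop fill, regardless of market conditions."""
    return ticks * tick_size


def velocity_slippage(bar_high, bar_low, base=0.25, k=0.1):
    """Slippage that grows with how fast price is moving.

    The bar range is used here as a crude proxy for velocity.
    """
    return base + k * (bar_high - bar_low)


def long_stop_fill(stop_price, slippage):
    """A stop on a long position is a sell, so it fills below the stop price."""
    return stop_price - slippage


# Hypothetical bar: stop at 100.00 hit on a bar with a 2-point range.
print(long_stop_fill(100.0, constant_slippage()))
print(long_stop_fill(100.0, velocity_slippage(102.0, 100.0)))
```

Neither is "the" right answer. The point is that whichever one the backtest assumes is part of the strategy being tested.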

Anyways, my trading code is the same as my backtesting code. Anyone else doing backtesting should, as previously mentioned, make sure they know exactly what strategy they're backtesting.

intradaybill · Sep 11, 2009

Quote from heech:

I just re-read it. I don't see any mention of the possibility that the strategy might intentionally be delaying the stop/target. For all we know, the backtesting tool he's playing with has a specific option for "same bar stop/target" that he hasn't enabled.

Which backtesting tool is that? I don't think it is something that is currently available.

I also do not understand what you mean by "the possibility that the strategy might intentionally be delaying the stop/target". It is clear that the strategy he provides as an example had no such intention but those programs delayed the stop/target AND, this is even more important, missed several entries.

I think the point is that software should do what it is supposed to do, not something else. If you have a compiler that sets A = 0 although you declare A = 1 you have a problem.

If you backtest a system manually like in the paper and then use one of those systems and find half of the trades are wrong, you have a problem.

Anyways, enough of this. Let's try to make some money today.

Eight · Sep 11, 2009

Quote from intradaybill:

I think part of the conclusion can be found in this paper from another thread. The author of the paper, a well-known expert in this field, claims that two widely used backtesters have significant defects and produce wrong results because of basic errors in their algorithms.

Backtesting Flaws in two Popular Programs

That paper refers to programs popular in the late 90's..

I still find ambiguities between how programs handle data for backtesting and how they handle data for realtime trading.. if you start to even smell that they are different, run, don't walk, away.. don't return...

I'm sure I had one of those programs mentioned in the paper, I wouldn't want to name names but the initials would be Tradestation 2000, I think it was out before the year 2000... I wasted a lot of time with it.. it's quite possible that crapware backtesters are the single biggest obstacle I've encountered in my journey of about a decade... I'm migrating to some stuff that doesn't seem to be such crapware... the thing to look for is the programmer's ability to fix things.. if they aren't writing maintainable code then that is a major problem, they can never get it fixed... you have to sniff that out, is it buggy, does it crash, can they fix things quickly, do they not fix things over time, does each version break something new, are there big delays in releases... I've worked with great programmers and I've worked with lots of mediocre ones.. there is a huge difference.. if a team is led by a great programmer probably they won't head into the badlands without a map, otherwise, forget it, it's a hell of a world out there if your code is crapware.. a hell of a world where you continually have to bullshit the customers...

ronblack · Sep 12, 2009

Quote from Eight:

That paper refers to programs popular in the late 90's..

It appears so...and at that time the web was not developed to a point where users could exchange ideas in newsgroups and become aware of those issues fast.

IMO the other two posters discussing this paper missed the main point. The author actually tests a very simple system manually and then he finds out that he cannot replicate the results in those backtesting platforms. Assuming he used the correct code and tester options, the results point to crappy software. Someone should contact this author and ask him to provide the full details on a nondisclosure basis. Then, a team of experts should go ahead and try to replicate the results of his study. If the results are confirmed, there may be valid grounds for a class action against the vendors.
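For anyone who wants to try that kind of replication on their own platform, the comparison itself is easy to script. Below is a minimal Python sketch that checks a hand-worked trade list against the list a backtester reports, looking for the two symptoms the paper mentions: skipped signals and wrong exit dates. The function name and the dates are invented for illustration, and it assumes at most one trade per entry date.

```python
def compare_trades(manual, reported):
    """Compare two trade lists of (entry_date, exit_date) tuples.

    Returns (skipped, wrong_exit):
      skipped    - manual trades whose entry date is absent from the report
      wrong_exit - (entry, manual_exit, reported_exit) where the exits differ
    Assumes at most one trade per entry date.
    """
    reported_exit_by_entry = dict(reported)
    skipped = [t for t in manual if t[0] not in reported_exit_by_entry]
    wrong_exit = [
        (entry, exit_date, reported_exit_by_entry[entry])
        for entry, exit_date in manual
        if entry in reported_exit_by_entry
        and reported_exit_by_entry[entry] != exit_date
    ]
    return skipped, wrong_exit


# Hypothetical trade lists, not taken from the paper.
manual = [
    ("2009-01-05", "2009-01-07"),
    ("2009-01-07", "2009-01-09"),
    ("2009-01-12", "2009-01-14"),
]
reported = [
    ("2009-01-05", "2009-01-08"),  # exit one bar late
    ("2009-01-12", "2009-01-14"),
]

skipped, wrong_exit = compare_trades(manual, reported)
print("skipped entries:", skipped)
print("wrong exits:    ", wrong_exit)
```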

Eight · Sep 12, 2009

In recent years I was shuffling from one software package to another.. finally I made a spreadsheet with all the available packages listed and I noted features and whatnot... after I rejected crapware either by reviews, clues, or personal experience and rejected some more on lack of features... I have nothing left that really does what I want

There is one that I can probably live with if not downright enjoy, I have to adopt some crude methods for testing but hey... crude is fine, it's all about getting from point A to point B, not about the journey at all, I'm not a fricking Buddhist Monk, I'm an engineer on a mission...

I'm modifying my methodology to fit the software currently and keeping my fingers crossed that something horrible doesn't pop up and go "GOTCHA!!!!!!"

JaiSreeram · Sep 15, 2009

Quote from Eight:

In recent years I was shuffling from one software package to another.. finally I made a spreadsheet with all the available packages listed and I noted features and whatnot... after I rejected crapware either by reviews, clues, or personal experience and rejected some more on lack of features... I have nothing left that really does what I want. There is one that I can probably live with if not downright enjoy, I have to adopt some crude methods for testing but hey... crude is fine, …

Eight, you have indeed done an interesting study.

Is it possible to post the results of your study? Thanks!

TraderSystem · Sep 16, 2009

Quote from intradaybill:

I have no idea since the author was careful to hide their identity. I can only guess but that is not good enough.

It boils down to this: many people have been taken for a ride by companies that did not do basic testing before selling their software.

Thanks for the reply, intradaybill

A lot of interesting posts have also come up.

chipmunk · Sep 16, 2009

If you threw darts at quotes and took random entries, the only way you could "blow out" in 3 days is to gamble, i.e. overtrade.