Best Number of Trades to Judge System Performance

dima777 · Sep 4, 2008

Quote from Mercor:

It depends on what you want your margin of error to be.

You can use the presidential polls as a guide: they survey a field of about 800 to get their error rate between 2% and 3%.

sounds reasonable given the size of the trading instrument universe and the possible way each of the instruments can behave at any time...

dima777 · Sep 4, 2008

Quote from phattails:

From a practical standpoint, you should have enough trades so that your historical sample test is a good approximation of the population. When applying statistics (in your case, the sample error), you must identify the assumptions, one of which is independence of samples.

If they were independent, then you would choose the sample error that suits your confidence requirement (error = 1/sqrt(n)). So, for a 10% error, n = 100; for a 5% error, n = 400.
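A minimal sketch of the 1/sqrt(n) rule of thumb from this post, assuming the trades really are independent. The function names `rule_of_thumb_error` and `trades_needed` are made up for illustration.

```python
import math

def rule_of_thumb_error(n: int) -> float:
    """Approximate sampling error for n independent trades: 1/sqrt(n)."""
    return 1.0 / math.sqrt(n)

def trades_needed(target_error: float) -> int:
    """Smallest n for which 1/sqrt(n) is at or below target_error."""
    # round() guards against floating-point noise before taking the ceiling
    return math.ceil(round(1.0 / target_error ** 2, 6))

for target in (0.10, 0.05, 0.03):
    print(f"target error {target:.0%}: about {trades_needed(target)} trades")

print(f"error at 800 trades: {rule_of_thumb_error(800):.1%}")
```

Under this rule a 10% error needs 100 trades and a 5% error needs 400. A sample of 800 gives roughly 3.5%, so the rule only holds to the extent the independence assumption does.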

Try to divide your data into subsets that you could classify as, for example, nonvolatile trending, volatile trending, etc. Then you could weight the results of each sample test to reflect the actual results. Don't forget to rule out data that is no longer representative, or create a workaround.
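One way to read the weighting idea above is a weighted average of per-regime results. This is only a sketch: the regime names beyond the two mentioned in the post, the per-trade averages, and the weights are all made-up numbers, and `weighted_expectancy` is a hypothetical helper.

```python
# Hypothetical per-regime backtest results:
# regime -> (average profit per trade, expected share of time in that regime)
regimes = {
    "nonvolatile trending": (0.40, 0.30),
    "volatile trending": (0.10, 0.20),
    "nonvolatile ranging": (-0.05, 0.35),
    "volatile ranging": (-0.20, 0.15),
}

def weighted_expectancy(results):
    """Average profit per trade, weighting each regime by its expected share."""
    total_weight = sum(weight for _, weight in results.values())
    weighted_sum = sum(avg * weight for avg, weight in results.values())
    return weighted_sum / total_weight

print(f"weighted expectancy per trade: {weighted_expectancy(regimes):.4f}")
```

The weights matter as much as the per-regime results: a system that looks strong in one regime can come out flat once that regime is weighted by how often it actually occurs.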

the trouble is that the population is extremely large if you take it to be all the possible setups the trading instruments can go through...

dima777 · Sep 4, 2008

Quote from TSGannGalt:

You have to consider both the time and trades.

1. For something to be statistically viable, academically, you need at least 30 data samples. I personally like to have more, though.

2. Markets change with time. You can have 1000 trades in a day, but that data only reflects the market conditions of one day.

3. It's not only the frequency of the system you use but also the type of system you are testing that matters.

If your system is parametric, then you'll need to test over a longer timeframe to check the robustness of the system.

If you are trying to extract a specific tendency or an edge, and testing for it, then you'll need more trade samples rather than more time. (Time is weighted less because edges always fade. To explain further: you take exposure to maximize the profit and, most importantly, you're testing to find the cutoff point. You already plan to cut the system when the edge is gone, so testing the robustness of the edge is weighted less. Simply put, an edge is an edge because the tendency being exploited is already defined.)

does this mean that if you want to test your newly found edge, you should concentrate on recent prices? That sounds contrary to some of the opinions in the trading books that you should test your system over a long history...

TSGannGalt · Sep 4, 2008

Quote from dima777:

does this mean that if you want to test your newly found edge, you should concentrate on recent prices? That sounds contrary to some of the opinions in the trading books that you should test your system over a long history...

As mentioned... a parametric system and an edge-based system are tested differently.

Everyone has their own definition of what an edge is...

I wouldn't call what most people consider an edge (basically, a system tested with statistics that had a positive result) an edge. An edge for me is the exploitation of a tendency in the market structure. For example:

SOES Bandits.
99 looters.
Inter-broker like FOREX pricing arbs.
Millenium Arb.

etc.

dima777 · Sep 4, 2008

Quote from TSGannGalt:

As mentioned... a parametric system and an edge-based system are tested differently.

Everyone has their own definition of what an edge is...

I wouldn't call what most people consider an edge (basically, a system tested with statistics that had a positive result) an edge. An edge for me is the exploitation of a tendency in the market structure. For example:

SOES Bandits.
99 looters.
Inter-broker like FOREX pricing arbs.
Millenium Arb.

etc.

I apologize, but what is this list made of? Are these systems?

fifty2aces · Sep 4, 2008

Depends not only on the sample size, but also on the type of strategy. An example would be selling naked OTM puts - you could be profitable over 1000 trades, then blow up spectacularly with one big loser.
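A quick arithmetic sketch of that blow-up scenario. The premium per winner and the size of the loss are invented for illustration, not taken from any real option series, and `summarize` is a hypothetical helper.

```python
# Hypothetical short-put track record: 1000 small winners, then one large loser.
wins = [100.0] * 1000     # premium kept on each winning trade (made-up figure)
tail_loss = -150_000.0    # a single blow-up trade (made-up figure)

def summarize(pnl):
    """Basic statistics for a list of per-trade profit and loss values."""
    n = len(pnl)
    return {
        "trades": n,
        "win_rate": sum(1 for p in pnl if p > 0) / n,
        "total": sum(pnl),
        "average": sum(pnl) / n,
    }

before = summarize(wins)
after = summarize(wins + [tail_loss])

print(f"before: win rate {before['win_rate']:.1%}, total {before['total']:,.0f}")
print(f"after:  win rate {after['win_rate']:.1%}, total {after['total']:,.0f}")
```

With these numbers the win rate barely moves after the loser, yet the total swings from a 100,000 gain to a 50,000 loss, so a large sample alone says little when the loss distribution has a fat tail.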

GermanTrader · Sep 4, 2008

Quote from dima777:

thanks...but what if the system generates only 10 trades per year?

1. I would not wait more than a year to test any system. Life is too short.
2. I would wait the year because it runs the system through one whole fiscal cycle, including a good sampling of unexpected events and ranges of volatility and volume.
3. I would not test, let alone trade, any system that holds overnight. I don't gamble.

phattails · Sep 4, 2008

Quote from dima777:

I apologize, but what is this list made of? Are these systems?

These are executional edges. The names in the list refer to the parts of the market structure in which they were found.

Businessman · Sep 4, 2008

If your system only gives 10 signals a year, test it going back 10 years across 10 markets, or 20 years with 5 markets.

The more the better. But 1000 is the minimum; ideally you want several thousand instances in your backtest.
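The arithmetic behind those combinations, as a small sketch. It assumes the system fires at the same rate in every market, which may not hold in practice; both function names are hypothetical.

```python
import math

def backtest_instances(signals_per_year: int, years: int, markets: int) -> int:
    """Total signals in a backtest, assuming the same signal rate in every market."""
    return signals_per_year * years * markets

def years_needed(target_instances: int, signals_per_year: int, markets: int) -> int:
    """Whole years of history needed to reach target_instances."""
    return math.ceil(target_instances / (signals_per_year * markets))

print(backtest_instances(10, 10, 10))   # 10 years across 10 markets
print(backtest_instances(10, 20, 5))    # 20 years with 5 markets
print(years_needed(3000, 10, 10))       # history needed for 3000 instances
```

Both suggested combinations come to 1000 instances. Reaching 3000 with 10 markets would take 30 years of history per market at this signal rate.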

phattails · Sep 4, 2008

Quote from dima777:

thanks...but what if the system generates only 10 trades per year?

Should your system work in previous years (based on the fundamental idea driving the system)?

If yes, and if you have enough sample trades from past data, then test over as many years as needed.

Otherwise, look into developing synthetic data.
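"Synthetic data" can mean several things, and the post does not say which. One common reading, assumed here, is bootstrap resampling of the trade results you already have. The sketch below resamples a made-up list of trade returns with replacement. It assumes trades are independent and cannot produce market conditions that never appear in the original sample. `bootstrap_totals` and the sample returns are hypothetical.

```python
import random

def bootstrap_totals(trade_returns, n_paths, seed=0):
    """Resample trades with replacement and return the total P&L of each path."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = []
    for _ in range(n_paths):
        sample = rng.choices(trade_returns, k=len(trade_returns))
        totals.append(sum(sample))
    return totals

# Made-up results from a system with about 10 trades per year
history = [1.2, -0.8, 0.5, 2.1, -1.5, 0.3, -0.4, 1.8, -0.9, 0.7]

totals = sorted(bootstrap_totals(history, n_paths=1000))
print(f"worst resampled total: {totals[0]:.2f}")
print(f"median resampled total: {totals[len(totals) // 2]:.2f}")
print(f"best resampled total: {totals[-1]:.2f}")
```

The spread between the worst and best resampled totals gives a rough feel for how much of the historical result could be luck in the ordering and selection of trades.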