Quote from mdl060374:
Just as a general question, and I don't want this to get away from ACD: what is a decent sample size for looking at statistical intraday behavior?
I am thinking 100 trading days (5 months of trading)? That way it isn't so old that it reflects old market environments, but not so small that you aren't getting a decent sample...
This question really doesn't belong in this thread, but since you ask, I will write down 3 guidelines, plus my personal approach, for proper edge development and for avoiding getting fooled by randomness:
1) First and foremost, there must be a plausible proposition to explain why the edge exists. If you can't explain it in words, the numbers don't mean anything. Keep researching, but don't go live without that understanding.
2) Back-test on at least 100 trades before you start having 'any' trust in the edge.
3) Test on at least 3 years of historical data, because many times an edge works wonderfully in 2011 but performs mediocrely in 2010 and has a 40% DD in 2009. A longer history lets you see different market conditions. Otherwise, when you are in a DD during actual trading, you won't know whether to press on with the edge or to stop trading it. It's a real problem, believe me.
4) The way I approach it is really unorthodox compared to conventional wisdom. I start researching edges by looking at data for the last 1 year, because I want to explore what is currently working in the market. Then I optimize it, which results in a fantastic system. Next, I take this system and apply it to the last 8-10 years of data. Obviously the performance zig-zags: maybe profitable, but with large DD. At this stage, I try to 'explain/correlate' the historical edge performance with the kind of market we were in at that time: high vol/low vol, panic, gradual up-run of equity indices, etc. This process helps me understand the subtleties of the edge. Now I build a system that will work for the whole time period from 2001 to 2011. And because of my increased understanding of the edge, I can modify it slightly in real time going forward to match the current environment.
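For anyone who wants to check guidelines 2 and 3 mechanically, here is a rough Python sketch. It assumes your back-test produces a date-sorted list of (exit date, fractional return) pairs; the function names, that trade format, and the compounding of returns are all my own assumptions for illustration, not part of any particular platform.

```python
from collections import defaultdict
from datetime import date


def max_drawdown(returns):
    """Largest peak-to-trough drop of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def enough_sample(trades, min_trades=100, min_years=3):
    """Guidelines 2 and 3: enough trades, spread over enough calendar time.

    trades: list of (exit_date, fractional_return), sorted by date (assumed format).
    """
    if not trades:
        return False
    span_days = (trades[-1][0] - trades[0][0]).days
    return len(trades) >= min_trades and span_days >= min_years * 365


def yearly_report(trades):
    """Per-year trade count, compounded return and max drawdown."""
    by_year = defaultdict(list)
    for exit_date, ret in trades:
        by_year[exit_date.year].append(ret)

    report = {}
    for year in sorted(by_year):
        rets = by_year[year]
        total = 1.0
        for r in rets:
            total *= 1.0 + r
        report[year] = {
            "trades": len(rets),
            "return": total - 1.0,
            "max_dd": max_drawdown(rets),
        }
    return report
```

Note that the drawdown here is measured inside each calendar year separately, so a DD that straddles a year boundary will look smaller than it really was. Run `max_drawdown` over the full return list as well if that matters to you.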
Long post, but if you read it and absorb it, I have given you a very robust method to guard against being fooled by randomness while testing your edges/ideas.
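The 'explain/correlate' step in point 4 can also be roughed out in code. This is only a sketch under my own assumptions: each trade is tagged with some volatility reading at entry (a VIX-style level, say), and a single threshold of 20 splits low vol from high vol. Both the tagging and the threshold are hypothetical choices, and a real study would use whatever regime labels fit your market.

```python
from statistics import mean


def by_regime(trades, vol_threshold=20.0):
    """Compare edge performance in low-vol vs high-vol conditions.

    trades: list of (fractional_return, vol_at_entry) pairs (assumed format).
    vol_threshold: hypothetical cut-off separating the two regimes.
    """
    buckets = {"low_vol": [], "high_vol": []}
    for ret, vol in trades:
        key = "high_vol" if vol >= vol_threshold else "low_vol"
        buckets[key].append(ret)

    summary = {}
    for name, rets in buckets.items():
        if not rets:
            summary[name] = None  # no trades fell into this regime
            continue
        summary[name] = {
            "trades": len(rets),
            "avg_return": mean(rets),
            "win_rate": sum(r > 0 for r in rets) / len(rets),
        }
    return summary
```

If the average return or win rate differs sharply between the two buckets, that is a hint about when the edge works, which you then still have to explain in words, as in guideline 1.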
