Hi all,
Maybe I should provide a little perspective to clarify what is going on. It might be helpful to people who like to quantify their trading as much as possible.
Three years ago I decided to trade only on statistics, because I didn't think I could beat the market on gut feeling and my limited experience (I started trading in 1999). So I downloaded end-of-day data from Yahoo. My specialty is computer science, so I had no trouble building my own backtesting platform to test all kinds of strategies (searching for an EDGE). Then I found one that looked really promising. I traded with it, only to find out that the open, high, low, and close data were extremely unreliable. I needed prices that were actually tradable, with at least a certain volume behind them, and I knew I needed higher-resolution data to compute that. So I looked for 1-minute data. The closest source I found back then was asking several thousand dollars, so I decided to download the data myself.
By 09/2007 I had about 6 months of data on hand and finally had a system that was realistic to trade with. I stuck with that system until 03/2008, when I deployed a new one. The system performed slightly worse than the backtested results, but far better than my trading record before it.
Then I decided to take care of my own 401K. I rolled it over directly (about $50,720.48) in 07/2008. As of yesterday it was $106,001.62. I just ran an activity flex from Interactive Brokers and the result is attached for your information.
Since I mainly day trade, I don't even use data more than 9 months old.
Trading is extremely simple and boring if:
1. You developed a system that is fully based on numbers and has a decisive EDGE
2. You use good money management (KELLY)
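For point 2, here is a minimal sketch of the textbook Kelly fraction for a simple win/lose bet: f = p - (1 - p) / b, where p is the probability of a winning trade and b is the ratio of average win to average loss. The function name and the example numbers are made up for illustration, and a real system would estimate p and b from backtests and would likely size well below the full Kelly fraction.

```python
def kelly_fraction(win_prob: float, win_loss_ratio: float) -> float:
    """Fraction of capital to risk per trade under the basic Kelly formula.

    win_prob       -- estimated probability that a trade wins (0..1)
    win_loss_ratio -- average win divided by average loss (must be > 0)
    """
    if not 0.0 <= win_prob <= 1.0:
        raise ValueError("win_prob must be between 0 and 1")
    if win_loss_ratio <= 0.0:
        raise ValueError("win_loss_ratio must be positive")
    fraction = win_prob - (1.0 - win_prob) / win_loss_ratio
    # A negative result means there is no edge, so risk nothing.
    return max(fraction, 0.0)


if __name__ == "__main__":
    # Hypothetical numbers: 55% winners, average win 1.2x the average loss.
    print(kelly_fraction(0.55, 1.2))
```

Note that the formula returns zero whenever the edge is not positive, which is the numeric version of point 1: without an edge, no position size is the right one.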
You do need data to backtest things. I would say my data should be sufficient if you are developing a system for trading stocks. However, this is NOT to say the data are complete and perfect. In fact, I missed several days during the two-year period. My suggestion to those who want to stay on the quantitative track: focus on strategy development and backtests, and make your system tolerant of 'bad' and 'incomplete' data. The messier the data your system can handle, the easier things are when it goes live.
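As a rough illustration of what tolerating 'bad' and 'incomplete' data can look like, the sketch below drops 1-minute bars that fail basic sanity checks and reports where bars are missing. The Bar layout, the function names, and the min_volume threshold are all assumptions for the example, not the format of my data.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Bar(NamedTuple):
    # Hypothetical 1-minute bar layout, for illustration only.
    time: datetime
    open: float
    high: float
    low: float
    close: float
    volume: int


def is_sane(bar: Bar, min_volume: int = 1) -> bool:
    """Basic checks: positive prices, consistent OHLC, enough volume."""
    prices = (bar.open, bar.high, bar.low, bar.close)
    return (
        all(p > 0 for p in prices)  # also rejects NaN prices
        and bar.high >= max(bar.open, bar.close)
        and bar.low <= min(bar.open, bar.close)
        and bar.volume >= min_volume
    )


def clean_bars(bars, min_volume: int = 1):
    """Return (good bars sorted by time, list of (before, after) gap times)."""
    good = sorted(
        (b for b in bars if is_sane(b, min_volume)), key=lambda b: b.time
    )
    # Any jump of more than one minute counts as a gap. This includes
    # overnight breaks, which the caller may want to filter out.
    gaps = [
        (prev.time, cur.time)
        for prev, cur in zip(good, good[1:])
        if cur.time - prev.time > timedelta(minutes=1)
    ]
    return good, gaps
```

The strategy code can then decide what to do about each gap (skip the signal, widen the window, and so on) instead of crashing or silently trading on a bad print.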
Now, since I have the data on my hard drive, why not try my luck and see if I can get $100 of risk-free money ($10 for media and shipping)?