I've been reading up on trading systems and optimization, and the one thing that still confuses me is the apparent danger of curve fitting.
I understand the concept: you can over-optimize a system to the point where it works great across an entire range of historical data but then fails in real life, because the parameters were tuned to the specific data you ran against.
I'm trying to gain a bit more insight into why this is a problem, though. At the heart of it, all optimization is some type of curve fitting, so isn't that risk always inherent?
I'm guessing curve fitting is at its worst when you take all your backtesting data and optimize the system to produce the best results on it, leaving you with no further data to test against. The problem with testing against all your data is that you could have a couple of months with tremendous returns and really bad returns the rest of the time, but the bad performance would be hidden because you are optimizing for the end result (i.e. the total amount you've made) instead of breaking up the data or optimizing on a smaller timeframe.
If you optimize against a subset of the data and then test against the rest of your backtest data, is that what is known as walking forward?
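To make the split I'm describing concrete, here is a minimal sketch in Python. The toy moving-average rule, the names `strategy_return` and `optimize_then_test`, the 70/30 split, and the random-walk prices are all made up for illustration; none of it comes from a real system or library.

```python
import random

def strategy_return(prices, lookback):
    """Summed one-step returns of a toy rule: hold for one step whenever
    the price is above its trailing `lookback`-step average."""
    total = 0.0
    for t in range(lookback, len(prices) - 1):
        avg = sum(prices[t - lookback:t]) / lookback
        if prices[t] > avg:
            total += prices[t + 1] / prices[t] - 1.0
    return total

def optimize_then_test(prices, lookbacks, train_fraction=0.7):
    """Pick the best lookback on the first part of the data only,
    then score that single choice on the part it never saw."""
    split = int(len(prices) * train_fraction)
    in_sample, out_of_sample = prices[:split], prices[split:]
    best = max(lookbacks, key=lambda lb: strategy_return(in_sample, lb))
    return (best,
            strategy_return(in_sample, best),
            strategy_return(out_of_sample, best))

# Hypothetical data: a seeded random walk standing in for price history.
random.seed(0)
prices = [100.0]
for _ in range(499):
    prices.append(prices[-1] * (1.0 + random.gauss(0.0, 0.01)))

best, in_ret, out_ret = optimize_then_test(prices, [5, 10, 20, 50])
print("best lookback:", best)
print("in-sample return:", round(in_ret, 4))
print("out-of-sample return:", round(out_ret, 4))
```

My assumption is that an out-of-sample number much worse than the in-sample one would suggest the chosen lookback was fitted to noise rather than to anything that persists.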
The way I test my systems is to run them against all my backtest data, break the results down on a per-month or per-week basis, throw out the particularly good months, and evaluate the performance of what remains. Is what I'm doing in essence the same as "walking forward"? Or are there other inherent problems with my methodology?
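In case it helps, this is roughly what my current evaluation looks like, as a simplified sketch. The function name `evaluate_without_best`, the `drop` parameter, and the monthly returns are hypothetical placeholders, not real results.

```python
def evaluate_without_best(period_returns, drop=2):
    """Summarize per-period returns after discarding the `drop` best periods."""
    ranked = sorted(period_returns, reverse=True)
    kept = ranked[drop:]  # everything except the top `drop` periods
    return {
        "total_all": sum(period_returns),
        "total_without_best": sum(kept),
        "losing_periods": sum(1 for r in kept if r < 0),
        "periods_kept": len(kept),
    }

# Made-up monthly returns: two great months and a lot of small losses.
monthly = [0.30, -0.02, -0.03, 0.25, -0.01, -0.04,
           0.01, -0.02, -0.03, 0.00, -0.01, -0.02]

report = evaluate_without_best(monthly, drop=2)
print(report)
```

With numbers like these the overall total is positive while the total without the two best months is negative, which is exactly the kind of hidden weakness I'm trying to catch.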