Quote from Wide Tailz:
That's a very interesting point. Great observation.
I have a type 1 system that was trained on about 22000 data points and has 5 degrees of freedom.
I changed some of the parameters and got a much better fit for the most recent 1700 data points, but the system was flat over everything before that. I considered just refitting the system every n data points but threw the idea away as being too delicate for the real world.
"Ultimately discretionary"
Won't know if it's the right decision until the next 1700 data points are captured.
Regarding the data-to-parameter ratio: I think since financial data contain a large amount of correlation, N data points are not really "worth" N data points -- as you point out, I think it's the independent degrees of freedom or number of principal components that matters. I agree with you that perhaps 7-8 DOF/parameters in the model is enough and beyond that it gets iffy.
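To put a rough number on "N data points are not worth N data points", here is a minimal sketch. It is not a principal component count; it swaps in a simpler proxy, the textbook effective-sample-size approximation N_eff = N * (1 - rho) / (1 + rho), which assumes the series behaves roughly like an AR(1) process with lag-1 autocorrelation rho. Real financial data will have messier dependence, so treat the result as an order-of-magnitude guide only. The function names are hypothetical.

```python
def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a list of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return 0.0  # constant series: no measurable autocorrelation
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


def effective_sample_size(xs):
    """Approximate number of independent observations in xs.

    Assumption: AR(1)-like dependence, so N_eff = N * (1 - rho) / (1 + rho).
    """
    rho = lag1_autocorr(xs)
    # Clamp rho away from +/-1 so the ratio stays finite.
    rho = max(min(rho, 0.999), -0.999)
    return len(xs) * (1 - rho) / (1 + rho)


if __name__ == "__main__":
    trending = list(range(100))  # strongly autocorrelated toy series
    print(effective_sample_size(trending))
```

On a strongly trending toy series like the one above, the estimate comes out at a small fraction of N, which is the point: the data-to-parameter ratio that matters is N_eff per parameter, not N per parameter.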
I like the rolling window training concept -- i.e., using only the recent data points for training. There is a trade-off of course between not having enough data (due to existence of large "tail" effects), vs fitting or over-fitting to irrelevant data from the distant past.
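For what it's worth, the rolling window idea (including the "refit every n data points" variant mentioned in the quote) can be sketched as below. This is only an illustration under stated assumptions: the "model" is just the mean of the training window, standing in for whatever system you actually fit, and `walk_forward`, `fit_mean`, `window`, and `refit_every` are hypothetical names.

```python
def fit_mean(train):
    """Toy 'model': the fitted parameter is the mean of the training window."""
    return sum(train) / len(train)


def walk_forward(series, window, refit_every, fit=fit_mean):
    """Forecast each point using only the `window` points before the last refit.

    The model is refit every `refit_every` steps on the most recent
    `window` observations; between refits the last fitted value is reused.
    """
    forecasts = []
    model = None
    for t in range(window, len(series)):
        if model is None or (t - window) % refit_every == 0:
            model = fit(series[t - window:t])  # train on recent data only
        forecasts.append(model)  # forecast for series[t]
    return forecasts


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    print(walk_forward(data, window=2, refit_every=1))
    print(walk_forward(data, window=2, refit_every=2))
```

The trade-off in the paragraph above shows up directly in the two knobs: a small `window` adapts quickly but may never see a tail event, while a large one drags in old regimes. Comparing out-of-sample forecast error across a few window lengths is one way to see how sensitive the system is to that choice.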
I like to come up with trading rules that have some kind of behavioral or fundamental mechanism or rationale behind them. By doing so, I think the chance of over-fitting becomes less because you are not just fitting your model against any pattern that happens to come up. Also by doing so, it is more likely that your model will remain relatively simple, and the number of parameters will naturally be smaller.