Curve-fitting or just prudent trade selection?

ssrrkk · Nov 28, 2011

Quote from Wide Tailz:

That's a very interesting point. Great observation.

I have a type 1 system that was trained on about 22000 data points and has 5 degrees of freedom.

I changed some of the parameters and got a much better fit for the most recent 1700 data points, but the system was flat at every point before that. I debated just refitting the system every n data points but threw the idea away as too delicate for the real world.

"Ultimately discretionary"

Won't know if it's the right decision until the next 1700 data points are captured.

Regarding the data-to-parameter ratio: I think since financial data contain a large amount of correlation, N data points are not really "worth" N data points -- as you point out, I think it's the independent degrees of freedom or number of principal components that matters. I agree with you that perhaps 7-8 DOF/parameters in the model is enough and beyond that it gets iffy.
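A rough way to put a number on the idea that N correlated points are not "worth" N points, as a sketch only: if the series is assumed to behave like an AR(1) process with lag-1 autocorrelation rho, the effective sample size for estimating a mean is approximately N(1 - rho)/(1 + rho). The function names are hypothetical, and real market data will not satisfy the AR(1) assumption exactly.

```python
def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return 0.0  # constant series: treat as uncorrelated
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var


def effective_sample_size(xs):
    """Approximate N_eff = N * (1 - rho) / (1 + rho), assuming AR(1) data.

    Assumption: rho is clamped to [0, 0.999] so the estimate never
    exceeds N and never divides by a value near zero.
    """
    rho = max(0.0, min(lag1_autocorr(xs), 0.999))
    return len(xs) * (1 - rho) / (1 + rho)
```

Under this approximation, a strongly trending series of 100 points counts for only a handful of independent observations, which is the sense in which the data-to-parameter ratio can be much worse than it looks.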

I like the rolling window training concept -- i.e., using only the recent data points for training. There is a trade-off of course between not having enough data (due to existence of large "tail" effects), vs fitting or over-fitting to irrelevant data from the distant past.
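The rolling-window idea can be sketched as a generator of train/test index ranges, where the model is refit on each trailing window and then judged only on the data that follows it. This is a minimal illustration, not anyone's actual system; `walk_forward_splits`, `train_size`, and `test_size` are hypothetical names.

```python
def walk_forward_splits(n_points, train_size, test_size):
    """Yield (train_range, test_range) index pairs for rolling-window refits.

    Each training window covers train_size consecutive points, and the
    test window is the test_size points immediately after it. The window
    then slides forward by test_size.
    """
    start = 0
    while start + train_size + test_size <= n_points:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


# Example usage with the sizes mentioned in the thread: about 22000
# points in total, refit on 1700 points and tested on the next 1700.
for train, test in walk_forward_splits(22000, 1700, 1700):
    pass  # fit on data[train.start:train.stop], evaluate on the test range
```

The trade-off described above shows up directly in `train_size`: a small value adapts quickly but may never see a tail event, while a large value includes old data that may no longer be relevant.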

I like to come up with trading rules that have some kind of behavioral or fundamental mechanism or rationale behind them. By doing so, I think the chance of over-fitting becomes less because you are not just fitting your model against any pattern that happens to come up. Also by doing so, it is more likely that your model will remain relatively simple, and the number of parameters will naturally be smaller.

abattia · Nov 30, 2011

Quote from ronblack:
...This blog talks about different types of curve-fitting and their impact on trading system performance...

Thanks for this link!

IMO, key to trading system design is as deep an understanding as possible of the "market mechanics" (or "market microstructure") underlying that price behaviour which makes the system succeed (or fail).

With this understanding as a guide, my view is that "parameters" can be introduced as required AS LONG AS THEY MAKE SENSE IN TERMS OF THE UNDERLYING MARKET MECHANICS, and as long as they are not added just because the optimized result is better. This is how to avoid "curve fitting".

The other point (made in the above link) about the lack of robustness of Type I systems is excellent. Thanks for sharing!

logic_man · Dec 3, 2011

Quote from ronblack:

Some should read your post over and over again. Whether curve-fitting is good or bad depends on how one does it and what the objectives, timeframes, markets, etc. are. The worst type of curve-fit is the one that moves signals and their exits on a price curve until the profit is acceptable. This must be avoided. This blog talks about different types of curve-fitting and their impact on trading system performance.

Interesting conceptual discussion at that blog post. Would have liked to see some data on distinguishing between the outcomes of his curve-fitting taxonomy, but it made intuitive sense.

logic_man · Dec 3, 2011

Quote from ssrrkk:

I like the rolling window training concept -- i.e., using only the recent data points for training. There is a trade-off of course between not having enough data (due to existence of large "tail" effects), vs fitting or over-fitting to irrelevant data from the distant past.

I like to come up with trading rules that have some kind of behavioral or fundamental mechanism or rationale behind them. By doing so, I think the chance of over-fitting becomes less because you are not just fitting your model against any pattern that happens to come up. Also by doing so, it is more likely that your model will remain relatively simple, and the number of parameters will naturally be smaller.

I recently ran into the "window" issue with some quantitative relationships which retain their ordinal structure over time (X is always greater than Y) but not their cardinal relationship, since X and Y are not static, nor do they change at the same rate. It caused me to miss a very nice trade and now I am debating this very issue of how to integrate new data into the model. I think that I will go with all historical data to determine X and Y's relationship because my hypothesis is that over time the variations smooth out, precisely because X will always be greater than Y, so there is an upper and lower bound on the relationship.

In other scenarios where there isn't an upper and lower bound on the quantitative relationship, using a window that corresponds to whatever market regime you are trying to trade would probably work better.
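The two choices discussed here, all historical data versus a recent window, differ only in where the estimation sample starts. A small sketch, assuming the quantity of interest is a simple average of some X-versus-Y measure (the helper name `window_mean` and the example values are hypothetical):

```python
def window_mean(values, end, window=None):
    """Mean of values[:end] (all history) or of the last `window` points.

    window=None uses everything up to `end` (expanding window);
    an integer uses only the most recent `window` observations.
    """
    start = 0 if window is None else max(0, end - window)
    chunk = values[start:end]
    return sum(chunk) / len(chunk)


spread = [1.0, 2.0, 3.0, 4.0]                      # made-up X-minus-Y values
all_history = window_mean(spread, len(spread))      # expanding estimate
recent_only = window_mean(spread, len(spread), 2)   # rolling estimate
```

When the relationship is bounded, the expanding estimate should settle down as history accumulates; when it is not, the rolling estimate tracks the current regime at the cost of more noise.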

Kind of thinking out loud here on this comment.

Also, agree 100% on the notion that your rules need to reflect something you can intuitively understand. When I explain my rules to someone, they immediately get why those rules work, even if they are not traders. My model has 5 easily-explained parameters, only 2 of which are really core, while the other 3 are optimizations that take the strategy from a profit factor of around 1.75 to a profit factor of over 3.5. I started with only the 2 and added the other 3 over time, after I began trading the 2-factor strategy in real-time and saw outcomes repeat themselves based on the values of those parameters.
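For readers unfamiliar with the term, profit factor is conventionally gross profit divided by gross loss over a set of trades. A minimal sketch, with made-up trade results rather than anything from the strategy described above:

```python
def profit_factor(trade_pnls):
    """Gross profit divided by gross loss for a list of per-trade P&L values."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        # No losing trades: the ratio is undefined or unbounded.
        return float("inf") if gross_profit > 0 else float("nan")
    return gross_profit / gross_loss


# Illustrative numbers only: 350 won, 100 lost.
example = profit_factor([200, 150, -100])  # 3.5
```

A jump in profit factor after adding parameters is exactly the kind of result worth checking on data the parameters were not tuned on, since the ratio is sensitive to a few large trades.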
