Quote from mind:
extensive testing will inevitably subject results to curve fitting. the problem of curve fitting cannot be addressed within code alone; any solution must include the human being doing the tests as well.
example: consider you test a system with merely one indicator (assume there is a system with a single variable for simplicity's sake). you decide to run 1000 variations of the indicator. assume your output criterion is the sharpe ratio and your minimum value for that criterion is 1. so you are running simulations of the single indicator system with 1000 variations and you are looking for an equity curve with a sharpe above 1.
in order not to curve fit you do the same thing with random data. thus you create random equity curves and you find that, let's say, three out of one hundred have a sharpe ratio above 1.
now let us say you run your 1000 sims and 30 of them have a sharpe above 1. you throw them away, since 30 out of 1000 is the same 3% you got from pure randomness. with this number of tests you expect this number of hits without them having any value tradingwise.
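a minimal sketch of that random baseline, under assumptions that are not in the post: returns are gaussian noise with zero mean, the sharpe is annualized over 252 periods with a zero risk-free rate, and "in line with randomness" means within two binomial standard deviations of the expected count. the names sharpe, random_baseline and looks_random are made up for illustration.

```python
import random
import statistics

def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns (risk-free rate assumed 0)."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.fmean(returns) / sd * periods_per_year ** 0.5

def random_baseline(n_curves=200, n_days=756, threshold=1.0, seed=0):
    """Fraction of pure-noise return series whose Sharpe exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_curves):
        # Assumption: zero-mean gaussian daily returns, i.e. no edge at all.
        rets = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        if sharpe(rets) > threshold:
            hits += 1
    return hits / n_curves

def looks_random(n_hits, n_tests, baseline_rate, z=2.0):
    """True if n_hits lies within z binomial standard deviations of the expectation."""
    expected = n_tests * baseline_rate
    sd = (n_tests * baseline_rate * (1 - baseline_rate)) ** 0.5
    return abs(n_hits - expected) <= z * sd

rate = random_baseline()
print("baseline rate of sharpe > 1 on noise:", rate)
print("30 hits in 1000 tests at a 3% baseline looks random:", looks_random(30, 1000, 0.03))
```

the baseline rate you get depends heavily on the length of the series and on how the random curves are generated, so the 3% in the example should be read as a placeholder, not as something this sketch reproduces.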
now here comes the thing.
most traders move on to the next indicator and start all over again, this time finding, let's say, 27 sharpes above 1 out of 1000 tests. again they throw the results away.
assume the whole thing is repeated ... a thousand times. you see where i am heading. the outcome of 30 sharpes above 1 is itself subject to randomness. if you do the whole thing often enough you will find a purely random set that has ... 60 sharpes above 1 and is still ... purely random.
problem is that these 60 might as well be valid, but it is rather unlikely in the context of these 1000 different indicators. (btw 1000 different indicators might sound like a lot, but use the word "systems" instead and assume you are using three indicators for each one. now you only need ten indicators to get to one thousand different systems: 10 x 10 x 10, if each of the three slots can take any of the ten.)
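to get a feel for how much the hit count itself moves around, here is a rough sketch. it assumes every test is an independent coin flip with a 3% chance of a sharpe above 1, which is a simplification: variations of one indicator are positively correlated, and that would widen the spread further. hits_per_batch is a hypothetical helper name.

```python
import random

def hits_per_batch(n_batches=1000, n_tests=1000, p_random=0.03, seed=1):
    """Number of purely random 'sharpe > 1' hits in each batch of tests."""
    rng = random.Random(seed)
    return [
        # Assumption: each test is an independent draw with hit probability p_random.
        sum(rng.random() < p_random for _ in range(n_tests))
        for _ in range(n_batches)
    ]

counts = hits_per_batch()
print("lowest count :", min(counts))
print("average count:", sum(counts) / len(counts))
print("highest count:", max(counts))
```

the point is the gap between the average and the highest count: the best-looking batch out of a thousand stands well above 30 without containing anything but noise.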
then what can you do? IMO this is the crucial point of modern technical trading.
1. limit your tests. only start testing if you already have an indication that this is fertile ground. acrary used correlations to do so. or you use your trader know-how. whatever. be careful not to run a machine doing this search for you ... or you start with a possible fit in the first place.
2. use in and out of sample tests to detect the flaws of your avoidCurveFitting method. be careful not to do this too often, otherwise your in and out of sample begin to merge.
3. get tougher with the criteria with each new run. this is not easily done, though it might sound convincing. it depends IMO on how homogeneous or diversified your different tested systems are. but this is becoming philosophical.
4. test, once you have found something, on other markets. if it does well there too ... good indication it was not a fit. but remember not to do this too often. testing it against 1000 other time series ...
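one possible way to make point 3 concrete is a bonferroni-style correction: with every additional run, the hit count you demand before getting excited goes up. this is only a sketch of a standard multiple-testing idea, not acrary's method or anything stated in the post. it assumes independent tests with a 3% random hit rate, and binom_tail and required_hits are hypothetical names.

```python
import math

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with each term computed in log space."""
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log(1 - p)
        )
        total += math.exp(log_pmf)
    return total

def required_hits(n_runs, n_tests=1000, p_random=0.03, alpha=0.05):
    """Smallest hit count whose random-chance probability is at most alpha / n_runs."""
    per_run_alpha = alpha / n_runs  # Bonferroni: split alpha across all runs so far
    for k in range(n_tests + 1):
        if binom_tail(k, n_tests, p_random) <= per_run_alpha:
            return k
    return n_tests + 1  # no count is extreme enough

for runs in (1, 10, 100, 1000):
    print(runs, "runs -> need at least", required_hits(runs), "hits out of 1000")
```

the independence assumption is exactly where the "homogeneous or diversified" question bites: for closely related systems a correction like this is too strict, and how strict it should be is the part that is not easily done.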
final note.
acrary got famous on the board for his "edge test", which is nothing but a "curve fit test", if you will. i fully trust that alan was the highly successful trader he claimed to be. yet my guess is, it was not the edge test that made him successful, but the combination of him rigorously studying the market quantitatively, statistical tests and his personal judgement of the markets and their behavior (you might call it hindsight).
summing up
the problem of curve fitting IMO is that meta levels must not become optimization levels. consider you are smart and build a system that optimizes itself in a walk forward process. smart move. consider it works immediately. deal done. trade. but if it works only after 1000 different combinations of the walk forward variables ... your meta level, namely the walk forward process designed to avoid fitting, has itself shifted onto the optimization surface.
whatever ...