NinjaTrader_Dierk
ET Sponsor
Hi,
I'm currently reading the book Katz/McCormick "The Encyclopedia of Trading Strategies" which brought me back
to one of my main concerns:
Trying to find a strongly systematic approach to trading, I have to deal with adjusting strategies by
applying specific parameter sets. Selecting specific parameter sets may improve strategy performance, which
is called optimization. My main concern - as for many traders - is:
"How far do I have to optimize to make a strategy profitable, and when does over-optimization or
curve-fitting start?"
There are many approaches out there to solve this problem, but while reading this book an idea came to my mind:
Katz/McCormick drew my attention back to the t-test.
Scenario:
The in-sample test of a strategy in this book resulted in
- #trades: 118
- Avg profit per trade: 740.9664
- StdDev of profit: 3811.355
(comment: I prefer using avgprofit as percent value, but that's personal style ...)
The StdError ("Expected SD of Mean") is 3811.355/sqrt(118) = 350.8637. Executing a t-test yields a probability of 1.84% that
the results are just due to luck. This sounds really good, but it has to be adjusted for the number of optimization steps
that were performed to achieve this result. The results in this example were achieved by executing 20
optimization steps.
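For anyone who wants to check these numbers, here is a minimal Python sketch using only the standard library. It assumes a one-tailed, one-sample t-test of the mean profit against zero with 117 degrees of freedom (that choice is my assumption, it is simply what reproduces the 1.84%). The function names are just for illustration, and the tail probability is obtained by numerically integrating the t density rather than by a statistics package.

```python
import math

def t_pdf(x, df):
    # Density of Student's t distribution with df degrees of freedom.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def one_tailed_p(t, df, steps=2000):
    # P(T >= t) for t >= 0: 0.5 minus the integral of the density
    # from 0 to t, using composite Simpson's rule (steps must be even).
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

n_trades = 118
avg_profit = 740.9664
std_dev = 3811.355

std_err = std_dev / math.sqrt(n_trades)   # standard error of the mean
t_stat = avg_profit / std_err             # mean profit tested against zero
p = one_tailed_p(t_stat, n_trades - 1)    # assumed: one-tailed, df = n - 1

print(f"std error = {std_err:.4f}, t = {t_stat:.4f}, p = {p:.4f}")
```

The standard error comes out at about 350.86 and the probability at roughly 1.8%, in line with the figures quoted above.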
The statistical significance is adjusted: 1 - power(1-0.0184, 20) = 0.3103 = 31.03%
(Detail: Can anybody explain the concept behind the last step? It's somewhat unclear to me.)
This number is far worse than 1.84%. However, the book states that, due to serial dependencies, the real error may be smaller.
So far, all of this can be found in the book on page 63ff; unfortunately, I didn't come up with these ideas myself.
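The adjustment itself is easy to reproduce. The sketch below assumes the formula treats the 20 optimization steps as 20 independent tests, so that 1 - (1 - p)^n is the chance that at least one of them looks this good purely by luck. That reading is my assumption, and the function name is hypothetical.

```python
def adjusted_p(p_single, n_tests):
    # Probability that at least one of n_tests independent tests
    # produces a result this extreme by chance alone.
    return 1.0 - (1.0 - p_single) ** n_tests

p_adj = adjusted_p(0.0184, 20)   # 1.84% per test, 20 optimization steps
print(f"adjusted probability = {p_adj:.4f}")   # about 0.31
```

If the tests are not independent (the serial dependencies mentioned above), this would overstate the true error, which fits the book's remark.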
Now my idea:
Improving statistical significance could be the target of an optimization process. It could make sense to not only focus on
good past performance but also (or only!!) look for maximum statistical significance. This might be a way to overcome
(or at least mitigate) the problem of over-optimization and curve-fitting.
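To make the idea concrete, here is a rough sketch of ranking parameter sets by t-statistic instead of by average profit. The two trade lists are made-up toy numbers, and the names are hypothetical; a real optimizer would feed in the per-trade results of each backtest.

```python
import math
import statistics

def t_statistic(profits):
    # Mean profit divided by its standard error (test against zero).
    std_err = statistics.stdev(profits) / math.sqrt(len(profits))
    return statistics.mean(profits) / std_err

# Hypothetical per-trade profits for two parameter sets (toy data).
results = {
    "A": [5000, -4000, 6000, -3500, 4500, -2500],  # high mean, erratic
    "B": [400, 350, 500, 300, 450, 380],           # lower mean, consistent
}

best_by_mean = max(results, key=lambda k: statistics.mean(results[k]))
best_by_t = max(results, key=lambda k: t_statistic(results[k]))

print("best by average profit:", best_by_mean)
print("best by t-statistic:  ", best_by_t)
```

With these toy numbers the two criteria disagree: the average-profit ranking prefers the erratic set, the t-statistic ranking prefers the consistent one. Picking the best t over many parameter sets is still a selection, so the adjustment for the number of tests would apply here as well.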
Is there any edge in it? Has anybody tried to evaluate past performance based on this kind of t-test and then tried to trade the results on unseen data?
Has anybody implemented (and probably executed) optimizers based on this principle?
Any comments are highly appreciated.
Dierk Droth