Frequency of Re-optimization

bwolinsky · Aug 8, 2013

Quote from Sergio77:

Slope based on what scale? By zooming on the chart the slope changes.

No, it doesn't. It's a mathematical calculation that won't change:

Linregslope
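To make the point concrete, here is a minimal sketch of a least-squares slope, assuming the regression is run on closing prices against the bar index. The name `linreg_slope` and the sample closes are hypothetical and are not any platform's built-in function. The result is in price units per bar and depends only on the data passed in, not on how the chart is zoomed. It will change if you change the lookback length or the bar size.

```python
def linreg_slope(values):
    """Least-squares slope of values regressed on their bar index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a slope")
    mean_x = (n - 1) / 2              # mean of the indices 0..n-1
    mean_y = sum(values) / n
    # Covariance of index and value, and variance of the index (both unnormalized).
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


# Hypothetical closes rising by exactly one point per bar.
closes = [100.0, 101.0, 102.0, 103.0, 104.0]
slope = linreg_slope(closes)          # price units per bar
```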

sidm · Aug 8, 2013

Quote from jack hershey:

Strike 1. Data for monitoring and analysis is limited to the width of the window in which each trend resides.

Anyone who violates this principle is fucked.

Strike 2. Another aspect of science and mathematics that is important is that there are independent and dependent variables. By fourth grade most people learn that the behavior of the dependent variable is a function of the independent variable.

Strike 3. In any seeking or problem-solving adventure, you have to use the correct kinds of math to solve the problem (or even see the opportunity).

Re-read the thread to see who is striking out. Batting .300 for a potential trader is striking out. Trading is not baseball or base running.

I don't really see how the above points are connected to the question I asked, but I will bite anyway. So here are my thoughts.

Data used for analysis/optimization should obviously be limited to what is observable in the PAST, since that is all we can really see. That is not the issue being discussed here though. Let us assume for the sake of discussion that none of us makes the "duh" mistake of using future data points to judge the present trend while optimizing.

As far as independent/dependent variables go, presence of a trend implies that there is some sort of auto-correlation in the prices. Hence tomorrow's price is the dependent variable and prices in the past become the independent variable.
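One way to check for that kind of auto-correlation is to estimate it at lag 1. The following is a minimal sketch, computed on simple returns rather than raw price levels, since price levels are strongly autocorrelated almost by construction. The names `simple_returns` and `lag1_autocorr` and the sample prices are hypothetical and not from this thread.

```python
def simple_returns(prices):
    """Bar-to-bar simple returns: p[t] / p[t-1] - 1."""
    return [prices[i] / prices[i - 1] - 1 for i in range(1, len(prices))]


def lag1_autocorr(series):
    """Sample lag-1 autocorrelation: how strongly each value tracks the previous one."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(series) / n
    num = sum((series[i] - mean) * (series[i - 1] - mean) for i in range(1, n))
    den = sum((x - mean) ** 2 for x in series)
    if den == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return num / den


# Hypothetical closes; a clearly positive value would suggest trending returns,
# a clearly negative value would suggest mean reversion.
closes = [100.0, 101.0, 103.0, 102.0, 104.0, 107.0, 106.0, 108.0]
rho = lag1_autocorr(simple_returns(closes))
```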

The question of "appropriate math" is interesting. Here is my take.

(1) The standard mathematical models based on Normal or similar distributions don't work.
(2) Since there is so much randomness and noise in the markets, any strategy must inherently be simple. Complexity makes systems fragile and susceptible to small movements. Hence my aversion to complicated indicators and attraction to simple heuristics.
(3) While probability distributions don't work, there is one concept that has been shown to hold much promise in empirical research: Fractals. Due to the fractal nature of prices, no matter what time scale you zoom into, the basic pattern doesn't change. This tells me that money can be made at all time scales (seconds, minutes, hours, days, weeks, months).

Quite a digression from the original question of re-optimization, but interesting nevertheless.

sidm · Aug 8, 2013

Quote from kut2k2:

You ask an excellent question.

Offhand I would say you find a backtest length, say N, that gives you a satisfactory performance and then reoptimize every N/2 data points after that for N data points.

So the optimization would be from point 1 to point N, the first reoptimization would be from point (N/2 + 1) to point 3N/2, the second reoptimization would be from point N+1 to point 2N, and so on.

There's probably a better way to do it and I hope some other member will post it.

Interesting. Any particular reason why you chose an interval of N/2?
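The window schedule kut2k2 describes can be sketched as follows. This is a minimal illustration, assuming 1-based inclusive point indices and an even N; `reoptimization_windows` and its `step` parameter are hypothetical names, with `step` defaulting to N/2 so that shorter intervals can be tried as well.

```python
def reoptimization_windows(total_points, n, step=None):
    """Return 1-based inclusive (start, end) windows of length n, advanced by step."""
    if step is None:
        step = n // 2                 # default: reoptimize every N/2 points
    if n < 1 or step < 1:
        raise ValueError("n and step must be positive")
    windows = []
    start = 1
    # Only emit windows that fit entirely inside the available data.
    while start + n - 1 <= total_points:
        windows.append((start, start + n - 1))
        start += step
    return windows


# With N = 100 and 200 data points:
windows = reoptimization_windows(total_points=200, n=100)
# [(1, 100), (51, 150), (101, 200)]
```

Each window overlaps the previous one by N minus the step, so a smaller step means more frequent reoptimization on mostly the same data.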

jack hershey · Aug 9, 2013

Quote from sidm:

I don't really see how the above points are connected to the question I asked, but I will bite anyway. So here are my thoughts.

Data used for analysis/optimization should obviously be limited to what is observable in the PAST, since that is all we can really see. That is not the issue being discussed here though. Let us assume for the sake of discussion that none of us makes the "duh" mistake of using future data points to judge the present trend while optimizing.

As far as independent/dependent variables go, presence of a trend implies that there is some sort of auto-correlation in the prices. Hence tomorrow's price is the dependent variable and prices in the past become the independent variable.

As you study math and markets you will find price is ALWAYS the dependent variable.

The question of "appropriate math" is interesting. Here is my take.

(1) The standard mathematical models based on Normal or similar distributions don't work.
(2) Since there is so much randomness and noise in the markets, any strategy must inherently be simple. Complexity makes systems fragile and susceptible to small movements. Hence my aversion to complicated indicators and attraction to simple heuristics.
(3) While probability distributions don't work, there is one concept that has been shown to hold much promise in empirical research: Fractals. Due to the fractal nature of prices, no matter what time scale you zoom into, the basic pattern doesn't change. This tells me that money can be made at all time scales (seconds, minutes, hours, days, weeks, months, etc.).

I empathize with you. Market variables are granular and only Boolean algebra works for defining the system of the market's operation. Apply algebra to the interlocking fractals of the market.

Quite a digression from the original question of re-optimization, but interesting nevertheless.

kut2k2 · Aug 9, 2013

Quote from sidm:

Interesting. Any particular reason why you chose an interval of N/2?

That would be the longest interval. You may be better off using something shorter.

bwolinsky · Aug 9, 2013

Quote from kut2k2:

That would be the longest interval. You may be better off using something shorter.

Like (n-1)/2? would this be your sample vs population insight?

kut2k2 · Aug 9, 2013

Quote from bwolinsky:

Like (n-1)/2? would this be your sample vs population insight?

Would this be you showing childish pique over not finding a <s>sucker</s>buyer for your $8.5M <s>boondoggle</s>trading system?

bwolinsky · Aug 9, 2013

Quote from kut2k2:

Would this be you showing childish pique over not finding a <s>sucker</s>buyer for your $8.5M <s>boondoggle</s>trading system?

I could only make $500 a month on 2 NQ $12 k accounts forever and unlimited aum fees on Covestor as a top ten outperformer of the s&p in the last 365 days by 2,840 basis points.

Right now it would be third in the World Cup Championship of futures trading and pretty close to making my family worth another comma.

Sergio77 · Aug 10, 2013

Quote from bwolinsky:

No, it doesn't. It's a mathematical calculation that won't change:

Linregslope

You mean that slope is an invariant?

ByronAYoung · Feb 23, 2014

I like this question. I've been running my optimization constantly (even now), but I've been seeing a lot of over-specialization. At the same time, I feel that optimizing less often wouldn't eliminate the problem; it would just over-specialize for a different time range. kut2k2's suggestion seems reasonable, but when using large data sets (I'm using three years), it would be hard to justify optimizing only every 1.5 years.

I've found that the initial conditions of the scheme can change the behavior dramatically, so perhaps optimizing for a variety of start dates, initial investments, global stock market environments... might remove the tendency of your optimization to overfit the data.

I've decided that, instead of optimizing for a single 3-year data set, I'll split the data into 6 intervals that are each run with the same initial investment. That way, if any overfitting does occur, it will hopefully be averaged out by the other intervals. You could also potentially ignore the best and worst performing individuals.
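A rough sketch of that idea, assuming contiguous intervals and one score per interval for a given parameter set. All names here are hypothetical: `total_return` is only a stand-in for a real backtest, and dropping the single best and worst interval scores is one possible reading of "ignore the best and worst".

```python
def split_into_intervals(data, k=6):
    """Split data into k contiguous, nearly equal-sized intervals."""
    if k < 1 or len(data) < k:
        raise ValueError("need at least one data point per interval")
    base, extra = divmod(len(data), k)
    intervals, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)   # spread the remainder over the first intervals
        intervals.append(data[start:start + size])
        start += size
    return intervals


def trimmed_mean_score(scores, drop_extremes=True):
    """Average the interval scores, optionally dropping the single best and worst."""
    if drop_extremes and len(scores) > 2:
        scores = sorted(scores)[1:-1]
    return sum(scores) / len(scores)


def total_return(prices):
    """Stand-in for a backtest: buy-and-hold return over one interval."""
    return prices[-1] / prices[0] - 1


prices = [100.0 + i for i in range(60)]          # hypothetical price series
chunks = split_into_intervals(prices, k=6)
scores = [total_return(chunk) for chunk in chunks]
robust_score = trimmed_mean_score(scores)
```

The optimizer would then rank parameter sets by `robust_score` instead of by performance over the whole three years, so a set that only shines in one interval is penalized.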

Another potential workaround is to ignore the tweaking done as the optimizer overfits the data. When my optimizer overfits, it tends to simply tweak an algorithm's parameters to buy a stock before an increase, as opposed to creating an entirely different set of parameters. If you run the optimizer constantly, the parameters you use in practice could be, for example, the average of each day's optimum parameters over the previous month.
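The averaging step could look something like this. It is a minimal sketch, assuming each day's optimum is stored as a dictionary of numeric parameters and that a month is roughly 21 trading days; `average_recent_parameters`, the parameter names, and the sample values are hypothetical.

```python
def average_recent_parameters(daily_optima, lookback=21):
    """Average each parameter over the most recent `lookback` daily optima."""
    recent = daily_optima[-lookback:]
    if not recent:
        raise ValueError("no daily optima to average")
    names = recent[0].keys()
    # Assumes every day's dictionary carries the same parameter names.
    return {name: sum(day[name] for day in recent) / len(recent) for name in names}


# Hypothetical daily optimizer output, oldest first.
daily_optima = [
    {"fast": 10, "slow": 30},
    {"fast": 12, "slow": 34},
    {"fast": 11, "slow": 32},
]
live_params = average_recent_parameters(daily_optima, lookback=21)
```

Averaged values will generally not be integers, so parameters such as lookback lengths would need rounding before use.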

Anyway those are a few ideas I'm working on. Not sure I'm explaining this well though.