Hello, I am currently doing research into generating profitable strategies using an ML system. I was hoping to start a discussion here on ET to help advance this idea. Research into directional trading is something fairly new to me, so the insights from traders who currently trade profitable directional strategies would be valuable while refining this idea.
The basic idea behind the system is that I provide the algorithm with 'a grammar' which it can use to describe trading strategies. The system then generates strategies and tests them against 'a standard' which I have given it, so that it can tell good strategies from bad ones.
There are three main hurdles to this approach I believe:
1) The grammar needs to be powerful enough to describe at least one profitable directional strategy which is being successfully traded by system traders today (if such strategies exist). The grammar also needs to be small enough, since the size of the solution space increases exponentially with each addition to the grammar.
2) The standard needs to be strict enough to weed out most overfit strategies without reducing the probability of finding a strategy which meets the standard to 0%.
3) I am unfamiliar with the temporal considerations of directional systems. Does it make sense that a single system can be consistently profitable over a significant period of time (years)? Or does the market change in unexpected ways?
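To put a rough number on the first hurdle, here is a small sketch that counts how many syntactically distinct expressions a grammar can produce up to a given nesting depth. The function name and the split into terminals, unary operators and binary operators are my own simplification; static numbers and window lengths are ignored, so the real space is larger still.

```python
def count_expressions(terminals: int, unary_ops: int, binary_ops: int, max_depth: int) -> int:
    """Count syntactically distinct expression trees of depth <= max_depth.

    Simplifying assumption: every operator accepts any sub-expression,
    and numeric parameters (constants, window lengths) are not counted.
    """
    count = terminals  # depth 0: a bare terminal such as Close or Volume
    for _ in range(max_depth):
        # A tree is a terminal, a unary op over one subtree,
        # or a binary op over two subtrees of the previous depth.
        count = terminals + unary_ops * count + binary_ops * count * count
    return count


# 5 terminals (OHLCV), 1 unary op (abs), 4 binary ops (+ - * /), depth 2
print(count_expressions(5, 1, 4, 2))  # 48515
# Same grammar with a single extra binary operator
print(count_expressions(5, 1, 5, 2))  # 91265
```

Under these assumptions one extra binary operator nearly doubles the count at depth 2, and each additional level of nesting roughly squares it.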
Another open question is whether to generate purely long strategies, which I am currently doing to keep the generated strategies simple, or strategies which trade both long and short, which might make more sense. It seems dubious to me that a single long-only strategy can remain consistently profitable over many years with drastic changes in market conditions.
Here is a little more information about my current test setup:
The Grammar: I have defined a basic set of maths which the algorithm can use to describe a strategy. It is important to note that the language generated by this grammar is recursive up to a limit. A simple example of part of a strategy, which squares the change in price over the past 10 min, would be
DATA X = Current ES Close Price
DATA Y = Sample ES Close Price, Offset 10min, Window Length 1 min
DATA Z = FEED X - FEED Y
DATA W = FEED Z * FEED Z
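For anyone who finds code easier to read, the four feeds above could be evaluated roughly like this. It is only a sketch: it assumes 1 min bars, so that "Offset 10min, Window Length 1 min" means the close from 10 bars ago, and the function name is made up for illustration.

```python
from typing import List, Optional


def squared_change(closes: List[float], offset_bars: int = 10) -> List[Optional[float]]:
    """Evaluate feed W for every bar; None where there is not enough history."""
    result: List[Optional[float]] = []
    for i, x in enumerate(closes):      # X = current close
        if i < offset_bars:
            result.append(None)         # Y is not available yet
            continue
        y = closes[i - offset_bars]     # Y = close 10 bars (10 min) ago
        z = x - y                       # Z = X - Y
        result.append(z * z)            # W = Z * Z
    return result


closes = [1200.0] * 10 + [1202.0]
print(squared_change(closes)[-1])  # 4.0
```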
currently included:
Market Data, Open High Low Close Volume
Static Numbers
Maximum/Minimum over a window of time
Sample Average over a window of time
Absolute value
Addition, Subtraction, Multiplication, Division
currently excluded:
Least Squares Slope
Log(x), Exp(x), Pow(x,y)
Max, Min between two pieces of data
Current PL, uPL
Current Time of day in seconds
If statements
Variable Numbers which can be set/incremented/modified based on If statement triggers (to count peaks etc)
opening positions:
A data source range is within or outside (a,b)
closing positions:
Time limit
Take profit
Stop loss
Trail stop
A data source range is within or outside (a,b)
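To make "recursive to a limit" concrete, here is a minimal sketch of how data sources could be generated at random from the included operators, plus the range test used for opening and closing. The tuple representation, the 30% early-stop probability, the constant range and all names are assumptions for illustration, not my actual implementation.

```python
import random
from typing import Tuple, Union

Expr = Union[str, float, Tuple]

FIELDS = ["open", "high", "low", "close", "volume"]
BINARY = ["+", "-", "*", "/"]
WINDOWED = ["max", "min", "avg"]  # evaluated over a window of N minutes


def random_expr(rng: random.Random, depth: int) -> Expr:
    """Build a random expression tree that is never deeper than `depth`."""
    if depth == 0 or rng.random() < 0.3:
        # Terminal: a market data field or a static number
        if rng.random() < 0.8:
            return rng.choice(FIELDS)
        return round(rng.uniform(-10.0, 10.0), 2)
    kind = rng.choice(["binary", "abs", "window"])
    if kind == "binary":
        return (rng.choice(BINARY), random_expr(rng, depth - 1), random_expr(rng, depth - 1))
    if kind == "abs":
        return ("abs", random_expr(rng, depth - 1))
    return (rng.choice(WINDOWED), random_expr(rng, depth - 1), rng.randint(1, 60))


def expr_depth(expr: Expr) -> int:
    """Nesting depth of an expression; terminals and parameters count as 0."""
    if not isinstance(expr, tuple):
        return 0
    return 1 + max(expr_depth(child) for child in expr[1:])


def range_condition(value: float, a: float, b: float, inside: bool = True) -> bool:
    """True when value is within (a, b), or outside it if inside=False."""
    within = a < value < b
    return within if inside else not within


rng = random.Random(42)
print(random_expr(rng, depth=3))
```

The depth argument is what bounds the recursion; the 1-30 data source limit would sit on top of this as a cap on how many such expressions one strategy may hold.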
I am currently limiting the system to creating between 1-30 different data sources based on the above, and using between 1-5 opening and closing conditions per strategy. Does anyone have any opinions on my current grammar? Are there operators missing which should be included, or vice versa? I would love to hear from someone who runs a successful system to see if this grammar could actually be used to describe theirs (a simple yes or no answer is fine).
The Standard: The standard is used to quickly weed out useless strategies and to stop overfit strategies from reaching the end of the data. The standard is also in place to help choose the type of strategy generated. This way I can ask myself: if I generate a strategy which satisfies standards A, B and C over 4 years of back data, performs similarly in forward testing, and has a reasonable equity curve, would I be comfortable going live with it? Current test setup is
Max Draw Down < 500$
Given a sliding period of the past 7 days
Min Net Profit over period > 1$
Min # Trades over period > 10
Min Win Ratio over period > 0.6
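As a sketch, the check could look like the following. It makes several assumptions which are not pinned down above: the drawdown limit applies to the whole history rather than per window, days are plain integer indices, a win is any trade with positive P&L, and trades are supplied in chronological order. The names are illustrative only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trade:
    day: int    # index of the day on which the trade closed
    pnl: float  # realized profit or loss in dollars


def max_drawdown(trades: List[Trade]) -> float:
    """Largest peak-to-trough drop of the cumulative P&L curve."""
    equity = peak = worst = 0.0
    for t in trades:
        equity += t.pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def window_ok(trades: List[Trade], min_profit: float = 1.0,
              min_trades: int = 10, min_win_ratio: float = 0.6) -> bool:
    """Apply the three per-period thresholds to one window of trades."""
    if len(trades) <= min_trades:
        return False
    wins = sum(1 for t in trades if t.pnl > 0)
    net = sum(t.pnl for t in trades)
    return net > min_profit and wins / len(trades) > min_win_ratio


def meets_standard(trades: List[Trade], last_day: int,
                   window: int = 7, max_dd: float = 500.0) -> bool:
    """Check the drawdown limit, then every full sliding window of days."""
    if max_drawdown(trades) >= max_dd:
        return False
    for end in range(window - 1, last_day + 1):
        recent = [t for t in trades if end - window < t.day <= end]
        if not window_ok(recent):
            return False
    return True
```

Because a strategy is rejected at the first failing window, this also doubles as the early pruning step: most candidates never get evaluated past the first few weeks of data.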
Having no background in directional trading, I have no idea how realistic this standard is. It is entirely possible that it is far too optimistic and that, regardless of how powerful the grammar is, there doesn't exist a single strategy which can meet this standard over the whole 4 years' worth of data. Any thoughts?
Data & Strategy Recording: Right now a strategy is only recorded for future analysis if it is either the strategy which has progressed the furthest into the 4-year data set or it has reached the end of the data set. I am currently testing the learning system on 2004-2007 ES OHLC + Volume 1 min bars.
Thanks in advance to anyone who joins in on this discussion. My main concerns right now are whether my standard is realistic, whether my grammar is powerful enough to describe existing profitable strategies, and the time required to process enough strategies to find a 'good' one. Feel free to ask any questions if my above descriptions were unclear or confusing.
I am worried that my approach may be too broad and will not produce a result in a reasonable time. A more focused approach concentrating on correlations, like you described, might be a better use of my time. I'd still like to discuss the intricacies of the general approach as a means of increasing my understanding of the problem that is 'directional system trading using technical indicators'. That is: what maths are required to describe a successful strategy? What is a reasonable expectation to put on the performance of a successful strategy? Can a system which works between 2004-2007 work in 2008?