Fast Brute-Force Optimization Software

quantcode · Nov 27, 2017

nonlinear5 said:
I use my own optimization software, capable of processing about 500 million bars per second (on a 6-core Intel i7 processor machine).

500 million sounds very nice. The question is also how much logic is computed per data point, and that depends on the individual strategy.
I think iterating over simple arrays (in C or C++) could be the fastest method. How do you do it?

nonlinear5 · Nov 27, 2017

quantcode said:
500 million sounds very nice. The question is also how much logic is computed per data point, and that depends on the individual strategy.
I think iterating over simple arrays (in C or C++) could be the fastest method. How do you do it?

My optimization software is written in Java. It took me a number of years of refactoring it in various ways to squeeze out all that processing speed. There is nothing particularly magic about it. Just standard software engineering practices:

-- efficient use of data structures (maps, queues, sets, lists)
-- eliminating the processing bottlenecks (with the use of a CPU profiler)
-- engaging all CPU cores to the full capacity, with good use of multi-threading
-- ensuring that there is no disk I/O (beyond the initial loading of the data set)
-- keeping it compute-bound rather than memory-bound as much as possible
-- caching everything that can be cached
-- eliminating the unnecessary repetitions
-- identifying what's computationally expensive, and refactoring it
-- simulating GPU on CPU (think of running a chunk of tasks all at once, rather than one at a time)
-- minimizing the memory footprint, the scope, and the immutability of objects
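
As a rough illustration of a few of the practices listed above (data held in a primitive array, no disk I/O after loading, all cores busy, rolling sums instead of repeated window scans), a minimal Java sketch might look like the following. It is not the code behind the numbers in this post: the toy moving-average crossover strategy, the class name ParallelSweep, and the methods backtest and sweep are all hypothetical, chosen only to keep the example self-contained.

```java
import java.util.Random;
import java.util.stream.IntStream;

// Hypothetical sketch: brute-force sweep of a toy two-parameter strategy
// over an in-memory price array, one parameter combination per task.
class ParallelSweep {

    // Toy strategy (an assumption, not from the post): hold one unit long
    // while the fast SMA is above the slow SMA. Returns P&L in price units.
    static double backtest(double[] prices, int fast, int slow) {
        double fastSum = 0, slowSum = 0, pnl = 0;
        boolean isLong = false;
        int warmup = Math.max(fast, slow) - 1; // first bar where both windows are full
        for (int i = 0; i < prices.length; i++) {
            // The position decided on the previous bar earns this bar's price change.
            if (isLong) pnl += prices[i] - prices[i - 1];
            // Rolling sums avoid rescanning each window on every bar.
            fastSum += prices[i];
            slowSum += prices[i];
            if (i >= fast) fastSum -= prices[i - fast];
            if (i >= slow) slowSum -= prices[i - slow];
            if (i >= warmup) isLong = fastSum / fast > slowSum / slow;
        }
        return pnl;
    }

    // Evaluates every (fast, slow) pair in [1..maxParam] x [1..maxParam] in
    // parallel and returns {bestFast, bestSlow}.
    static int[] sweep(double[] prices, int maxParam) {
        int n = maxParam * maxParam;
        double[] scores = new double[n];
        IntStream.range(0, n).parallel().forEach(k -> {
            int fast = k / maxParam + 1;
            int slow = k % maxParam + 1;
            scores[k] = backtest(prices, fast, slow); // each task writes only its own slot
        });
        int best = 0;
        for (int k = 1; k < n; k++) {
            if (scores[k] > scores[best]) best = k;
        }
        return new int[] {best / maxParam + 1, best % maxParam + 1};
    }

    public static void main(String[] args) {
        // Synthetic random-walk prices, generated once and kept in memory.
        Random rnd = new Random(42);
        double[] prices = new double[1_000_000];
        double p = 100.0;
        for (int i = 0; i < prices.length; i++) {
            p += rnd.nextGaussian() * 0.1;
            prices[i] = p;
        }
        int[] best = sweep(prices, 10);
        System.out.println("best fast=" + best[0] + ", slow=" + best[1]);
    }
}
```

The price array is read-only and shared by all tasks, so the only per-task state is a handful of local primitives. That keeps allocation, and therefore garbage collection, out of the inner loop.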

My data sets are huge (about 70 million bars per symbol), so even with that speed of 500 million bars per second, some optimizations run for hours.

In my typical trading strategy, there could be 5 parameters. Let's say we want to test the range of [1..10] for each parameter. This gives us 10^5 = 100K parameter combinations to back test. Each of these combinations has to be applied to the 70 million bars, so we have the total of:

100,000 * 70,000,000 = 7 trillion passes

With the speed of 500 million passes per second, it would take about 4 hours to complete the optimization. For some optimizations, I let them run for days.
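
The arithmetic above can be checked with a few lines of Java. This is only a back-of-the-envelope sketch; the class name RuntimeEstimate and its two helper methods are hypothetical, and the inputs are the figures quoted in this post (5 parameters, 10 values each, 70 million bars, 500 million bars per second).

```java
// Hypothetical helper for estimating brute-force optimization run time.
class RuntimeEstimate {

    // Number of grid points when each of `params` parameters takes
    // `valuesPerParam` distinct values: valuesPerParam ^ params.
    static long combinations(int params, int valuesPerParam) {
        long n = 1;
        for (int i = 0; i < params; i++) {
            n = Math.multiplyExact(n, (long) valuesPerParam); // throws on overflow
        }
        return n;
    }

    // Hours needed to run `combos` backtests of `barsPerRun` bars each
    // at a throughput of `barsPerSecond`.
    static double hours(long combos, long barsPerRun, double barsPerSecond) {
        return (double) combos * barsPerRun / barsPerSecond / 3600.0;
    }

    public static void main(String[] args) {
        long bars = 70_000_000L;
        double speed = 500e6;
        // 5 parameters, then 6, to show how one extra dimension scales the cost.
        for (int params = 5; params <= 6; params++) {
            long combos = combinations(params, 10);
            System.out.println(params + " params: " + combos + " combinations, "
                    + hours(combos, bars, speed) + " hours");
        }
    }
}
```

With 5 parameters this works out to 14,000 seconds, a little under 4 hours. Adding a sixth parameter with the same range multiplies that by 10, which is how a run ends up taking days.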

So, there is a combinatorial explosion (i.e. "the curse of dimensionality") to fight, and the over-fitting effects to address. I have a number of techniques to deal with both. For the combinatorial explosion, it comes down to using "smart" optimization techniques (as opposed to brute-force optimization). For overfitting, it's about carefully choosing the cost functions (i.e. performance metrics), and performing cluster analysis of the optimization space (looking for broad, sustained regions of elevated performance).
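
One simple way to picture "broad, sustained regions of elevated performance" is to score each parameter combination by the average performance of its neighbors instead of its own result. The sketch below does that for a two-parameter grid. It is a stand-in for real cluster analysis, not the technique actually used here, and the class name PlateauFinder, its methods, and the toy grid are all hypothetical.

```java
// Hypothetical sketch: prefer the center of a broad plateau over an isolated
// spike by ranking grid cells on their neighborhood-averaged score.
class PlateauFinder {

    // Mean score over the (2r+1) x (2r+1) window centered on (i, j),
    // clipped at the edges of the grid.
    static double neighborhoodMean(double[][] s, int i, int j, int r) {
        double sum = 0;
        int count = 0;
        for (int a = Math.max(0, i - r); a <= Math.min(s.length - 1, i + r); a++) {
            for (int b = Math.max(0, j - r); b <= Math.min(s[a].length - 1, j + r); b++) {
                sum += s[a][b];
                count++;
            }
        }
        return sum / count;
    }

    // Returns {row, col} of the cell with the highest neighborhood mean
    // (first one found in row-major order if there are ties).
    static int[] mostRobust(double[][] s, int r) {
        int bi = 0, bj = 0;
        double best = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < s.length; i++) {
            for (int j = 0; j < s[i].length; j++) {
                double m = neighborhoodMean(s, i, j, r);
                if (m > best) {
                    best = m;
                    bi = i;
                    bj = j;
                }
            }
        }
        return new int[] {bi, bj};
    }

    public static void main(String[] args) {
        // Toy 7x7 score grid: one isolated spike and one 3x3 plateau.
        double[][] scores = new double[7][7];
        scores[1][5] = 10.0;                       // lone spike, likely over-fit
        for (int i = 3; i <= 5; i++)
            for (int j = 1; j <= 3; j++)
                scores[i][j] = 3.0;                // broad region of decent scores
        int[] pick = mostRobust(scores, 1);
        System.out.println("most robust cell: (" + pick[0] + ", " + pick[1] + ")");
    }
}
```

On the toy grid, the raw maximum is the spike at (1, 5), but its neighbors are all zero, so the center of the plateau at (4, 2) wins on the neighborhood average. Clipping at the grid edges means border cells are averaged over fewer neighbors, which is something to keep in mind if the best region sits at the boundary of the tested range.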

systemtrader.org · Nov 27, 2017

JBookTrader was cool and did most of these practices. Still using book imbalance for trading?

nonlinear5 · Nov 27, 2017

systemtrader.org said:
JBookTrader was cool and did most of these practices. Still using book imbalance for trading?

Thanks, ST. Yes, I am still using JBookTrader as my base code, and still using book imbalances for live trading.

truetype · Nov 27, 2017

nonlinear5 said:
My optimization software is written in Java. It took me a number of years of refactoring it in various ways to squeeze out all that processing speed.

With all that excellent backtesting, you must be a wealthy man by now!

nonlinear5 · Nov 27, 2017

truetype said:
With all that excellent backtesting, you must be a wealthy man by now!

I am in the "5% profitable" bucket, so I am better off than the proverbial 95% of the traders, but I am far from being wealthy.

After back-testing literally hundreds of millions of strategies, I realized that the main benefit of backtesting/optimization is to find out what does not work, rather than what works.

ET180 · Nov 27, 2017

I still think the best approach is to learn how to trade first and then look for opportunities to automate and build a system once you have already found a successful strategy. That approach probably applies in any domain -- learn how to drive before trying to build a self-driving car.

I'm surprised that any retail trader can still find an edge by trading order book imbalances. I would expect retail traders to be unable to compete against HFT while facing a severe latency disadvantage.
