Lightweight threads

traderslair · Apr 29, 2012

I write multithreaded code in C++, and I strongly disagree with your statement that my code would have trouble keeping up with Erlang code.

Joel Reymont · Apr 29, 2012

I did not say anything about C++, did I?

2rosy · May 2, 2012

Quote from nitro:

I have an arbitrage strategy that I am implementing that requires a humongous number of cores/threads to implement.

Anytime I see something like this, I know you're on the wrong track.

bwolinsky · May 5, 2012

Quote from nitro:

I have an arbitrage strategy that I am implementing that requires a humongous number of cores/threads to implement. The main problem is that, in the languages I am competent in, threads have all sorts of problems, and in .Net they are hardly lightweight.

This led me to this library:

http://blogs.msdn.com/b/daniwang/archive/2008/12/29/lightweight-threading-using-c-iterators.aspx

It is interesting that C# scales almost as well as Erlang on this test. Funny how C++/C# would crush Erlang on most computational tests, but when it comes to a large number of threads, other languages have a hard time keeping pace with Erlang.

This pattern is interesting because most arbitrage-type trading requires you to scan large amounts of data for opportunity. Say you are looking at an option chain and need to compare it against another option chain, where any two options, one from each chain, could be an opportunity. You can begin to see that even for a small number of strikes/months, the combinatorial explosion of computations would overwhelm most computers, even with lots of cores.

I always look to simpler domains that are both fun and educational and that mirror the problem in the more complex domain. I realized that Conway's Game of Life, with the added constraint that it must run as fast as possible, shares many of the techniques that would be useful for this type of high-frequency trading.

So, I pose a challenge. Post your best program that uses Erlang to compute the Game of Life. Note that a non-threaded version is not interesting!

You're better off using the 64-bit version of MultiCharts, which is already programmed in C++ for multithreaded and multi-core processing.
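To put a rough number on the chain-versus-chain comparison in the quoted post, here is a minimal Python sketch. The function name `pair_count` and the strike/expiry counts are made up for illustration, and it assumes both chains have the same shape with one call and one put per strike and expiry.

```python
def pair_count(strikes: int, expiries: int, sides: int = 2) -> int:
    """Cross-chain pairs when every contract in one chain is compared
    against every contract in a second chain of the same shape.

    All inputs are hypothetical; sides=2 assumes one call and one put
    per strike/expiry combination.
    """
    contracts = strikes * expiries * sides
    return contracts * contracts


if __name__ == "__main__":
    # Illustrative chain sizes only, not real market data.
    for strikes, expiries in [(10, 3), (40, 6), (100, 12)]:
        print(strikes, expiries, pair_count(strikes, expiries))
```

Under these assumptions the work grows with the square of the contract count for each pair of chains, and it multiplies again by the number of chain pairs being scanned.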

Stoxtrader · May 5, 2012

Quote from nitro:

I have an arbitrage strategy that I am implementing that requires a humongous number of cores/threads to implement.

Quote from Kevin Schmit:

Nitro, this statement is almost certainly not true.

Before you run out and buy a machine with a "humongous number of cores" (or rent them on the cloud) and try to run 10,000+ simultaneous threads, I suggest you try redesigning your code. That has got to be an easier solution.

Um. Yes. More details are needed here. Why does the strategy require a large number of cores/threads? What are the time constraints? What hardware is this strategy being run on, and what are the bottlenecks? For example, if low latency is a requirement then single threading may very well be the optimal solution. Threading or pushing out to the cloud definitely has overhead.

amazingIndustry · May 6, 2012

You are light years behind. Here is a list of keywords to look up when getting into the .Net concurrent and multithreaded space:

Task Parallel Library (TPL)
Async CTP
TPL Dataflow

Btw, Erlang only makes sense for projects that deal with massive numbers of threads and especially need to handle state machines well; it was originally designed in and for the telecom industry. I strongly recommend not throwing computationally intensive stuff at it.

Quote from nitro:

BTW, this seems to be the way to do lightweight threads within a Microsoft solution:

http://msdn.microsoft.com/en-gb/magazine/cc163556.aspx
http://msdn.microsoft.com/en-us/library/bb648752.aspx

amazingIndustry · May 6, 2012

You can't give up pushing Assembler, huh? I disagree with your religious belief that you can throw Assembler at anything there is. I would challenge you to code a mergeSort that is much faster in Assembler than in C++, and which is in turn much faster in C++ than in C#. Today's compilers have become so efficient that a simple List.Sort() in C# .NET does not fall far behind the fastest algorithm you could code up in C++ (to be fair, the competing algorithms in C++ and Assembler would also need to target single-threaded models, or there should be a fair comparison of parallel code).

Look up related posts at Stackoverflow, there are guys who tried all that and posted performance measurements.

My point is that while you try to solve stuff in Assembler to get your 10% or 15% faster "edge" I code up the same stuff in C# 5-10 times faster and run circles around you while you still code.

Please take it for what it is: I am trying to make a point. It's not a pissing contest, and I am not even addressing you directly but those who believe a low-level language is in all cases the best solution. This was maybe the case 10-15 years ago but surely not anymore.

Heck, I can code up a sorting algorithm in C#, and if you allowed me access to CUDA through a library then I would beat any of your pure Assembler routines hands down. My point is that you can do so much in C# and other higher-level languages today that Assembler's advantages are all but gone.

P.S.: I am aware that some very low-level, near-hardware stuff was originally coded in Assembler. It's great and I welcome it, but if you are agitated by now then you probably did not get my point, which was that I can access all those libraries from C# (well, most of them) and do not have to limit myself to crude, command-line development tools. I can use state-of-the-art development tools, plus cross-language libraries and bindings, on top of all the goodies that .Net 4.0 and the upcoming new version provide.

Quote from braincell:

I'm not sure about the depth of the analysis you're talking about, but did you consider an event-driven system? Each event fires its own thread when needed, and the thread terminates when the analysis ends. This can be on a per-tick basis, for example, or per bid/ask update. If the computation takes less time than the interval between ticks (and it should, if I understand what you're doing) then you don't have to have threads running at all times. If the computation takes a bit too long, you might need to consider optimizing it by reverting to simpler data structures, like 1D arrays, which are faster than e.g. lists.

Other than that, you can simply do distributed computing where you network multiple CPUs to compute data coming from the central server. Then you don't need a CPU with multiple cores, just any kind of solid data center with a bunch of them networked together. I did this for a similar task.

Finally, the fastest multithreading is done in assembly language, so consider learning it. I could answer your challenge in ASM, but I'm coding too much anyway.
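The per-tick, event-driven idea in the quoted post could look roughly like the Python sketch below. Everything in it is an assumption for illustration: the tick layout `(symbol, bid, ask)`, the `analyse` and `run` names, and the 0.05 spread threshold standing in for the real analysis.

```python
import threading


def analyse(tick, results, lock):
    """Hypothetical per-tick analysis: flag symbols with a wide spread."""
    symbol, bid, ask = tick
    if ask - bid > 0.05:  # made-up threshold, for illustration only
        with lock:  # results is shared between threads
            results.append(symbol)


def run(ticks):
    """Start one short-lived thread per tick; each ends with its analysis."""
    results, lock, workers = [], threading.Lock(), []
    for tick in ticks:
        t = threading.Thread(target=analyse, args=(tick, results, lock))
        t.start()
        workers.append(t)
    for t in workers:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    print(run([("AAA", 100.00, 100.10), ("BBB", 50.00, 50.01)]))
```

No thread exists between ticks, which is the point being made; the cost is one thread start per event.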

amazingIndustry · May 6, 2012

I do not fully agree with you. Look at Windows: how many tasks does it run at any given point in time? It could be hundreds. It all comes down to being aware, at design time, of the trade-offs one has to make: task-switching overhead vs. the benefits of running operations in segregated tasks.

I agree with you though that a single application that runs on too many tasks probably can be optimized and there is most likely something wrong.

Quote from NetTecture:

Actually no, you still waste resources. In C# I would basically use a custom TaskScheduler with X threads, as above, and schedule tasks on it. See, they do NOT all happen at the same time: a CPU can only execute one thread per hardware thread at a time ANYWAY. That is 1 thread per core on AMD, 2 on Intel with Hyperthreading. These are called hardware threads. Having fewer threads in this case means less task switching, on top of a lot less wasted memory. Anyone using 800 threads for something like option chain calculation had better not claim to be more than a junior developer. Waste of resources and time.
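As a rough analogue of scheduling many small tasks onto a fixed set of threads, here is a Python sketch using the standard `concurrent.futures.ThreadPoolExecutor`. It is not the .NET TaskScheduler mentioned above; `price_pair` and `scan` are hypothetical names, and the calculation is a placeholder. Note that in standard CPython the global interpreter lock keeps CPU-bound threads from running in parallel, so this shows the scheduling shape rather than a speedup.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def price_pair(pair):
    """Hypothetical stand-in for one option-pair calculation."""
    a, b = pair
    return abs(a - b)


def scan(pairs, workers=None):
    """Run many small tasks on a pool sized to the machine, not to the task count."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order even though tasks share a few threads
        return list(pool.map(price_pair, pairs))


if __name__ == "__main__":
    # 800 tasks, but only as many threads as the pool allows
    print(len(scan([(i, 0) for i in range(800)])))
```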

rufus_4000 · May 6, 2012

OK, I will bite. Those who know me know that I am not a .NET person by any means. But for a large number of concurrent lightweight threads (the computing threads and, I guess, the arbitrage-opportunity-seeking threads), I would think a language construct like "coroutines" may be close to the desired goal. I am currently using a customized version of coroutines (libpcl with stack jumps, for anyone who cares) along with my thread pool library on a production platform, and it has yielded significant performance improvements (over simple "green-ish" threads). I believe MSFT calls coroutines "Fibers" under .NET, but that's about the extent of my knowledge there.

For those of you who are not familiar with coroutines (other than Don Knuth's original definition), look up the Wikipedia entry. I think of it as cooperative execution free of concurrency semantics (as concurrency is implied in the "Yield" operation), which, incidentally (!), works quite well for data processing, generating analytics, etc.
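For anyone who wants to see the "Yield" idea in a few lines, here is a minimal sketch using Python generators as coroutines with a round-robin scheduler. It is not libpcl and not the customized version described above; `worker` and `run_round_robin` are hypothetical names.

```python
from collections import deque


def worker(name, steps, log):
    """A coroutine: does one unit of work, then yields control voluntarily."""
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler


def run_round_robin(coroutines):
    """Resume each coroutine in turn until all of them have finished."""
    ready = deque(coroutines)
    while ready:
        co = ready.popleft()
        try:
            next(co)  # run until the coroutine's next yield
        except StopIteration:
            continue  # finished, so do not reschedule it
        ready.append(co)


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Everything runs on one thread, and the interleaving is decided only by where each coroutine yields, so there are no locks to manage.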

amazingIndustry · May 6, 2012

Re .Net, there are plenty of public sites that have published performance results showing that task switching, and generally running lightweight work on tasks rather than spawning threads, is always advantageous in terms of resource allocation and computational efficiency. I am not sure about the term "fibers" and thus cannot comment, but my understanding is that "coroutines" are a concept that was implemented through the arrival of task programming (at least in .Net). Note that most coroutine libraries date back to 2002-2003 when .Net was at 2.0.

It is very easy for anyone to run a performance comparison: start up n worker threads, then try to solve the same problem using tasks. The new Async CTP framework is terrific because it is very logical, and the beauty of it is that it requires minimal code changes. That said, I always recommend thinking about parallelism/concurrency at design time; it's not optimal to try to squeeze sequential code into a concurrent framework through code changes.
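A comparison of that kind can be sketched in Python, with `asyncio` tasks standing in for .NET tasks. This is only an analogue, not the TPL or the Async CTP; the `work` function is a placeholder and all names here are hypothetical. No timings are claimed: run it on your own machine and workload.

```python
import asyncio
import threading
import time


def work(i):
    return i * 2  # hypothetical lightweight unit of work


def run_with_threads(n):
    """One OS thread per unit of work."""
    results = [0] * n

    def target(i):
        results[i] = work(i)

    threads = [threading.Thread(target=target, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


async def _task(i):
    await asyncio.sleep(0)  # yield to the event loop once
    return work(i)


async def _gather(n):
    return sum(await asyncio.gather(*(_task(i) for i in range(n))))


def run_with_tasks(n):
    """One lightweight task per unit of work, all on a single thread."""
    return asyncio.run(_gather(n))


def timed(fn, n):
    start = time.perf_counter()
    value = fn(n)
    return value, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (run_with_threads, run_with_tasks):
        value, seconds = timed(fn, 2000)
        print(fn.__name__, value, f"{seconds:.4f}s")
```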
