C

vicirek · Apr 4, 2013

Quote from gip3:

You aren't a friendly guy are you?

What does this have to do with parallel factorization algorithms and their relative efficiency compared to CPU-only implementations? It does appear you like to throw jargon around, however irrelevant.

Okay - now we are getting to what I actually asked. And there are indeed plenty of papers on this subject. I take it you haven't read them? In that case, your contribution of "you can go google that" is noted and thanked.

I'm going to go read those papers now.

Good luck. I am completely uninterested in matrices, factorization, etc.; it is not an area I care about beyond knowing how I would approach it if needed. And for that I need the technical details of the hardware, which directly translate into how you write your software.

Do not be scared by technicalities. I was under the impression that it cannot be stated more simply on the ET board than this. Regardless of the algorithm, in GPU programming you are actually programming to the hardware, which is different from CPU jargon. I was just making the point that it can be done, since you did not state the size of your large matrix.

gip3 · Apr 4, 2013

Quote from vicirek:

Do not be scared by technicalities. I was under the impression that it cannot be stated more simply on the ET board than this. Regardless of the algorithm, in GPU programming you are actually programming to the hardware, which is different from CPU jargon. I was just making the point that it can be done, since you did not state the size of your large matrix.

You understand it's not a question of technicality right? It's possible (turns out not to be the case, of course) that matrix factorization simply cannot be made parallel efficiently. If that's the case (again, turns out not to be the case), hardware doesn't change a single thing.

To summarize: I asked a question about algorithm. You threw a condescending response back regarding hardware details.

vicirek · Apr 4, 2013

Quote from gip3:

To summarize: I asked a question about algorithm. You threw a condescending response back regarding hardware details.

This is the ET general board, by the way.

I think that you know the answer and are trying to get an answer to a question that is at the core of parallel programming: what is trivial to do in sequential programming becomes a major issue if you try to apply a highly parallel algorithm to a problem that is not inherently parallel. You can still take advantage of parallelism in those situations as well.

So what do you do? Exactly what programming and algorithms are supposed to do: you split your problem into pieces that can be run in parallel and those that cannot. The specific implementation is dictated by your hardware/software, and often the computation is hybrid and asynchronous across both CPU and GPU.
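A minimal sketch of that split, in C. The function names are made up for illustration, and OpenMP is only an assumption here, not something anyone in this thread prescribed. The first loop has independent iterations, so its work can be shared across threads; the second carries a dependency from each step to the next and is left sequential as written.

```c
#include <assert.h>
#include <stddef.h>

/* Parallelizable piece: every element contributes independently, so the
   iterations can be divided among threads and the partial sums combined.
   The pragma is only emitted when the compiler has OpenMP enabled
   (e.g. -fopenmp on GCC/Clang); otherwise the loop simply runs serially. */
double sum_of_squares(const double *x, size_t n)
{
    assert(n == 0 || x != NULL);
    double sum = 0.0;
    long count = (long)n;
#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum)
#endif
    for (long i = 0; i < count; ++i) {
        sum += x[i] * x[i];
    }
    return sum;
}

/* Sequential piece (as written): out[i] depends on out[i - 1], so this
   loop cannot be split across threads without restructuring it. */
void running_sum(const double *x, double *out, size_t n)
{
    assert(n == 0 || (x != NULL && out != NULL));
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        acc += x[i];
        out[i] = acc;
    }
}
```

The running sum is a case where the sequential version is trivial but a parallel version needs a different algorithm (a parallel scan), which is the kind of rework described above.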

As I said I can not help you with this particular problem because I do not have any applications that would utilize this.

hftvol · Apr 4, 2013

very true about the last statement in particular. I see myself having to get acquainted with Node.js, JavaScript in general, and HTML5 just to get a couple of nice-looking chart scripts based on D3 running on my website (I make performance numbers accessible to registered clients, and they can chart and calculate risk and return metrics any way they want over any chosen time frame). Such libraries are only available in JS/HTML5, and they are so easy to bind to that it would be a waste to spend hundreds of dollars on a Silverlight library and have to delve into its intricacies.

Quote from vicirek:

This is what I meant. I am using .NET extensively for the same reasons, including C#. The only advice I have for people who want to be proficient programmers is to get some exposure to C++ to know where things came from. And there is the added benefit of being able to implement some procedures using the GPU or C++ AMP for specific needs. Another important thing to remember is that programming requires being multilingual because of constantly changing technology.

hftvol · Apr 4, 2013

some implementations to program FPGAs use C variations but they all need to be compiled to HDL in the end. Still lots of guys feel more comfortable writing such libraries in C or C++ rather than Verilog or VHDL. Just pointing out one application where C is still used.

Quote from misaki:

I can understand if this was C# vs Java, but why are we having a language war between C++ and C#?

There are several nice features about C++, e.g. resource management vs C# IDisposable, templates vs C# generics, compiler support and platform neutrality; almost everything has C++ bindings, much cleaner way to access a wide range of tools as opposed to .NET interoperability and C# marshaling. vicirek's point has some credit - it boils down to the fact that 'faster' aspects of the .NET framework class library simply mean better, more heavily-optimized implementations than the C++ standard library (e.g. System.IO).

Then I'd much rather build GUIs with C# than C++. C# .NET development is a much more centralized experience, since there's one really, really good IDE and a few pariah IDEs mostly for Mono development.

Finally, there's always variability in user experiences - I can probably write a production implementation in Fortran 90 faster than in C#, but that doesn't say anything about the two languages besides giving possible hints of my age and working background.

Moral of the story: (1) C# vs C++ for your financial applications is usually a business cost-benefit decision; (2) you generally speak of faster implementations, not faster languages; (3) good programmers are usually indifferent to the language choice, as I've said on page 4. So if you need to ask what language you should be using, or find yourself participating in language wars, chances are that you're better off using the one you're most familiar with.

Now can we go back on topic, and can I hear more about what you guys are doing with C?

hftvol · Apr 4, 2013

GPUs are heavily used on the buy side: risk management (valuation) and derivatives pricing, to name just two. When I say heavily I really mean that; you won't find a single large fund trading exotics that does not use GPUs.

And of course matrix manipulations can be parallelized, no question. That is why matrix operations are one of the domains of GPU offloading.

Quote from misaki:

I see. I don't really buy into the GPGPU paradigm. I could see uses for it in sell-side pricing, but I haven't found use for it in buy-side trading. If I really grasp for ideas, I could think of a few large matrix operations but I am guessing off the tip of my fingers that there are faster (and more portable) CPU implementations than their GPGPU counterparts after bus overhead. Is there any particular use for it that you can shed light on?

misaki · Apr 5, 2013

Quote from hftvol:

GPUs are heavily used on the buy side: risk management (valuation) and derivatives pricing, to name just two. When I say heavily I really mean that; you won't find a single large fund trading exotics that does not use GPUs.

And of course matrix manipulations can be parallelized, no question. That is why matrix operations are one of the domains of GPU offloading.

Thanks. Yes, I'm aware of those cases, except I generally classify valuation and derivatives pricing as sell-side activity. I understand there are overlaps that might be firm-specific. I've tried eliminating buffer copy from CPU to GPU - but as far as my benchmarks go, it is still too slow except for a few use cases, e.g. heavy PDEs.
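The "after bus overhead" comparison can be written down as a back-of-the-envelope model. This is only a rough sketch: the function names are hypothetical, every input is a number you would have to measure on your own hardware, and it assumes transfer and compute do not overlap.

```c
#include <assert.h>

/* Estimated wall time for the GPU path: time to move the data over the
   bus plus time spent computing on the device. Assumes no overlap
   between transfer and compute (a simplification). */
double gpu_total_seconds(double bytes_moved,
                         double bus_bytes_per_sec,
                         double gpu_compute_seconds)
{
    assert(bus_bytes_per_sec > 0.0);
    return bytes_moved / bus_bytes_per_sec + gpu_compute_seconds;
}

/* Returns 1 when the GPU path, transfer included, beats the measured
   CPU time, and 0 otherwise. */
int gpu_is_worthwhile(double cpu_seconds,
                      double bytes_moved,
                      double bus_bytes_per_sec,
                      double gpu_compute_seconds)
{
    return gpu_total_seconds(bytes_moved, bus_bytes_per_sec,
                             gpu_compute_seconds) < cpu_seconds;
}
```

Under this model, a workload only pays off on the GPU when the compute saved is larger than the time spent moving data, which is consistent with heavy PDE solves winning while lighter operations do not.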

bln · Apr 5, 2013

Quote from 2rosy:

why would anyone use C to download lists, or do historical data work? python, perl, ruby etc is the right tool for the job in those situations.

Well, because C is a fast and resource-efficient language compared to interpreted or VM-based languages like Ruby, Python, Java, C#, etc. Python and Ruby run around 5-10 times slower than C, and that does mean longer delays for the end result.

If the main code is written in C, then you may also want the support code around the core to be written in C too. Modern C compilers are very good at optimization and can auto-vectorize your code to some degree, automatically using all these nice instruction sets you have access to: SSE, AVX, AVX2, XOP, FMA4, etc.
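For illustration, here is the kind of loop compilers tend to auto-vectorize. This is a sketch, not a guarantee: whether it is actually vectorized depends on the compiler, its version, and flags (for example -O3 -march=native on GCC or Clang). The function name is just the conventional one for this operation.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for each element.
   `restrict` (C99) promises the compiler that x and y do not overlap,
   which removes a common obstacle to vectorizing the loop. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

To check what the compiler did, GCC can report vectorized loops with -fopt-info-vec and Clang with -Rpass=loop-vectorize.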

stevegee58 · Apr 5, 2013

Quote from hftvol:

GPUs are heavily used on the buy side, Risk Management (valuation), derivatives pricing, to name just two.

I've seen open source OpenCL code for Black-Scholes, plus QuantLib seems to have OpenCL extensions as well.

There are a number of financial algorithms that lend themselves to GPU acceleration.

vicirek · Apr 5, 2013

Quote from stevegee58:

There are a number of financial algorithms that lend themselves to GPU acceleration.

There are almost no real-life algorithms that are immediately 100% parallel-ready. Most of the programming effort goes into converting such an algorithm to take the maximum possible advantage of parallelism, through either hardware or software acceleration. Usually there are some chunks of code that cannot be processed in parallel. There are ways to synchronize those operations, at some performance penalty, but the end result is still a significant speedup of the entire application.
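One standard way to put numbers on this is Amdahl's law: if a fraction p of the run time can be parallelized across n workers, the overall speedup is 1 / ((1 - p) + p / n). A small sketch follows; the function names are made up for illustration, and the formula leaves out the synchronization penalty mentioned above, so real results will be somewhat lower.

```c
#include <assert.h>

/* Amdahl's law: speedup when a fraction p (0..1) of the run time is
   spread across n workers and the remaining (1 - p) stays serial. */
double amdahl_speedup(double p, int n)
{
    assert(p >= 0.0 && p <= 1.0 && n >= 1);
    return 1.0 / ((1.0 - p) + p / n);
}

/* Upper bound on the speedup as the number of workers grows without
   limit: the serial fraction alone caps it at 1 / (1 - p). */
double amdahl_limit(double p)
{
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}
```

For example, with 90% of the work parallelizable, 8 workers give a speedup of about 4.7x, and no number of workers can exceed 10x.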
