You know Linux allocators have a set limit, 64k I think, before they go hit up.
By this, you mean that only 64k of data can be stored in cache?
IAS_LCC, you should really profile your application to pinpoint where the problem actually lies. I am pretty sure it has nothing to do with your cache. First of all, it would be useful to know whether the issues arise during data acquisition/loading, during streaming and injection into strategies, or elsewhere. Have you already been able to pinpoint the exact problem?
No, but I haven't put a lot of effort into it yet. It's low on my priority list right now, as I'm more concerned with strategy development than software optimization. I know it's related to getting the data from the feed handler to my "trading platform". I use shared memory to do this, so I'm fairly certain it's a cache hit problem, or the shared memory mutex is blocking the other thread more often than I'd like.
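One cheap way to test the mutex theory before doing a full profile is to time how long each acquire actually blocks. Below is a minimal sketch in Python. `TimedLock` and `run_demo` are made-up names for illustration, and it uses an in-process `threading.Lock` as a stand-in for the real shared memory mutex, so treat it as the shape of the measurement and not a drop-in for the actual setup.

```python
import threading
import time


class TimedLock:
    """Wraps threading.Lock and records how long callers block in acquire."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0   # number of successful acquires
        self.total_wait = 0.0   # total seconds spent waiting for the lock

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()
        waited = time.perf_counter() - start
        # The lock is held at this point, so updating the counters is safe.
        self.acquisitions += 1
        self.total_wait += waited
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def run_demo(iterations=200, hold_seconds=0.0005):
    """Two threads fight over one lock; returns the lock with its stats."""
    lock = TimedLock()

    def worker():
        for _ in range(iterations):
            with lock:
                time.sleep(hold_seconds)  # stand-in for work done under the lock

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return lock


if __name__ == "__main__":
    stats = run_demo()
    avg_us = 1e6 * stats.total_wait / stats.acquisitions
    print(f"{stats.acquisitions} acquires, average wait {avg_us:.1f} us")
```

If the average wait comes out small compared to the time between ticks, the mutex is probably not the culprit and the next thing to look at is somewhere else in the pipeline.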
Hmm, unless you've developed a new OS and filesystem I've never heard of,
I find this hard to believe.
Anyone using the standard Linux kernel fread() will not accomplish this.
Guys, listening to volpunter makes you think that retail automated trading is hopeless.
Well, it's not; I'm proof of it.
fread() is not a Linux kernel call, or even a standard system call. It's a standard C library call. And yes, you can load millions or even hundreds of millions of ticks per second on a commodity quad core if you know what you are doing and are willing to get your hands a little dirty. Just wanted to clear up those two points.
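To make the loading pattern concrete, here is a rough sketch of the usual approach: store ticks as fixed-width binary records and pull the whole file in with one bulk read instead of parsing text line by line. The record layout (`TICK`) and the helper names `write_ticks` and `load_ticks` are assumptions made up for this example, and plain Python will be nowhere near the rates quoted above. Those take tuned native code. The pattern is the same, though.

```python
import os
import struct
import tempfile
import time

# Hypothetical record layout: int64 timestamp, float64 price, int32 size.
# "<" means little-endian with no padding, so each record is 20 bytes.
TICK = struct.Struct("<qdi")


def write_ticks(path, n):
    """Write n synthetic ticks as fixed-width binary records."""
    buf = bytearray()
    for i in range(n):
        buf += TICK.pack(i, 100.0 + i * 0.01, 1 + i % 10)
    with open(path, "wb") as f:
        f.write(buf)


def load_ticks(path):
    """Read the whole file in one call, then decode the records in bulk."""
    with open(path, "rb") as f:
        data = f.read()
    return list(TICK.iter_unpack(data))


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "ticks.bin")
    write_ticks(path, 1_000_000)
    start = time.perf_counter()
    ticks = load_ticks(path)
    elapsed = time.perf_counter() - start
    print(f"loaded {len(ticks)} ticks at {len(ticks) / elapsed:,.0f} ticks/sec")
```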
As an aside, it's true you don't necessarily need to be able to do this to be successful at trading, in the same sense that you don't need a computer to do accounting and end-of-year taxes. You could use paper and pen, or an abacus. But it certainly makes certain processes a lot smoother.
Technically, there is no limitation that prevents loading tens of millions of ticks; at least, the limitation at the moment is not on the memory, bus, or cache throughput side. Given that dated 1066 MHz main memory has a throughput of about 7 GB/sec, with L3 at about 3x that of main memory, L2 at 1.5x that of L3, and L1 at 1.5x that of L2, neither memory nor bus throughput poses a serious challenge to loading many tens of millions of data points. The work involved in deserializing data, for example, and in other computationally expensive operations that tax the CPU or GPUs, on the other hand, depends heavily on the quality of the software implementation of the algorithms.
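For a back-of-the-envelope check on those numbers, the sketch below turns the quoted throughputs into ticks per second. The 20-byte tick size is an assumption picked for illustration (the thread never states a record size), GB is taken as 10^9 bytes, and the function names are made up. These are raw bandwidth ceilings only and say nothing about deserialization cost.

```python
def bandwidths_gb_per_sec(main_memory=7.0):
    """Throughput per level, using the ratios quoted in the post above."""
    l3 = 3.0 * main_memory   # L3 is about 3x main memory
    l2 = 1.5 * l3            # L2 is about 1.5x L3
    l1 = 1.5 * l2            # L1 is about 1.5x L2
    return {"main": main_memory, "L3": l3, "L2": l2, "L1": l1}


def ticks_per_second(gb_per_sec, tick_bytes=20):
    """Upper bound on ticks/sec if bandwidth were the only limit.

    tick_bytes=20 is a hypothetical record size, not a figure from the thread.
    """
    return gb_per_sec * 1e9 / tick_bytes


if __name__ == "__main__":
    for level, gbps in bandwidths_gb_per_sec().items():
        rate = ticks_per_second(gbps)
        print(f"{level:>4}: {gbps:6.2f} GB/sec, {rate / 1e6:8.1f} million ticks/sec")
```

Even the main memory figure works out to hundreds of millions of ticks per second at that record size, which is why the raw transfer side is rarely what limits a backtest.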
But those points are moot because, in my experience, the bottleneck is not the loading or the ordering/sorting of ticks but the time and resources spent running the actual algorithmic strategies. (I am strictly limiting the discussion to iterating over historical tick-based data and am not digressing into handling live data feeds.)