Quote from SmartQuant:
- objects/second I/O depends on the kind of objects you read and write. In some cases 10 objects per second on an Athlon64 is very good performance
- try to store TAQ data in an SQL db and have a look at compression / performance issues even on Athlon64
- you can use different compression levels tuning speed vs compression, I believe you will get your 10M ticks / second on Athlon64 with fast bus and HDD if you run without compression
- you can still query compressed data
- indeed you can compress trades into bars and have a few hundreds of bars per day instead of tens of thousands of ticks but I was talking about ticks
- perhaps there are better solutions. Could you please send a link together with pricing info? KX, for example?
You appeared to be talking about tick or bar data, not some massive objects.
How you store the TAQ data depends on how it will be used. Beyond archival purposes, compression is only useful to the point at which it improves throughput from disk without too much CPU overhead to decompress, for the desired usage patterns. NYSE TAQ is a huge amount of data, though I have seen bigger sets. Many applications will process it differently and may require transposes. Of course you can query compressed data. I do it all the time. You can also use a DB, flat files, or hybrids of the two. Everything depends on expected usage patterns.
You originally boasted about how much you can compress TAQ data. I don't see the relevance unless you're concerned about the price of hard drives. If you were concerned about efficiency and speed you would have mentioned bandwidth instead, maybe even preprocessing too. Are you able to increase the query bandwidth over flat uncompressed, indexed TAQ data? If yes, by how much? For what query patterns or types of usage? Even if you compress the original TAQ data it may make no difference, depending on the application.
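To make the bandwidth question concrete, here is a rough back-of-the-envelope sketch, not a benchmark. The function name and all the numbers are hypothetical. It assumes disk bandwidth, compression ratio, and decompression speed (measured in uncompressed MB/s produced) are the only factors, and ignores indexing, seeks, and DB overhead.

```python
def effective_throughput(disk_mb_s, ratio, decompress_mb_s, overlapped=True):
    """Estimate uncompressed MB/s delivered to the application.

    disk_mb_s       -- raw read bandwidth of the disk (hypothetical figure)
    ratio           -- compression ratio, uncompressed size / compressed size
    decompress_mb_s -- uncompressed MB/s the CPU can produce when decompressing
    overlapped      -- True if reads and decompression run in parallel
    """
    # Each compressed MB read from disk expands to `ratio` uncompressed MB.
    compressed_feed = disk_mb_s * ratio
    if overlapped:
        # Pipelined: the slower of the two stages is the bottleneck.
        return min(compressed_feed, decompress_mb_s)
    # Serial: read time and decompression time per MB simply add up.
    return 1.0 / (1.0 / compressed_feed + 1.0 / decompress_mb_s)


# Hypothetical 100 MB/s disk, 4:1 compression.
fast_codec = effective_throughput(100.0, 4.0, 250.0)  # 250.0, beats the raw 100 MB/s
slow_codec = effective_throughput(100.0, 4.0, 80.0)   # 80.0, worse than uncompressed
```

Under these assumptions compression only pays off when the decompressor outruns the raw disk; otherwise flat uncompressed data wins, whatever the compression ratio.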
I/O bus and HDD speed are not requirements for processing 10M or 100M ticks/second or bars/second. Testing can be made to fit nicely within memory (low page faults), minimizing cache misses. The key is to test multiple models or parameters simultaneously. Otherwise you will have cache misses and be memory bound and/or disk bound (plus DB overhead), and thus see only 1/10 to 1/1000 of your CPU's potential.
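A minimal sketch of what "multiple parameters simultaneously" could look like follows. This is not my actual code; the exponential moving average is just a stand-in model, and the function names are made up for illustration. In plain Python the cache benefit will not show up, but the loop structure is the same one you would use in C or C++: each tick is loaded once and applied to every parameter set before moving on.

```python
def ema_single_pass(prices, alphas):
    """Update one EMA per alpha while reading each price exactly once."""
    emas = [prices[0]] * len(alphas)
    for price in prices[1:]:
        # The tick is fetched once, then reused for every parameter set.
        emas = [e + a * (price - e) for e, a in zip(emas, alphas)]
    return emas


def ema_one_pass_per_param(prices, alphas):
    """Baseline: re-read the whole series once per alpha."""
    results = []
    for a in alphas:
        e = prices[0]
        for price in prices[1:]:
            e = e + a * (price - e)
        results.append(e)
    return results


# Toy data (hypothetical); both approaches give identical results,
# only the number of passes over the data differs.
ticks = [100.0, 101.0, 99.5, 100.5]
params = [0.1, 0.5, 0.9]
assert ema_single_pass(ticks, params) == ema_one_pass_per_param(ticks, params)
```

With N parameter sets the baseline streams the data through memory N times, while the single-pass version streams it once and keeps only N small state values hot.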
Everything I've said applies equally to ticks and bars.
My software is proprietary.