One important thing to consider about an RDBMS is that it lets you organize your data much better. As for ten years of tick data across every instrument on every exchange: you will very rarely need that level of detail. On the other hand, if you develop a strategy six months from now, you can't know today what data you'll need to test it.
I approach market center data with two principles in mind:
1) Store everything you can get your hands on, so long as it doesn't slow down anything else you actually need. Pulling down global tick and market depth data across all instruments is terrific, but it shouldn't hurt your strategy's performance.
2) Your strategies should be hyper-optimized and largely decoupled from the storage of "everything". If you only need hourly data, figure out a way to make that path as lean and laser-focused as possible. Optimize hot code paths.
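To make point 2 concrete, here is a minimal sketch of a lean hourly path that never touches the "store everything" archive. It assumes ticks arrive in time order with integer epoch-second timestamps. The class name `HourlyBars` and its fields are made up for illustration, not taken from any library.

```python
from collections import deque


class HourlyBars:
    """Keeps only what an hourly strategy needs: the bar in progress
    plus a bounded history of finished bars. (Hypothetical helper.)"""

    def __init__(self, max_bars=500):
        # Finished bars as (hour_start, open, high, low, close);
        # the deque drops the oldest bar once max_bars is reached.
        self.bars = deque(maxlen=max_bars)
        self._cur = None  # bar currently being built, as a mutable list

    def on_tick(self, ts, price):
        hour = ts - ts % 3600  # start of the hour this tick falls in
        cur = self._cur
        if cur is None or cur[0] != hour:
            # A new hour has started: archive the finished bar, open a new one.
            if cur is not None:
                self.bars.append(tuple(cur))
            self._cur = [hour, price, price, price, price]
        else:
            if price > cur[2]:
                cur[2] = price  # new high
            if price < cur[3]:
                cur[3] = price  # new low
            cur[4] = price      # latest price is the running close
```

The full tick feed can still be written to the database by a separate process. The strategy only ever sees this small, fixed-size structure.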
Having data in an RDBMS means that you can shift and swivel it around very quickly. Depending on your hardware, vendor, and table/indexing strategies, much of the data you actually query ends up in the buffer pool, held in RAM, which negates (or at least greatly reduces the significance of) the SSD vs. spindles discussion.
And if you decide six months from now that you want to make candles out of the whole mess and data mine MACD signals, it would take you ~10 minutes to write the query. Fire it off, grab some coffee, and your results will be in front of you.
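As a rough sketch of what that kind of query can look like, assume a hypothetical `ticks(symbol, ts, price, size)` table with integer epoch-second timestamps and at most one tick per symbol per timestamp. The example uses SQLite through Python's `sqlite3` purely so it is self-contained. The table, column, and function names are made up for illustration, and the MACD uses the conventional 12/26/9 periods with each EMA seeded from the first value.

```python
import sqlite3

# Bucket ticks into fixed-width candles. Open/close are looked up by
# joining back on the first and last timestamp within each bucket.
CANDLE_SQL = """
WITH b AS (
    SELECT symbol,
           ts / :w    AS bucket,
           MIN(ts)    AS first_ts,
           MAX(ts)    AS last_ts,
           MAX(price) AS high,
           MIN(price) AS low,
           SUM(size)  AS volume
    FROM ticks
    GROUP BY symbol, ts / :w
)
SELECT b.symbol, b.bucket * :w AS bucket_start,
       o.price AS open, b.high, b.low, c.price AS close, b.volume
FROM b
JOIN ticks o ON o.symbol = b.symbol AND o.ts = b.first_ts
JOIN ticks c ON c.symbol = b.symbol AND c.ts = b.last_ts
ORDER BY b.symbol, bucket_start
"""


def create_schema(conn):
    # Hypothetical schema; the primary key enforces one tick per (symbol, ts).
    conn.execute(
        "CREATE TABLE ticks ("
        " symbol TEXT NOT NULL, ts INTEGER NOT NULL,"
        " price REAL NOT NULL, size INTEGER NOT NULL,"
        " PRIMARY KEY (symbol, ts))"
    )


def build_candles(conn, width_seconds):
    """Return (symbol, bucket_start, open, high, low, close, volume) rows."""
    return conn.execute(CANDLE_SQL, {"w": width_seconds}).fetchall()


def ema(values, period):
    """Exponential moving average, seeded with the first value."""
    alpha = 2.0 / (period + 1)
    out = []
    for v in values:
        out.append(v if not out else out[-1] + alpha * (v - out[-1]))
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line) for a list of closing prices."""
    line = [f - s for f, s in zip(ema(closes, fast), ema(closes, slow))]
    return line, ema(line, signal)
```

Feeding the `close` column of `build_candles(conn, 3600)` for one symbol into `macd` gives hourly MACD and signal lines. On a server-class RDBMS the same bucketing idea applies, though you would likely use that vendor's date functions and window functions instead of the self-joins.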