After some more research I've realised that a NoSQL document store isn't the best fit for time series data. (Document stores are great for storing events, but that's another subject.)
The way I see it right now, there are three options:
- traditional relational database (SQL)
- specific time series database such as InfluxDB
- roll your own binary database
Of those three options I'm thinking of going with the venerable MySQL. It's been around for a long time and can be accessed from many languages, such as R and Python, as well as from other applications, so it's a flexible, robust, multi-purpose solution. I'd use that database purely for backtesting, where I'd hold the data from TickData, IQFeed etc.
For the 'operational market data', i.e. the data actually being traded on, I'd probably just download the last 1000 bars or so from the broker to update the indicators and get the algos running, then aggregate bars from the incoming ticks/quotes (over FIX) from there. That data would be discarded when the application closes, so the persistent market data wouldn't be polluted by it.
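A minimal sketch of the in-memory side, to make the idea concrete. Everything here is an assumption for illustration: the `Bar` and `BarAggregator` names are hypothetical, ticks are assumed to arrive as a timestamp, price and size, and bars are 1-minute. It doesn't touch FIX or any broker API; it only shows the rolling window of recent bars and the tick-to-bar aggregation.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Bar:
    start: datetime  # bar open time, floored to the minute
    open: float
    high: float
    low: float
    close: float
    volume: float


class BarAggregator:
    """Builds 1-minute bars from ticks, keeping only the latest max_bars in memory."""

    def __init__(self, max_bars=1000):
        # A bounded deque drops the oldest bar automatically once it is full.
        self.bars = deque(maxlen=max_bars)
        self.current = None  # the bar still being built

    def seed(self, history):
        """Load the bars downloaded from the broker at startup (oldest first)."""
        self.bars.extend(history)

    def on_tick(self, ts, price, size):
        """Update the bar in progress; return the completed bar when the minute rolls over."""
        start = ts.replace(second=0, microsecond=0)
        finished = None
        if self.current is not None and start > self.current.start:
            # The tick belongs to a later minute: close out the current bar.
            finished = self.current
            self.bars.append(finished)
            self.current = None
        if self.current is None:
            self.current = Bar(start, price, price, price, price, size)
        else:
            # Same minute (late ticks are simply folded into the current bar).
            bar = self.current
            bar.high = max(bar.high, price)
            bar.low = min(bar.low, price)
            bar.close = price
            bar.volume += size
        return finished
```

Since nothing here is written to the database, the window simply disappears when the process exits, which matches the idea of keeping the operational data separate from the persistent backtesting data.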
Does anyone have any experience with using MySQL for market data? I tend to work only with minute bars and aggregate them into 5 min, 15 min, 30 min etc. bars from there, depending on the strategy.
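A minimal sketch of that kind of aggregation, assuming minute bars are held as `(start, open, high, low, close, volume)` tuples sorted by time. The `resample` function and the tuple layout are hypothetical, chosen only for illustration; the same grouping could equally be done in SQL or with a dataframe library.

```python
from datetime import datetime


def resample(bars, minutes):
    """Aggregate sorted minute bars into N-minute bars.

    Buckets are aligned to the top of the hour, so `minutes` should
    divide 60 evenly (5, 15, 30, ...).
    """
    out = []
    for start, o, h, l, c, v in bars:
        # Floor the bar's start time to the beginning of its bucket.
        bucket = start.replace(
            minute=start.minute - start.minute % minutes,
            second=0,
            microsecond=0,
        )
        if out and out[-1][0] == bucket:
            # Same bucket: keep the first open, extend high/low, take the latest close.
            _, po, ph, pl, _, pv = out[-1]
            out[-1] = (bucket, po, max(ph, h), min(pl, l), c, pv + v)
        else:
            out.append((bucket, o, h, l, c, v))
    return out
```

Calling `resample(minute_bars, 5)`, `resample(minute_bars, 15)` and so on produces each timeframe from the same stored minute data, so only the minute bars need to be persisted.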