Hi looking for some suggestions for how to store minute resolution data in memory for a better performance for backtesting. I am writing my own event-based backtesting framework in python, just for fun. The data is read into memory as a whole, stored as a pandas dataframe and then indexing will be performed with .loc[] on this dataframe during the backtest main loop.
The dataframe has two levels of multiindex, the datetime level and the symbol level. And on every time change, I will have to perform an indexing operation to get the historical 1 min data for certain stocks in certain time periods(the time periods is monotonically increasing).
My main issue is that the pandas indexing/slicing operation is way too slow and I know there is no pefect remedy for this operation.
So I wonder if there is any suggested alternatives for those kinds of purposes, maybe I should not store my minute data as a pandas dataframe. Is numpy ndarray or structured array a better choice? But then there will be no easy solution for time series indexing.
The dataframe has two levels of multiindex, the datetime level and the symbol level. And on every time change, I will have to perform an indexing operation to get the historical 1 min data for certain stocks in certain time periods(the time periods is monotonically increasing).
My main issue is that the pandas indexing/slicing operation is way too slow and I know there is no pefect remedy for this operation.
So I wonder if there is any suggested alternatives for those kinds of purposes, maybe I should not store my minute data as a pandas dataframe. Is numpy ndarray or structured array a better choice? But then there will be no easy solution for time series indexing.