I have scraped a dataframe before of the Russell 3000 tickers and then had a column of true/false if that ticker is part of a major index. Then it is trivial to create some functions to return whatever index you want.
From doing this though I realize I don't really care about the bottom 2000 tickers and just look at IWM for that. Then I do care about the Russell 1000 and what ticker is part of what major index. More interesting is what is part of the Russell 1000 but not in the S&P 500.
I just don't see the point of SQL here. A more robust solution to a problem you don't really have.
I just think it is hard to beat a query on dataframes with a datetime index if the analysis is going to end up in a dataframe anyway.
I haven't worked with IB API though. Anything I have used has been trivial to get all symbols but I could see all symbols on IB being a bit much to say the least. Such a huge percentage of that though is going to be illiquid instruments that you will never bother with.
Based on the thread though I am still not sure what you are looking for exactly.
good moment to take care for IWV, a boolean or a 1-0 equivalent is the way to go for databases or whatever is your upstream source of data
I do not know how much you are familiar with algorithms and data structures, every use case depends, arrays, dataframes, list, push/pop, graphs, SQL,no SQL, LinkedList. Is all about big O
a dataframe vs a database are two different kind of beasts. Let's say you have ohlcv and then you want 30 indicators, and fundamental data, and you need to operate so called joins to merge two or more tables and want also to store performance, would you still more comfortable with datasets?
I would not be so comfortable if you need to establish relationships, if you need proven ways for querying and analysis on data aggregations, scalability if you have REALLY large volumes of data, optimization mechanisms, consistency of the data, atomicity, integrity.
If you need simple operations, and one is not able or do not see profitable to optimize SQL queries then storing CSV files can be a better solution, especially in terms of storage space(as you will avoid metadata)
There is also the streaming option, but for a retail use case, no HFT, does not make any sense imo.