I am looking for a database solution to query my EOD stocks and options data.
Below is a description of my data set and of my requirements for the DB system.
It would be great if you could suggest a DB system which might be suitable for this purpose. Thank you very much in advance.
DESCRIPTION OF THE DATA SET:
-----------------------------------------
STOCKS DATA TABLE:
- Only EOD data
- Contains EOD prices & volume data for ~10,000 stocks --> about 30,000,000 records in total
- About 15 years of EOD data
OPTIONS DATA TABLE:
- Only EOD data
- Options on about 5,000 underliers
- Data set contains ~100,000,000 record (--> a records is one quote per option per underlyer per day)
- About 10 years of EOD data
REQUIREMENTS FOR DATABASE SYSTEM:
---------------------------------------------------
- Integrates well with Python or MATLAB: Easy to read data into Python or MATLAB (very important) and easy to write from these programs back into the DB (a bit less important)
- Integrates well with Excel (very important): Ideally there is already an Excel add-in which adds functions to Excel which allow me to read data directly from the database into the relevant cell / array in Excel (similar to what the Bloomberg [such as =BDP, =BDH] and Reuters add-in are doing)
- Database does not require a lot of "low level" work and maintenance
- Ideally the database is free and open source
- Performance is not a major issue: Since I am only using EOD data, performance is not my major concern, the above mentioned integration with Excel/Python/MATLAB is more important. Nevertheless, the DB should be performant enough to run "SELECT" queries (such as filtering the options data set by the underlying stock ID) in a reasonable amount of time (<10 sec).
- It's fine if the DB is non-SQL: Given my bad experiences with SQL databases (MySQL and MS SQL Server) that I have used for far for this purpose (SELECT queries were extremely slow on the ~100m records options data set, some times took >30 min to execute), I would be happy to also consider non-SQL solutions.
- DB can store multiple fields for the same time series: For example in my stock price table I would like to be able to store both price and volume data for the same day.
Below is a description of my data set and of my requirements for the DB system.
It would be great if you could suggest a DB system which might be suitable for this purpose. Thank you very much in advance.
DESCRIPTION OF THE DATA SET:
-----------------------------------------
STOCKS DATA TABLE:
- Only EOD data
- Contains EOD prices & volume data for ~10,000 stocks --> about 30,000,000 records in total
- About 15 years of EOD data
OPTIONS DATA TABLE:
- Only EOD data
- Options on about 5,000 underliers
- Data set contains ~100,000,000 record (--> a records is one quote per option per underlyer per day)
- About 10 years of EOD data
REQUIREMENTS FOR DATABASE SYSTEM:
---------------------------------------------------
- Integrates well with Python or MATLAB: Easy to read data into Python or MATLAB (very important) and easy to write from these programs back into the DB (a bit less important)
- Integrates well with Excel (very important): Ideally there is already an Excel add-in which adds functions to Excel which allow me to read data directly from the database into the relevant cell / array in Excel (similar to what the Bloomberg [such as =BDP, =BDH] and Reuters add-in are doing)
- Database does not require a lot of "low level" work and maintenance
- Ideally the database is free and open source
- Performance is not a major issue: Since I am only using EOD data, performance is not my major concern, the above mentioned integration with Excel/Python/MATLAB is more important. Nevertheless, the DB should be performant enough to run "SELECT" queries (such as filtering the options data set by the underlying stock ID) in a reasonable amount of time (<10 sec).
- It's fine if the DB is non-SQL: Given my bad experiences with SQL databases (MySQL and MS SQL Server) that I have used for far for this purpose (SELECT queries were extremely slow on the ~100m records options data set, some times took >30 min to execute), I would be happy to also consider non-SQL solutions.
- DB can store multiple fields for the same time series: For example in my stock price table I would like to be able to store both price and volume data for the same day.