Hi folks.
Looking for a software/database/method to accurately and easily work with a data set of upwards of 200 million rows x 3-5 parameter columns.
Looking to essentially build a data set of *empty* tick bars (just open and close price). Building empty 1500-tick bars equates to pulling out and saving every 1500th and 1501st row (each bar's close and the next bar's open). This reduces a 200 million row x 3 column raw tick data set to ~266,666 rows x 3 columns.
Example:
Row, Date, Time, Price
Row1 4/1/2010 08:30:05 1202.25 <- Save
Row2 4/1/2010 08:30:05 1202.50 <- Remove
Row3 4/1/2010 08:30:05 1202.25 <- Remove
..... <- Remove
..... <- Remove
Row1500 4/1/2010 08:31:02 1201.50 <- Save
Row1501 4/1/2010 08:31:02 1201.50 <- Save
......
......
Row3000 4/1/2010 08:31:58 1201.25
Row3001 4/1/2010 08:31:58 1201.00
.....
.....
Row4500
Row4501
.....
..... etc
Row150,000,000
Row150,000,001
Would like the final output formatted as follows:
Date, Time (Bar Close), Open, Close
4/1/2010 08:31:58 1201.50 (Row1501 Price) 1201.25 (Row3000 Price)
^This format will eventually make its way to Matlab.
Can Tickdata's Tickwrite script help here?
Want to be able to easily change the removal criterion so that I can build empty 750, 1000, 2500, etc. tick bars from the raw ticks.
Data is in either ASCII format or a txt file from the original source.
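To make the request concrete, here is a rough Python sketch of the reduction I have in mind, not a finished tool. It streams the file one line at a time, so the 200 million rows never need to fit in memory. It assumes each line is whitespace-separated and ends with Date, Time, Price (a leading row label like "Row1" is ignored), and that the file has no header line. The names `tick_bars`, `write_bars`, and `bar_size` are made up for illustration, and a trailing partial bar is simply dropped.

```python
from typing import Iterable, Iterator, Tuple

Bar = Tuple[str, str, str, str]  # (date, bar close time, open price, close price)


def tick_bars(lines: Iterable[str], bar_size: int = 1500) -> Iterator[Bar]:
    """Yield one empty (open/close only) bar per `bar_size` ticks."""
    if bar_size < 1:
        raise ValueError("bar_size must be at least 1")
    count = 0          # ticks seen so far in the current bar
    open_price = ""    # price of the first tick in the current bar
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue   # skip blank or malformed lines
        # Assumption: the last three fields are Date, Time, Price.
        date, time, price = fields[-3:]
        count += 1
        if count == 1:
            open_price = price
        if count == bar_size:
            # Date and time come from the tick that closes the bar.
            yield date, time, open_price, price
            count = 0


def write_bars(src_path: str, dst_path: str, bar_size: int = 1500) -> None:
    """Stream raw ticks from src_path and write the bars as CSV to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.write("Date,Time,Open,Close\n")
        for bar in tick_bars(src, bar_size):
            dst.write(",".join(bar) + "\n")
```

Switching bar sizes would then just be a matter of changing one argument, e.g. `write_bars("ticks.txt", "bars_750.csv", bar_size=750)` (file names hypothetical), and the resulting plain CSV should be straightforward to import into Matlab.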
Any input much appreciated.