Searching historical data in Sql

Kevin Schmit · Feb 17, 2008

Quote from black diamond:4)sql for smarties by selko is/was a good reference ... [/B]

That would be Celko. With a "c."

morphtrade · Feb 17, 2008

I may look at the option of saving historical data in binary files but will it not take long time to load a big binary into memory and searching through them unless there is a tool to stream binary data (without loading a whole file) and query them. I would definitely like to explore that area if someone points me to right tools or ideas.
Table structure for one min data:
[Date] [Time] [Open] [High], [Low], [Close], [Adjusted_Close], [MA], [DI].....

Tick Data:
[Date] [Time] [Trade]
I have clustered index on [Time] and [Date] columns as they are primary key.
Most time consuming queries are with lots of inner joins. So for example if I have to compare first few mins data then I have to do inner join like:

With IntervalData AS
(
SELECT [Date], Sum(CASE WHEN 1430 = [Time] THEN [PriceRange] END) AS '1430',
Sum(CASE WHEN 1431 = [Time] THEN [PriceRange] END) AS '1431',
Sum(CASE WHEN 1432 = [Time] THEN [PriceRange] END) AS '1432'
FROM [INDU_1] GROUP BY [Date]
)
--Select * from IntervalData order by [Date] DESC;
SELECT [Date] ,[1430], [1431], [1432], [1431] - [1430] As 'Range' from IntervalData
WHERE ([1430] > 0 AND [1431] < 0 AND [1432] < 0) OR ([1430] < 0 AND [1431] > 0 AND [1430] > 0)
------------------------------------------------------------------------

select ind1.[Time], ind1.PriceRange,ind2.[Time], ind2.PriceRange from INDU_1 ind1
INNER JOIN INDU_1 ind2 ON ind1.[Time] = ind2.[Time] - 1 AND ind1.[Date] = ind2.[Date]
where (ind1.[Time] = 2058) AND ((ind1.PriceRange > 0 AND ind2.PriceRange >0) OR (ind2.PriceRange < 0 AND ind1.PriceRange < 0))
ORDER BY ind1.[Date] DESC;

This query might end up with quite a few inner joins if you have to compare several mins of data and will severely effect execution time.

And for that reason, I might have to start looking at time series database. Any suggestion on which one I should start with?

ktmexc20 · Feb 17, 2008

Quote from shortorlong:

Can someone point me to a good vector DBMS? I cant seem to find anything. Wrong keyword perhaps?

Here's what I would argue is the best. It's used by very many in scientists (ie: remote sensing satellite data) as well as the engineering and aerospace industries. Not to mention me...

HDF5

andread · Feb 17, 2008

Quote from ktmexc20:

Here's what I would argue is the best. It's used by very many in scientists (ie: remote sensing satellite data) as well as the engineering and aerospace industries. Not to mention me...

HDF5

doesn't really say that much, does it? What's this technology they talk about?

rosy2 · Feb 17, 2008

Quote from ktmexc20:

Here's what I would argue is the best. It's used by very many in scientists (ie: remote sensing satellite data) as well as the engineering and aerospace industries. Not to mention me...

HDF5

i use the python interface to this. its fast enough for me and i can slice and dice as needed. for a commercial product look at

http://kx.com/

ktmexc20 · Feb 17, 2008

Quote from andread:

doesn't really say that much, does it? What's this technology they talk about?

Search around the site a little bit. Actually its a relatively new web-site and I didn't give you the home page. Here it is: HDF Group. Oh and of course, if I use it, it's Open-Source as well...

______________________________
Support OpenSource, OpenAccess, and OpenStandards.

-kt

Searching historical data in Sql

Kevin Schmit

morphtrade

ktmexc20

andread

rosy2

ktmexc20