How do I handle historical data for my C# program?

NetTecture · Jan 30, 2013

Quote from SpaceCuddle:

Thanks for your reply, bluematrix, and to others as well.

The reason I don't use lists is that I have a lot of historical data and I need to be able to access any part of it reasonably fast. A list would be very slow, I would think, unless I need all the data starting from element one, because a list has to iterate over itself to get to any element.

Additionally I never have to add or remove anything so I guess I'm doing things differently from what you have in mind.

About storing the data in a database. Originally I had thought that one would repeatedly ask the database for more data once it would be needed, but I now realize that would be incredibly stupid and slow. Of course one should (as I do) have internal storage for the data like in an array or list. One just loads the data in first via a database, which would be nice. So I might do that.

I'm not a fan of Python in the slightest. But I'm looking at R and I'm taking 2 Coursera classes on R and data analysis. It's just to get a feel for different industries to learn from their preferred methods.

Anyway I'm not sure where I'm going with this and I don't think I need more help. There is no magic solution, I'm just going to remove all error prone manual stuff and error prone code and implement something better that matches time frames for me.

Thanks everyone!

A database is not only slow, it is also incredibly space wasting. Once you deal with real data amounts, that is a serious difference.

For example, I use Nanex tapes. A day has about 1.5 GB of data; sadly, in SQL form it uses about 50 GB, roughly 33 times as much. HUGE difference.

So we extract the tapes into files with the data we need, using a highly compressing, read-speed-optimized binary format (variable length; a trade can be down to one byte, and that includes tick-accurate timestamp, volume, and price). Then we run along the files in playback etc. Keep in memory only what is needed: especially when you do algo, the need for historical access is close to zero (except loading data, but you never buffer it).
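
The playback idea can be sketched roughly like this in C#. It is only an illustration under assumptions: the `Tick` struct, the `TickPlayback` class, and the fixed 16-byte record layout are hypothetical placeholders, not the compressed format described here. The point is that a lazy iterator hands over one event at a time and then forgets it, so memory use stays flat no matter how large the file is.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical record type; the field names are assumptions for illustration.
public readonly struct Tick
{
    public readonly long Time;   // timestamp in feed ticks
    public readonly int Price;   // price in integer ticks
    public readonly int Volume;

    public Tick(long time, int price, int volume)
    {
        Time = time;
        Price = price;
        Volume = volume;
    }
}

public static class TickPlayback
{
    // Lazily yields one record at a time; nothing beyond the current tick is buffered.
    // Assumes a seekable stream (e.g. a FileStream) made of fixed 16-byte records.
    public static IEnumerable<Tick> Read(Stream input)
    {
        using (var reader = new BinaryReader(input))
        {
            while (input.Position < input.Length)
            {
                long time = reader.ReadInt64();
                int price = reader.ReadInt32();
                int volume = reader.ReadInt32();
                yield return new Tick(time, price, volume);
            }
        }
    }
}
```

A strategy would then consume it with something like `foreach (var tick in TickPlayback.Read(File.OpenRead(path)))`, handling each tick as it arrives instead of loading the whole day first.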

Pippi436 · Jan 30, 2013

1 byte? Sounds like an awesome format. Care to share some details?

NetTecture · Jan 30, 2013

Quote from Pippi436:

1 byte? Sounds like an awesome format. Care to share some details?

It is not so hard.

Byte 1: 2 x 4 bits.

First 4 bits: packet type, which is trade
Second 4 bits: option length, which is 0

Decoding: trade, same timestamp, same price, same volume as the last one (which happens quite often).

Optional fields are:
* Time delta (in "ticks", predefined in another message; our feed has 25 ms there as granularity)
* Price (as an integer in ticks, again predefined; this is a delta, and 1 byte is normally enough)
* Volume, again normally as a delta
* Security (to change between securities)

Not that hard. You have to understand that most things are not jumping around like mad. Depending on the market, it is quite normal to get similarly priced events quite often. Decoding is cheap.
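
A rough C# sketch of the idea, not the actual production code. Several things here are assumptions made for illustration: the names `Trade` and `TapeCodec`, the packet type code `1` for a trade, the "first 4 bits" being the high nibble, the 4 option bits being treated as one presence flag per optional field (the real format may use them as a length instead), and every delta fitting into a single signed byte.

```csharp
using System;
using System.IO;

public struct Trade
{
    public long Time;      // in feed ticks (e.g. 25 ms units)
    public int Price;      // in integer price ticks
    public int Volume;
    public byte Security;  // hypothetical small security id
}

public static class TapeCodec
{
    const int TradePacket = 1; // assumed packet type code
    // Assumed meaning of the 4 option bits: one presence flag per optional field.
    const int HasTime = 1, HasPrice = 2, HasVolume = 4, HasSecurity = 8;

    // Writes one trade as deltas against the previous trade.
    public static void Write(Stream s, Trade prev, Trade cur)
    {
        int flags = 0;
        if (cur.Time != prev.Time) flags |= HasTime;
        if (cur.Price != prev.Price) flags |= HasPrice;
        if (cur.Volume != prev.Volume) flags |= HasVolume;
        if (cur.Security != prev.Security) flags |= HasSecurity;

        s.WriteByte((byte)((TradePacket << 4) | flags));
        if ((flags & HasTime) != 0) s.WriteByte(Delta(cur.Time - prev.Time));
        if ((flags & HasPrice) != 0) s.WriteByte(Delta((long)cur.Price - prev.Price));
        if ((flags & HasVolume) != 0) s.WriteByte(Delta((long)cur.Volume - prev.Volume));
        if ((flags & HasSecurity) != 0) s.WriteByte(cur.Security);
    }

    // Reads one trade; fields that are not present carry over from the previous trade.
    public static Trade Read(Stream s, Trade prev)
    {
        int header = ReadByteOrThrow(s);
        if ((header >> 4) != TradePacket)
            throw new InvalidDataException("Not a trade packet.");
        int flags = header & 0x0F;

        Trade cur = prev; // struct copy
        if ((flags & HasTime) != 0) cur.Time += ReadDelta(s);
        if ((flags & HasPrice) != 0) cur.Price += ReadDelta(s);
        if ((flags & HasVolume) != 0) cur.Volume += ReadDelta(s);
        if ((flags & HasSecurity) != 0) cur.Security = (byte)ReadByteOrThrow(s);
        return cur;
    }

    // Sketch limitation: a delta must fit into one signed byte.
    static byte Delta(long d)
    {
        if (d < sbyte.MinValue || d > sbyte.MaxValue)
            throw new ArgumentOutOfRangeException(nameof(d), "Delta does not fit in one byte.");
        return unchecked((byte)(sbyte)d);
    }

    static int ReadDelta(Stream s)
    {
        return unchecked((sbyte)(byte)ReadByteOrThrow(s));
    }

    static int ReadByteOrThrow(Stream s)
    {
        int b = s.ReadByte();
        if (b < 0) throw new EndOfStreamException();
        return b;
    }
}
```

With this layout, a trade identical to the previous one is written as the single header byte, and a trade that differs only in price by a few ticks takes two bytes. A real format would also need an escape for deltas larger than one byte, which this sketch leaves out.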
