Would anybody like to collaborate on building an options backtest platform?

Quote from opt789:

Look at SPX for 9-23-03, 7-5-05 through 7-13-05, and 12-27-05: the data is missing, they don’t have it. Look at the top of the screen: it says “No Data”, and they just fill in the previous day’s data, so if you aren’t paying attention you will use fake data and not realize it.
And on random days like 6-24-04 they are missing a whole bunch of the strikes.
If you used this data then you must not have checked it very well, or you don’t need consistent and complete data for your tests.
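
The two problems described above (a day silently filled with the previous day's quotes, and a day with most strikes missing) can be screened for mechanically. The following is only a minimal sketch: the (date, strike, bid, ask) row layout, the function name audit_eod_chain, and the 50% strike threshold are all assumptions for illustration, not the vendor's actual format. It also cannot flag dates that are absent from the file altogether.

```python
from collections import defaultdict
from statistics import median

def audit_eod_chain(rows, min_strike_ratio=0.5):
    """Flag suspicious days in EOD option data.

    rows: iterable of (date, strike, bid, ask) tuples, with dates as
    ISO strings such as "2004-06-24" (a hypothetical layout).
    Returns (stale_days, thin_days).
    """
    by_day = defaultdict(dict)
    for date, strike, bid, ask in rows:
        by_day[date][strike] = (bid, ask)

    days = sorted(by_day)  # ISO date strings sort chronologically
    typical = median(len(by_day[d]) for d in days)

    # A day whose whole chain equals the previous day's is probably a fill-in.
    stale_days = [cur for prev, cur in zip(days, days[1:])
                  if by_day[cur] == by_day[prev]]

    # A day with far fewer strikes than usual is probably incomplete.
    thin_days = [d for d in days
                 if len(by_day[d]) < min_strike_ratio * typical]
    return stale_days, thin_days
```

Anything this flags still has to be checked by hand against the vendor's screen; the threshold is arbitrary and a chain can legitimately shrink around expiration.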

I am not sure what you are asking about tick data. The EOD data on SPX alone from 2003 to 2010 is well over 100MB.

To optimize an option trading idea you can’t have preconceived notions; you have to check the various possibilities and figure out what has the best risk/reward ratio for your personal trading preferences and tolerances. That means you have to check many different strikes and various months: which option spread do you trade, when do you trade it, how far away is it, how do you hedge it (with options or underlying) and when do you hedge it, do you roll, are you always in a trade or just sometimes, how are stops and fast markets handled? These are just a small sample of the various questions that have to be tested to optimize a trading idea. So back testing just on EOD data takes searching through a lot of data many, many, many times as the very large, multidimensional matrix of possibilities is tested.

I honestly don’t know how much data it is for a decade of SPX option data that has each and every intraday bid and ask change of every option. If we say the EOD data is about 150MB and you take just one snapshot every minute, then at 405 minutes per day you would have about 60 gigabytes of data for a decade of SPX options. I can search through a decade of EOD SPX data in Excel VBA in a matter of milliseconds because I can load it all into memory. Loading 100MB and searching it in just a few milliseconds is no big deal, but multiply that by 405 and that is a different story, and that is just for 1-minute data.
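
The 60 gigabyte figure is just the EOD size multiplied by the number of one-minute snapshots per day. A quick sketch of that arithmetic, assuming decimal units (1 GB = 1000 MB) and that each snapshot is about as large as one EOD record set; the function name is made up for illustration:

```python
EOD_SIZE_MB = 150        # rough size of a decade of EOD SPX option data (from the post)
MINUTES_PER_DAY = 405    # one snapshot per minute of the trading session (from the post)

def snapshot_size_gb(eod_mb=EOD_SIZE_MB, snapshots_per_day=MINUTES_PER_DAY):
    """Estimate total size in GB if every EOD record becomes N intraday records."""
    return eod_mb * snapshots_per_day / 1000

print(snapshot_size_gb())  # 60.75
```

True tick data, with every bid and ask change, would be larger than this one-minute estimate by an amount the post does not try to guess.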

Thanks for the heads-up. I used RUT and NDX data, but I'll check to see if I have the same difficulties. I was using 2005 through 2009 data.

I generally do my back-testing with a PHP application aimed at specific strategies. Sometimes I leave the data in CSV files, but usually I populate a database. My only significant penalty for data set size is execution time. Although I used to be pretty good at VB6, it never really occurred to me to do it in VBA.
 
Quote from opt789:

Mizhael, the way you phrase your questions makes me wonder how much real life option trading you have done. As a former market maker and then firm trader that did over 100k options a month, I still don’t think I know a lot about options. There is so much to know and so many possibilities and nuances, that you have to be ridiculously smart and experienced to say you know a lot. If I was twice as smart and experienced as riskarb/atticus then I might say I know a thing or two about options. My point is, if you are trying to back test something you should have first hand direct experience trading that underlying's options in various types of markets (slow, fast, low vol, high vol, panic, etc.) before you can begin to hope to have useful insights into what the back testing is telling you. The midpoint of EOD bid/ask from a reliable source gives you the official mark of the option, and your experience trading those options will tell you where a reasonable fill can be obtained in a given market condition.

Hi opt789,

Thanks for your invaluable contribution to this thread as a former options MM and then firm trader. It's great to have you here!

I honestly admit I don't know much about options, other than a few real money trades.

But when I tested a bit, I found the bid-ask spread is so wide that assuming mid-price will turn a loss into a profit, which is unrealistic...
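
To make the effect concrete, here is a minimal sketch of the same long-option round trip priced two ways: filled at the midpoint on both legs, or crossing the spread on both legs (buy at the ask, sell at the bid). The quotes in the usage lines are made-up numbers, and the function name and the 100 multiplier are assumptions for illustration.

```python
def round_trip_pnl(entry, exit_, fill="mid", multiplier=100):
    """P&L of buying one option at the entry quote and selling at the exit quote.

    entry, exit_: (bid, ask) tuples.
    fill: "mid" assumes midpoint fills; "cross" buys the ask and sells the bid.
    """
    entry_bid, entry_ask = entry
    exit_bid, exit_ask = exit_
    if fill == "mid":
        buy = (entry_bid + entry_ask) / 2
        sell = (exit_bid + exit_ask) / 2
    elif fill == "cross":
        buy, sell = entry_ask, exit_bid
    else:
        raise ValueError("fill must be 'mid' or 'cross'")
    return (sell - buy) * multiplier

# Hypothetical quotes: the option gains 0.25 at the mid, with a 0.50-wide market.
print(round_trip_pnl((2.00, 2.50), (2.25, 2.75), fill="mid"))    # 25.0
print(round_trip_pnl((2.00, 2.50), (2.25, 2.75), fill="cross"))  # -25.0
```

A realistic fill usually sits somewhere between these two extremes, so running a backtest under both assumptions at least brackets the result.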
 
Quote from GTG:

A few years back I backtested some simple ideas using some very low quality EOD option price data. I don't exactly remember the results, but the main thing I remember was that most of the time I could take an entry/exit strategy that backtested profitably in the underlying, but it would backtest at a loss using the options data unless I assumed that the bid/offer spread was not crossed. In fact, the bid/offer spreads were so wide that I could take a fairly unprofitable backtest in the underlying and make it profitable by running it on the options data and assuming the bid/offer spread was not crossed... so my main take from those backtests was that it must be nice to be an options market maker. I also had a sense that what I really needed was some intraday data. When I inspected the EOD data, it seemed that the spreads were wider than what I typically saw for options in those stocks during actual market hours. I think that just taking one sample a day, say five minutes before the close, would be much superior to the EOD prices. Unfortunately, I think to get any intraday data in options you are going to have to collect it yourself.

So the question is where to find long history of intraday data...
 
Quote from mizhael:

So the question is where to find long history of intraday data...

No, this is easy. I have a couple of resources for this.

Willing to pay four digits PER MONTH?

The real question is where to get CHEAP data, unless you are willing to pay.
 
I've been collecting various futures and equity data for about 1-2 years now, depending on the contract. I've also purchased several years of some futures contracts. I've been collecting options tick data for about a year on a few contracts, including SPY. I think with options, what is more valuable is the bid/ask, which I haven't been collecting.

My database is about 200 GBs at this point, and grows about 100 GB/yr, but I've started collecting a lot more data, so I've been growing a lot faster in the past 6 months. Something like the ES is pretty large but most other contracts aren't much more than about 5 GB/year.

Cheap tick data can be found at tickdata.com for about $100/year.
 
Quote from jedwards:

I've been collecting various futures and equity data for about 1-2 years now, depending on the contract. I've also purchased several years of some futures contracts. I've been collecting options tick data for about a year on a few contracts, including SPY. I think with options, what is more valuable is the bid/ask, which I haven't been collecting.

My database is about 200 GBs at this point, and grows about 100 GB/yr, but I've started collecting a lot more data, so I've been growing a lot faster in the past 6 months. Something like the ES is pretty large but most other contracts aren't much more than about 5 GB/year.

Cheap tick data can be found at tickdata.com for about $100/year.

Does it have a long history and good quality for options on futures?
 
NxCore has complete options intraday historical data for $350 per month of historical data. Cheapest intraday options data I have found so far.
 
Quote from atticus:

NxCore is good, but backtesting on options-data is really a waste of time.

Thanks Atticus, could you expand at all on why that is? I've been planning to backtest using some protective puts/calls in place of stop losses on big indexes like the SPY, thinking that bid/ask spreads on higher volume options wouldn't be too bad. Our developer is still in early stages of designing/building the backtester though, and we haven't really gotten involved with options data yet.
 
Quote from GoldStandard:

Thanks Atticus, could you expand at all on why that is? I've been planning to backtest using some protective puts/calls in place of stop losses on big indexes like the SPY, thinking that bid/ask spreads on higher volume options wouldn't be too bad. Our developer is still in early stages of designing/building the backtester though, and we haven't really gotten involved with options data yet.

More complexity than it's worth due to curvature(s). Best to backtest the underlying and apply a vol-trade to a vol or directional forecast.
 