DATA FEED: Totally Shocked!!! Who Can You Trust???

Hello, All:

I was truly SHOCKED to find the great disparity between the 1-minute data from DTN (IQFeed) and that from Quote.com/QCharts.com.

For example, for AAPL (Apple Inc.), you get the following from DTN:

Date,Time,Open,High,Low,Close,Volume
20070702,09:30:00,121.0500,121.1200,120.7000,120.9600,1091368
20070702,09:31:00,120.9700,121.5000,120.7200,121.5000,387602
20070702,09:32:00,121.5000,121.7200,121.0100,121.4200,359397
20070702,09:33:00,121.4100,121.4983,121.2500,121.3400,230790
20070702,09:34:00,121.3400,121.3600,120.5000,120.5150,441636
20070702,09:35:00,120.5000,120.7100,120.4100,120.4400,368037
20070702,09:36:00,120.4400,121.3900,119.8100,119.8400,582012
20070702,09:37:00,119.8400,120.0000,119.6600,119.8400,497708
20070702,09:38:00,119.8500,119.8600,119.3000,119.5500,440919
20070702,09:39:00,119.5700,120.0300,119.3400,119.9000,511771
20070702,09:40:00,119.9000,120.4500,119.8700,120.3300,400258
20070702,09:41:00,120.3300,120.3800,119.9300,120.2000,265160
20070702,09:42:00,120.2100,120.4200,120.1000,120.1500,168681
20070702,09:43:00,120.1400,120.2900,120.0500,120.1700,180477
20070702,09:44:00,120.1700,120.3400,120.1200,120.1800,190279
20070702,09:45:00,120.1700,120.2000,120.0000,120.0300,178249
20070702,09:46:00,120.0200,120.1300,119.9800,120.0800,201295
20070702,09:47:00,120.0800,120.4400,120.0200,120.3595,197620
20070702,09:48:00,120.3600,120.4000,120.3000,120.3500,154873
20070702,09:49:00,120.3500,120.8000,120.3500,120.8000,346926
20070702,09:50:00,120.8095,121.0000,120.4150,121.0000,299239

Now, compare that with the data covering the same stock and time period from Quote.Com, shown below:

20070702,09:30:00,121.08,121.12,120.7,120.94,203522
20070702,09:31:00,121.04,121.5,120.835,121.47,149807
20070702,09:32:00,121.497,121.72,121.01,121.3,134121
20070702,09:33:00,121.32,121.49,121.25,121.35,91263
20070702,09:34:00,121.35,121.41,120.56,120.61,152710
20070702,09:35:00,120.55,120.71,120.44,120.56,118052
20070702,09:36:00,120.5501,120.5501,119.83,119.84,229697
20070702,09:37:00,119.85,120,119.66,119.86,178190
20070702,09:38:00,119.87,119.87,119.3,119.399,204212
20070702,09:39:00,119.54,120.03,119.36,119.96,160439
20070702,09:40:00,119.97,120.45,119.88,120.39,143295
20070702,09:41:00,120.39,120.409,119.93,120.19,84366
20070702,09:42:00,120.18,120.42,120.1,120.13,86411
20070702,09:43:00,120.13,120.29,120.05,120.181,61818
20070702,09:44:00,120.18,120.339,120.13,120.18,79605
20070702,09:45:00,120.1895,120.2,120,120.018,45673
20070702,09:46:00,120.01,120.13,119.98,120.09,95859
20070702,09:47:00,120.09,120.42,120.03,120.4,56599
20070702,09:48:00,120.39,120.4,120.3,120.36,82845
20070702,09:49:00,120.36,120.8,120.35,120.8,139759
20070702,09:50:00,120.8,121,120.79,120.99,113031

Notice the big differences, especially in volume? Surprising, isn't it? Obviously, both can NOT be right. Or maybe BOTH are wrong?!
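
To put numbers on the gap, here is a minimal Python sketch that lines the two feeds up by timestamp and prints the volume ratio and the difference in close. It uses only the first three rows from each listing above. The names `load_bars` and `compare` are my own for illustration, not anything from DTN or Quote.com.

```python
import csv
import io

# First three 1-minute bars from each listing above (no header row).
DTN_CSV = """\
20070702,09:30:00,121.0500,121.1200,120.7000,120.9600,1091368
20070702,09:31:00,120.9700,121.5000,120.7200,121.5000,387602
20070702,09:32:00,121.5000,121.7200,121.0100,121.4200,359397
"""

QUOTE_CSV = """\
20070702,09:30:00,121.08,121.12,120.7,120.94,203522
20070702,09:31:00,121.04,121.5,120.835,121.47,149807
20070702,09:32:00,121.497,121.72,121.01,121.3,134121
"""

def load_bars(text):
    """Parse Date,Time,Open,High,Low,Close,Volume rows into a dict keyed by (date, time)."""
    bars = {}
    for date, time, o, h, l, c, v in csv.reader(io.StringIO(text)):
        bars[(date, time)] = {
            "open": float(o), "high": float(h),
            "low": float(l), "close": float(c), "volume": int(v),
        }
    return bars

def compare(a, b):
    """For bars present in both feeds, return (time, volume ratio a/b, close difference a-b)."""
    out = []
    for key in sorted(a.keys() & b.keys()):
        ratio = a[key]["volume"] / b[key]["volume"]
        close_diff = a[key]["close"] - b[key]["close"]
        out.append((key[1], ratio, close_diff))
    return out

rows = compare(load_bars(DTN_CSV), load_bars(QUOTE_CSV))
for time, ratio, close_diff in rows:
    print(f"{time}  volume DTN/Quote = {ratio:.2f}  close diff = {close_diff:+.3f}")
```

On these three rows DTN reports roughly 2.6 to 5.4 times the Quote.com volume, while the closes differ by 2 to 12 cents.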

I wonder why? Any ideas? Any explanations?

If you had to pick one or the other, which would you pick? Or is there a third option?

I'd like to see what users of other datafeeds find. Let's compare notes.

Regards
ET
 
If you compare any 2 data feeds you will get differences like these. Sometimes your data feed will be the right one and sometimes it won't. If it's 50/50 then you're okay. No data provider is always right or wrong.

Trust me I have been down this road before...
 
Quote from I$land:

If you compare any 2 data feeds you will get differences like these. Sometimes your data feed will be the right one and sometimes it won't. If it's 50/50 then you're okay. No data provider is always right or wrong.

Trust me I have been down this road before...

I don't expect a perfect match.

I'll even tolerate differences in O,H,L,C, if they are not too large.

I'll even tolerate up to 30% difference in volume.

But an error margin of 60% to more than 100% in volume is just too much.

So, we have huge error margins. The only solace would be if the error margins were at least consistent. If they're not, then the data is only worth a fraction of its purported value, and one doesn't even know how big that fraction is: 10%, 30%, 50%?
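
One rough way to test whether the error is at least consistent is to look at the spread of the per-minute volume ratio. The Python sketch below uses the 21 volume figures from the two listings in the first post. How to measure "difference" for the 30% tolerance is my own assumption here (relative to the larger of the two volumes), and other definitions would give other counts.

```python
from statistics import mean, pstdev

# Volume column of the DTN and Quote.com listings, 09:30 through 09:50.
DTN_VOL = [1091368, 387602, 359397, 230790, 441636, 368037, 582012,
           497708, 440919, 511771, 400258, 265160, 168681, 180477,
           190279, 178249, 201295, 197620, 154873, 346926, 299239]
QUOTE_VOL = [203522, 149807, 134121, 91263, 152710, 118052, 229697,
             178190, 204212, 160439, 143295, 84366, 86411, 61818,
             79605, 45673, 95859, 56599, 82845, 139759, 113031]

# Per-minute ratio of DTN volume to Quote.com volume.
ratios = [d / q for d, q in zip(DTN_VOL, QUOTE_VOL)]

avg = mean(ratios)
spread = pstdev(ratios)   # population standard deviation of the ratio
cv = spread / avg         # coefficient of variation: small means "consistent"

# Assumption: difference is measured relative to the larger of the two volumes.
TOLERANCE = 0.30
within = sum(
    1 for d, q in zip(DTN_VOL, QUOTE_VOL)
    if abs(d - q) / max(d, q) <= TOLERANCE
)

print(f"ratio min={min(ratios):.2f} max={max(ratios):.2f} mean={avg:.2f} cv={cv:.2f}")
print(f"bars within {TOLERANCE:.0%} tolerance: {within} of {len(ratios)}")
```

A small coefficient of variation would mean one feed is roughly a constant multiple of the other. Here the ratio runs from a bit under 2 to over 5, so the gap does not look like a fixed scale factor, and under the assumed definition none of the 21 bars falls inside the 30% tolerance.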

Regards.
 
Tradestation: I suspect TS is reporting the data at the close of each minute (e.g. the close of 0930 is reported as 0931), while the others are reporting the data as of the beginning of each minute.

Please note that if you start the TS data at 07/02/07, the volume will be zero for the first minute, but if you start the TS data at 06/29/07, the volume will be 1098451 for the first minute, presumably due to trades outside of RTH.

If the data is started at 06/29/07 (before the weekend), the first row is:
07/02/2007,0931,121.08,121.12,120.70,120.97,831018,267433,1098451.00

If the data is started at 07/02/07:
"Date","Time","Open","High","Low","Close","Up","Down","Volume"
07/02/2007,0931,121.08,121.12,120.70,120.97,831018,267433,0.00
07/02/2007,0932,120.96,121.50,120.88,121.50,218784,164185,382969.00
07/02/2007,0933,121.50,121.72,121.21,121.42,189072,169675,358747.00
07/02/2007,0934,121.41,121.45,121.25,121.34,98246,126553,224799.00
07/02/2007,0935,121.34,121.36,120.50,120.50,174740,267586,442326.00
07/02/2007,0936,120.50,120.71,120.41,120.44,160598,204964,365562.00
07/02/2007,0937,120.44,120.46,119.81,119.82,209969,342614,552583.00
07/02/2007,0938,119.82,120.00,119.66,119.84,266834,229209,496043.00
07/02/2007,0939,119.86,119.86,119.30,119.57,193214,242405,435619.00
07/02/2007,0940,119.57,120.03,119.34,119.88,279953,233018,512971.00
07/02/2007,0941,119.89,120.45,119.88,120.33,225920,172938,398858.00
07/02/2007,0942,120.33,120.38,120.05,120.20,122790,136640,259430.00
07/02/2007,0943,120.21,120.27,120.10,120.14,78711,90070,168781.00
07/02/2007,0944,120.14,120.29,120.05,120.17,91967,88610,180577.00
07/02/2007,0945,120.17,120.34,120.12,120.18,97036,93043,190079.00
07/02/2007,0946,120.17,120.20,120.00,120.03,82100,96149,178249.00
07/02/2007,0947,120.02,120.13,119.98,120.08,77434,123861,201295.00
07/02/2007,0948,120.08,120.44,120.02,120.40,111266,85604,196870.00
07/02/2007,0949,120.35,120.39,120.30,120.35,80889,73334,154223.00
07/02/2007,0950,120.35,120.80,120.35,120.80,228543,113283,341826.00
07/02/2007,0951,120.81,121.00,120.79,120.99,190437,103402,293839.00
07/02/2007,0952,120.99,121.01,120.69,120.97,144767,128611,273378.00
07/02/2007,0953,120.88,120.95,120.72,120.77,59413,83397,142810.00
07/02/2007,0954,120.81,120.85,120.58,120.71,186023,144122,330145.00
07/02/2007,0955,120.69,120.83,120.66,120.81,49981,55274,105255.00
07/02/2007,0956,120.80,120.82,120.66,120.73,45926,68285,114211.00
07/02/2007,0957,120.75,121.13,120.68,121.11,200952,162194,363146.00
07/02/2007,0958,121.13,121.13,120.83,121.02,144476,137511,281987.00
07/02/2007,0959,121.05,121.34,121.00,121.28,141705,106981,248686.00
07/02/2007,1000,121.28,122.09,121.14,121.70,247518,148334,395852.00
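
The timestamp guess above can be checked with a small Python sketch: move each TS bar back one minute so it is labeled by its start, then compare volumes with the DTN rows from the first post. Only three rows are copied in, and `to_bar_start` is a name made up for illustration.

```python
from datetime import datetime, timedelta

# (date, time label, volume) for three TS rows from the table above.
TS_ROWS = [
    ("07/02/2007", "0932", 382969),
    ("07/02/2007", "0946", 178249),
    ("07/02/2007", "0947", 201295),
]

# Matching DTN volumes from the first post, keyed by DTN's time label.
DTN_VOL = {"09:31:00": 387602, "09:45:00": 178249, "09:46:00": 201295}

def to_bar_start(date, hhmm):
    """Convert a bar labeled by its closing minute into its starting time."""
    end = datetime.strptime(f"{date} {hhmm}", "%m/%d/%Y %H%M")
    return end - timedelta(minutes=1)

diffs = {}
for date, hhmm, vol in TS_ROWS:
    key = to_bar_start(date, hhmm).strftime("%H:%M:%S")
    diffs[key] = vol - DTN_VOL[key]   # TS volume minus DTN volume for the same minute
    print(f"TS {hhmm} -> {key}  TS={vol}  DTN={DTN_VOL[key]}  diff={diffs[key]}")
```

After the shift, two of the three bars match DTN to the share and the third is within about 1.2%, which is consistent with the end-of-minute labeling guess.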
 
FWIW, Tradestation shows totally different volume numbers for the same period.

9:30 (eastern) - ~55k shares
9:31 (eastern) - ~84k shares

Differences in regional exchange inclusion may account for major differences. Minor differences can be accounted for by the exact timing of when one minute starts and ends in their aggregation scheme.

If you're really hard up for accurate numbers, compare ticks during that period and see who's missing what data.
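
A tick comparison along those lines could be sketched in Python as below. The ticks are invented purely for illustration (they are NOT real AAPL trades), the exchange codes are placeholders, and `minute_volume` and `missing_ticks` are hypothetical helper names. The point is only to show how one dropped regional print changes a 1-minute volume and how to list the prints one feed lacks.

```python
from collections import Counter, defaultdict

# Hypothetical ticks: (HH:MM:SS, price, size, exchange code). Invented values.
FEED_A = [
    ("09:30:01", 121.08, 500, "Q"),
    ("09:30:15", 121.10, 300, "P"),
    ("09:30:59", 120.95, 200, "Q"),
    ("09:31:00", 120.97, 400, "Q"),
]
# Same ticks, except this feed does not carry the print from exchange "P".
FEED_B = [
    ("09:30:01", 121.08, 500, "Q"),
    ("09:30:59", 120.95, 200, "Q"),
    ("09:31:00", 120.97, 400, "Q"),
]

def minute_volume(ticks):
    """Sum trade sizes per minute, labeling each bar by its starting minute."""
    vol = defaultdict(int)
    for time, _price, size, _exch in ticks:
        vol[time[:5]] += size   # "09:30:59" counts toward the "09:30" bar
    return dict(vol)

def missing_ticks(reference, other):
    """Ticks present in `reference` but absent from `other` (multiset difference)."""
    return sorted((Counter(reference) - Counter(other)).elements())

print(minute_volume(FEED_A))
print(minute_volume(FEED_B))
print(missing_ticks(FEED_A, FEED_B))
```

With real tick exports from two vendors, the same multiset difference would show whether the gap comes from whole exchanges being left out or from scattered missing prints.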
 
OK, maybe AAPL was not the best for comparisons.

One should pick a stock that is traded, if possible, on one exchange only, and preferably highly active, i.e. high volume.

I suspect this would be a stock traded on either AMEX or NYSE.

Besides the stock (ticker symbol) we settle on, I am also open as to which day and which time window to use. Let's keep it to a 30-minute window of 1-minute data, just so that it won't take up too much space.

A recent date is fine, although I wouldn't use the latest date since there may be trade corrections or false spikes that different vendors may take different amounts of time to correct. So something like a month ago or older would be better. I have no problem using July 2, 2007 (i.e. first trading day of July, 2007, 6 months ago) as any errors would have been corrected by now by all data vendors (if they do correct errors).

As for time window, the first half hour of trading would seem to be a good choice, e.g. From opening bell to 10:00 a.m. New York Time. I suggest using 1-minute data from opening bell of 9:30 to 10:00 a.m., unless someone has a better suggestion.

So for now, tentatively, Date & Time Window is:
Monday, July 2, 2007, from opening bell (9:30 a.m.) to 10:00 a.m., New York Time.

Obviously, the most important thing is to decide on the best stock (i.e. the ticker symbol) for this comparison project. Is SPY or DIA a good choice?

Does someone have a better suggestion? I am all ears.

As soon as we can agree on the stock, and if the time window above is acceptable, we can then post the data results and compare.


Regards.
 