How big is one day of S&P 500 E-Mini data for tick (L1) and LOB (L2)?

Can somebody with a good quality data feed please tell me how much data the E-Mini S&P 500 future produces in one day?

I need these separately:
- L1 tick data, e.g. timestamp (milliseconds), bid price, ask price, bid size, ask size.
- L2, the limit order book (LOB).

thanks in advance
 
invest time into a good ai

To determine how much data the E-Mini S&P 500 (ES) futures contract produces in one day, it's important to consider both Level 1 (L1) tick data and Level 2 (L2) order book data. The exact amount can vary depending on market conditions, such as volatility and trading activity. However, here's a general idea of what to expect:

1. Level 1 (L1) Tick Data:
  • Components: Timestamp (milliseconds), Bid Price, Ask Price, Bid Size, Ask Size.
  • Data Rate: During active trading periods, the ES futures contract can generate thousands of ticks per minute. On a highly active day, it could easily exceed 1 million ticks.
  • Size Estimate: Each tick might include approximately 50-100 bytes (depending on how the data is formatted and stored).
  • Total Data Per Day:
    • If we assume 1 million ticks in a day, the L1 data might be in the range of 50-100 MB per day.
2. Level 2 (L2) Limit Order Book (LOB) Data:
  • Components: Full order book data, including all visible limit orders at various price levels.
  • Data Rate: The L2 data is much more voluminous because it includes updates to the entire order book, not just the best bid/ask. LOB data can easily reach several hundred updates per second.
  • Size Estimate: Each update could be around 500-1000 bytes (again, depending on format).
  • Total Data Per Day:
    • Given that LOB updates occur very frequently, especially during volatile periods, the total data volume can easily reach several GB per day, for instance around 5-10 GB on an active day.
Summary:
  • L1 Tick Data: Around 50-100 MB per day.
  • L2 LOB Data: Several GBs per day (e.g., 5-10 GB).
These are rough estimates, and the actual data size could be higher or lower depending on the specific day's market activity. For precise figures, accessing historical data from a reliable market data provider would be necessary.
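The arithmetic behind these estimates can be reproduced with a short back-of-envelope sketch. The function names are made up for illustration, 1 MB is taken as 10^6 bytes, and the inputs are only the figures quoted above, not measurements:

```python
def daily_size_mb(events_per_day: int, bytes_per_event: int) -> float:
    """Uncompressed size in MB, taking 1 MB = 10**6 bytes (an assumption)."""
    return events_per_day * bytes_per_event / 1_000_000


def implied_events(size_gb: float, bytes_per_event: int) -> int:
    """Number of events implied by a daily size, taking 1 GB = 10**9 bytes."""
    return round(size_gb * 1e9 / bytes_per_event)


# L1: 1 million ticks at 50-100 bytes per tick.
l1_low = daily_size_mb(1_000_000, 50)
l1_high = daily_size_mb(1_000_000, 100)
print(f"L1: {l1_low:.0f}-{l1_high:.0f} MB per day")

# L2: how many updates would 5-10 GB per day imply at 500-1000 bytes each?
l2_fewest = implied_events(5, 1000)
l2_most = implied_events(10, 500)
print(f"L2: {l2_fewest:,} to {l2_most:,} updates per day")
```

Running the numbers both ways is a quick check on whether an assumed update rate and an assumed record size are consistent with the daily total being quoted.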
 
NinjaTrader ES days are around 50 MB. Full order book, updated every second.
 

[Attachment: Size.PNG]

Thanks @NorgateData and @2rosy.

You can indeed see this on our site. From Aug 19, 2024:
  • Last sale only (Trades): 12.75 MB
  • L1 (MBP-1): 388.82 MB
  • L2 (MBP-10): 2.44 GB
  • L3 (MBO): 457.49 MB

There are a few nuances to point out:
  • A majority of book updates happen at top of book, because participants are rarely incentivized to cancel or modify orders deep in the book when they can sit idle collecting better queue priority.
  • A better estimate of the distribution of order book activity is to compare MBP-1 to MBO. Our MBP-10 data is bloated because it's a two-sided snapshot of all 10x2 levels, rather than an incremental delta of the level that changed. If you want incremental, we point you to MBO instead.
  • Our MBP-1 updates are still two-sided but that bloat is small, so the ballpark MBP-1/MBO size ratio tells you that ~80% of activity is at top of book, i.e. if you looked at another vendor that disseminates L2 incrementally, like Nanex, you'd find L1 is probably 60-80% the size of L2.
  • The numbers I cited are for the entire ES futures product group. We follow CME's convention that outrights and spreads are all futures by definition, i.e. tag 167-SecurityType='FUT'. So my size estimate includes all expirations for all outrights like ESU4, ESZ4, ESH5, and spreads like ESU4-ESZ4.
  • These numbers are all before compression. Our compressed numbers are about 31% of the above. We always recommend using inline compression when working with any kind of market data other than daily frequency data.
  • Our normalization format has more entropy (content) than the format you've described, e.g. it has nanosecond timestamps, 3 different timestamps per event, and some raw fields like sequence number and secondary flags. Most vendors will provide a slightly lossier format and so their data will seem smaller.
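To make the comparison concrete, here is a small sketch that only redoes the arithmetic on the Aug 19, 2024 figures quoted above. The dictionary layout and variable names are illustrative, and 1 GB is assumed to equal 1000 MB:

```python
# Uncompressed sizes quoted above for Aug 19, 2024, in MB.
sizes_mb = {
    "Trades": 12.75,
    "MBP-1": 388.82,
    "MBP-10": 2.44 * 1000,  # quoted as 2.44 GB; 1 GB = 1000 MB is an assumption
    "MBO": 457.49,
}

# "About 31% of above" after compression, as stated in the post.
COMPRESSED_FRACTION = 0.31

# Raw size ratio used as a proxy for the share of activity at top of book.
top_of_book_ratio = sizes_mb["MBP-1"] / sizes_mb["MBO"]
print(f"MBP-1 / MBO size ratio: {top_of_book_ratio:.2f}")

compressed_mb = {name: size * COMPRESSED_FRACTION for name, size in sizes_mb.items()}
for name, size in compressed_mb.items():
    print(f"{name:7s} ~{size:8.2f} MB compressed")
```

The raw ratio comes out slightly above the ~80% ballpark because it does not discount the two-sided bloat in MBP-1 mentioned above.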
 
Thanks, everybody, for the detailed responses.

I have a small amount of L2 (aka LOB) data and it's crazy how easy it is to spot edges in it. The only thing is they fade in and out depending on outside market conditions.

But it's a new type of programming that is required. It's much easier to code with just indicators and levels than to code with constantly shifting lists of numbers.
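For anyone wondering what "constantly shifting lists of numbers" looks like in code, here is a minimal sketch of a price-level book kept up to date from incremental updates. The (side, price, size) message shape and the prices are hypothetical, not any vendor's actual format:

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    """Price-level book: maps price -> total resting size on each side."""
    bids: dict = field(default_factory=dict)
    asks: dict = field(default_factory=dict)

    def apply(self, side: str, price: float, size: int) -> None:
        """Apply one update; size 0 means the level is gone (assumed convention)."""
        levels = self.bids if side == "B" else self.asks
        if size == 0:
            levels.pop(price, None)
        else:
            levels[price] = size

    def best_bid(self):
        return max(self.bids) if self.bids else None

    def best_ask(self):
        return min(self.asks) if self.asks else None


book = Book()
# Hypothetical update stream: (side, price, new size at that level).
updates = [
    ("B", 5600.00, 40),
    ("B", 5599.75, 85),
    ("A", 5600.25, 30),
    ("A", 5600.50, 70),
    ("B", 5600.00, 0),   # best bid level pulled
    ("A", 5600.25, 12),  # best ask size reduced
]
for side, price, size in updates:
    book.apply(side, price, size)

print(book.best_bid(), book.best_ask())
```

Keeping the book in a dictionary keyed by price means each update touches one entry, and features such as spread or imbalance can be recomputed from the current state after every message.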
 
The size of a single day’s S&P 500 E-Mini data file can vary significantly depending on the level of data granularity you're working with—whether it's tick data (L1) or the Limit Order Book (LOB, L2).

For tick data (L1), which includes every trade that occurs, you’re looking at a data file that typically ranges from 100 MB to 300 MB per day. This can fluctuate based on market activity; high volatility days, like during earnings reports or major economic announcements, can lead to larger files.

When it comes to LOB data (L2), which includes all the bids and asks at various price levels, the data size increases substantially. A full depth LOB data file can be in the range of 2 GB to 5 GB per day, again depending on market conditions and the number of orders and cancellations.

These sizes are based on uncompressed data files. Compression techniques, such as zipping, can reduce these sizes by around 50% or more, depending on the specific dataset and its redundancy.
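A rough way to see what compression does is to compress some L1-style rows in memory. The sketch below uses synthetic, randomly generated quotes (not real ES data) in an assumed CSV layout of timestamp, bid, ask, bid size, ask size, so the ratio it prints is only indicative:

```python
import gzip
import random

random.seed(0)  # reproducible synthetic data

rows = []
ts = 1_724_050_800_000  # arbitrary starting timestamp in milliseconds
bid = 5600.00           # arbitrary starting price
for _ in range(10_000):
    ts += random.randint(1, 50)                        # irregular tick spacing
    bid += random.choice((-0.25, 0.0, 0.0, 0.25))      # move in 0.25 steps
    ask = bid + 0.25
    bid_size = random.randint(1, 200)
    ask_size = random.randint(1, 200)
    rows.append(f"{ts},{bid:.2f},{ask:.2f},{bid_size},{ask_size}\n")

raw = "".join(rows).encode("ascii")
packed = gzip.compress(raw)

print(f"raw: {len(raw):,} bytes, gzip: {len(packed):,} bytes, "
      f"ratio: {len(packed) / len(raw):.2f}")
```

For files on disk, `gzip.open(path, "wt")` lets you write compressed text directly instead of compressing after the fact.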

If you're dealing with this data regularly, I recommend ensuring you have a robust storage solution and considering strategies for efficiently processing and analyzing the data.
 