I built a program once to scrape USDA reports the instant they were released (within roughly 50-200 ms, I think), "read" the report electronically, perform some calculations, and issue trade instructions. It all took less than a second.
The profit was statistically significant, but not practically significant given the bid/ask spread. In other words, I could prove 'academically' that my model for predicting the direction prices would go based on the contents of the report was correct, but I couldn't make (much) money on it. The Sharpe ratio was very low. I abandoned it.
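To make the "statistically but not practically significant" point concrete, here is a minimal Python sketch. It is not my original code: the function names, the P&L figures, and the 0.25 spread are all made up for illustration, and it assumes you pay one full spread per round trip (half on the way in, half on the way out).

```python
import math
import statistics


def t_stat(pnl):
    """t-statistic for the mean per-trade P&L being different from zero."""
    return statistics.mean(pnl) / (statistics.stdev(pnl) / math.sqrt(len(pnl)))


def per_trade_sharpe(pnl):
    """Mean over standard deviation of per-trade P&L (not annualized)."""
    return statistics.mean(pnl) / statistics.stdev(pnl)


def net_of_spread(gross_pnl, spread):
    """Subtract one full bid/ask spread from every round-trip trade.

    Assumption: you cross half the spread entering and half exiting.
    """
    return [g - spread for g in gross_pnl]


# Hypothetical per-trade gross P&L in price units (illustrative only).
gross = [0.6, -0.2, 0.5, 0.1, 0.7, -0.1, 0.4, 0.3] * 10
net = net_of_spread(gross, spread=0.25)

print(f"gross: t = {t_stat(gross):.2f}, Sharpe/trade = {per_trade_sharpe(gross):.2f}")
print(f"net:   t = {t_stat(net):.2f}, Sharpe/trade = {per_trade_sharpe(net):.2f}")
```

With numbers like these the gross edge is clearly there, but once the spread comes out, what's left per trade is small next to its own noise. That is roughly the situation I was in.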
All I really proved to myself was that the market makers already did similar calculations and adjusted their bids and asks perfectly. As should be expected.
By the way, for that report, the initial source was a specific location and link on the USDA website that I hard-coded into my program in advance. They always used the same link. If you wanted to do something similar, I would suggest looking at the BLS website: https://www.bls.gov/cpi/. I doubt there's a good edge there, though, at least for retail.
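If you want a starting point for the "watch a fixed link" part, something like the sketch below would do. Again, this is not what I actually ran: `fetch` and `wait_for_change` are hypothetical helpers, the polling interval is a guess, and it just hashes the page body and returns as soon as the hash changes. The fetcher is passed in as a function so you can try the loop without hitting the network.

```python
import hashlib
import time
import urllib.request

# URL taken from the post; swap in whatever release page you care about.
REPORT_URL = "https://www.bls.gov/cpi/"


def fetch(url=REPORT_URL, timeout=5.0):
    """Download the raw bytes at the hard-coded URL."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def wait_for_change(fetch_fn, interval_s=0.05, max_polls=1000, sleep=time.sleep):
    """Poll fetch_fn until its content differs from the first fetch.

    Returns the new content, or None if nothing changed within max_polls.
    """
    baseline = hashlib.sha256(fetch_fn()).hexdigest()
    for _ in range(max_polls):
        body = fetch_fn()
        if hashlib.sha256(body).hexdigest() != baseline:
            return body  # hand this off to the parsing / calculation step
        sleep(interval_s)
    return None
```

You would call it as `wait_for_change(fetch)` shortly before the scheduled release time. Keep in mind that any cosmetic change to the page also trips a hash comparison, and that a real poller has to stay within whatever request rate the site tolerates.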