I'm a software developer and I've built automated trading tools (in Java) that download, store and analyse historical data from various data vendors. Before this data reaches my systematic trading system, I run it through a cleaning process that removes and fixes dubious data such as outliers and missing days. What amazes me is that some of this data is pretty good and some of it is amazingly bad. What amazes me even more is that a lot of people out there don't clean their data at all; they just assume it is correct!
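For illustration only, a minimal Java sketch of two such checks (missing days and outliers) might look like the following. The class and method names, the close-only data layout and the 20% threshold in the usage note are assumptions made for the example, not the actual tool described above. Exchange holidays are not modelled, so they would show up as missing days.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;

// Hypothetical sketch: names, data layout and thresholds are illustrative assumptions.
class DailyCloseChecks {

    // Returns every weekday between the first and last date that has no close.
    // Holidays are not handled here, so they are reported as missing too.
    static List<LocalDate> findMissingWeekdays(SortedMap<LocalDate, Double> closes) {
        List<LocalDate> missing = new ArrayList<>();
        if (closes.isEmpty()) {
            return missing;
        }
        LocalDate last = closes.lastKey();
        for (LocalDate d = closes.firstKey(); d.isBefore(last); d = d.plusDays(1)) {
            DayOfWeek dow = d.getDayOfWeek();
            boolean weekend = dow == DayOfWeek.SATURDAY || dow == DayOfWeek.SUNDAY;
            if (!weekend && !closes.containsKey(d)) {
                missing.add(d);
            }
        }
        return missing;
    }

    // Returns dates whose close moved by more than maxMove (0.20 means 20%)
    // relative to the previous available close. A one-day spike is flagged
    // twice: once on the way up and once on the way back.
    static List<LocalDate> findSuspectMoves(SortedMap<LocalDate, Double> closes, double maxMove) {
        List<LocalDate> suspects = new ArrayList<>();
        Double prev = null;
        for (Map.Entry<LocalDate, Double> e : closes.entrySet()) {
            double close = e.getValue();
            if (prev != null && prev > 0 && Math.abs(close / prev - 1.0) > maxMove) {
                suspects.add(e.getKey());
            }
            prev = close;
        }
        return suspects;
    }
}
```

Both methods expect the closes in a `TreeMap<LocalDate, Double>` (or any other `SortedMap`), for example `DailyCloseChecks.findSuspectMoves(closes, 0.20)`. They only flag suspect dates; whether to drop, interpolate or re-download a flagged bar is a separate decision.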
I'm interested to hear how others clean data before using it in their trading systems?