Journal Entry
TL;DR: I now have my own offline analysis pipeline. I could not get zipline to work, so I now need to find good data on my own.
Education
I've been spending more time learning Pandas and other Python-based data analysis tools. I am very excited to become more proficient with them.
Progress on system
Success: avoided over-engineering!
Success: I've created my own process for analyzing daily OHLC data reaching back to 1994 using Python and Pandas, and rolled my own version of Jupyter/IPython. This process is source controlled and does not depend on any external services. Backtesting will be through Backtrader, but I have not yet done anything significant with it beyond the demos.
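For anyone curious, here is a minimal sketch of what a daily OHLC pass in Pandas can look like. It is not the actual process described above: the data is synthetic, and the column names, the 20-day window and both helper functions are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd


def make_synthetic_daily_ohlc(n_days=250, seed=0):
    """Build fake daily OHLC bars (hypothetical helper, stands in for real data)."""
    rng = np.random.default_rng(seed)
    dates = pd.bdate_range("1994-01-03", periods=n_days)
    # Random-walk closing prices starting near 100.
    close = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, n_days)))
    # Each open is the previous close; the first open is fixed at 100.
    open_ = np.concatenate([[100.0], close[:-1]])
    # High and low bracket the open/close range by a small random margin.
    high = np.maximum(open_, close) * (1 + rng.uniform(0, 0.005, n_days))
    low = np.minimum(open_, close) * (1 - rng.uniform(0, 0.005, n_days))
    return pd.DataFrame(
        {"open": open_, "high": high, "low": low, "close": close}, index=dates
    )


def add_basic_features(bars, window=20):
    """Add daily return, simple moving average and intraday range columns."""
    out = bars.copy()
    out["return"] = out["close"].pct_change()
    out["sma"] = out["close"].rolling(window).mean()
    out["range_pct"] = (out["high"] - out["low"]) / out["close"]
    return out


bars = make_synthetic_daily_ohlc()
features = add_basic_features(bars)
print(features.tail())
```

Swapping the synthetic frame for one loaded from a local file would keep the whole thing offline and source controlled, which is the point of the setup.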
Failure: I could not use zipline from Quantopian because it was pretty buggy: the most basic example, copied and pasted from their own documentation, threw exceptions that prevented a backtest from running, period. I went to submit a bug on their GitHub and found a graveyard of open PRs and bug reports. This is clearly not a well-run open source project, even though the code is regularly updated. No surprise there. Very few people like me will use zipline and then graduate to Quantopian. It's a good business decision IMO. However, I made the decision not to depend on it because it is too magical for me for the most part. I tried to fix the bugs I found but the code is very obtuse in places, which is difficult to accomplish in Python. The API is elegant, however. It is a loss, but I'm not going to cry about it.
Thoughts on not using Quantopian/Zipline
This is in the back of my mind: why am I ignoring Quantopian? Primarily to avoid lock-in and to understand every bit of software that I am using. There might be a 3-4 month ramp-up period while I get my own analysis, testing, hosting and data pipelines in place. I can afford that time, but man is it ever gnawing at me. The only thing I can do to keep myself in check is to have weekly milestones.
Next steps
SYSTEM: I've got daily OHLC data going back to 1994 that I can use to play around with. I've got a backtesting system that I can use. Time for the rubber to meet the road.
DATA: Since I would like to try my hand at algorithmic day trading, I would want, at the very least, minute OHLC. I went spelunking on the Internet to look for some good data sources and found one, but they doubled their prices in the last week! I am not afraid to spend a little money on gaining access to good data, though I would want to be able to shuttle the data around my own systems and not be locked into some vendor's idea of how I can use the data.
This may be a difficult problem to solve.
Journal entry
Previous entry
Education
Not exactly related to trading, but I recently got accepted into a Masters program at a top 10 university for computer science (Artificial Intelligence). I have enough money to fund myself for a couple of years to learn both trading and finish my Masters plus I have a (non-consulting) business that brings in a little something every month. Don't want to overwhelm myself though. Plus that never-ending divorce is still on my plate.
Thoughts on my system
I know I said I'd set up a notebook for backtesting (and I did go down this path) but as an engineer, it is incredibly hard to give up control to Quantopian. I also find the interface very inefficient. It's like an uncanny valley between coding and Excel. I'd like to think that I'm being cautious and avoiding over-engineering, but perhaps the readers of this journal will tell me I'm wrong. It seems, though, that the more successful folks have their own in-house things anyway.
What do I need?
- A backtesting system
- A live trading system that can be connected to a paper trading account
- A standard interface for processing data from a variety of datasets
- Tools to analyze and look at my trade ideas
The funny thing is that Quantopian have already provided 1-3 in zipline and 4 is provided with IPython/Jupyter. The only question is whether I can use the same data I get from zipline in these notebooks.
So... Now I'm going to give myself a week to get backtests and research running with zipline + IPython/Jupyter. If I am unsuccessful, I will probably revert to Quantopian.
Success will be determined by the following:
- A conversion of some mickey mouse Quantopian demo algo to zipline along with backtest and result analysis
- Some sort of equally mickey mouse analysis of minute bars over a 3 year period
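A mickey mouse minute-bar analysis usually starts with rolling minute bars up into coarser ones. Here is a rough sketch of that step in plain Pandas. It assumes nothing about zipline: the data is one synthetic session, and the column names and the resample_ohlc helper are made up for illustration.

```python
import numpy as np
import pandas as pd

# How each column collapses when several minute bars merge into one bar.
OHLC_AGG = {
    "open": "first",
    "high": "max",
    "low": "min",
    "close": "last",
    "volume": "sum",
}


def resample_ohlc(bars, rule):
    """Aggregate OHLCV bars to a coarser frequency (hypothetical helper)."""
    resampled = bars.resample(rule).agg(OHLC_AGG)
    # Bins with no trades have no open price; drop them.
    return resampled.dropna(subset=["open"])


# One synthetic 390-minute session (09:30 to 15:59), standing in for real data.
rng = np.random.default_rng(0)
index = pd.date_range("2016-01-04 09:30", periods=390, freq="min")
close = 100 + np.cumsum(rng.normal(0, 0.02, len(index)))
open_ = np.concatenate([[100.0], close[:-1]])
minute_bars = pd.DataFrame(
    {
        "open": open_,
        "high": np.maximum(open_, close) + 0.01,
        "low": np.minimum(open_, close) - 0.01,
        "close": close,
        "volume": rng.integers(100, 1000, len(index)),
    },
    index=index,
)

five_min = resample_ohlc(minute_bars, "5min")
daily = resample_ohlc(minute_bars, "1D")
print(five_min.head())
print(daily)
```

The same helper would apply to three years of minute bars; only the loading step changes.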
Until next time.