Quote from bs2167:
Agree with the approach overall - in my experience, being able to quickly test a novel idea far outweighs the disadvantages of having to develop/maintain an additional code base for live trading (which very rarely changes).
I've never used Python...if you have a minute, would be interested to get your take on what it does better / worse than R that makes it worthy of inclusion in your process.
I debated with myself extensively before deciding to include Python in my workflow. There are a few reasons I chose to do so.
First, the ability to create a class hierarchy in Python means that I can quickly build a well-organized library of reusable code. In theory I could perhaps do this in R by building an R package, but I never really learned that aspect of R. I could be wrong, but the R package framework also doesn't appear to be completely object-oriented.
Second, most of what I do is not easily expressed as efficient vectorized code unless I think and plan hard about it, so I end up writing loops over minute bars anyway. In that case, Python (+ NumPy) is, in my experience, much faster than R.
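As a rough illustration of the kind of path-dependent logic that resists vectorization, here is a minimal sketch of a bar-by-bar loop. The trading rule, the function name `run_bar_loop`, and the `stop` / `target` parameters are all hypothetical, chosen only to show why each bar depends on the state left by the previous one. `close` can be a plain list or a NumPy array of closing prices.

```python
def run_bar_loop(close, stop=0.5, target=1.0):
    """Walk minute bars one at a time; return per-bar position and total PL.

    Hypothetical rule, for illustration only: when flat, go long one unit
    if the close is above the previous close; when long, exit once the
    price is `stop` below or `target` above the entry price.
    """
    n = len(close)
    position = [0] * n      # 1 while holding after this bar, else 0
    pl = 0.0                # realized PL in price units
    entry = 0.0
    holding = False
    for i in range(1, n):
        if holding:
            move = close[i] - entry
            if move <= -stop or move >= target:
                pl += move
                holding = False
        elif close[i] > close[i - 1]:
            entry = close[i]
            holding = True
        position[i] = 1 if holding else 0
    return position, pl


# Made-up prices, just to exercise the loop.
position, pl = run_bar_loop([10.0, 10.2, 10.1, 11.3, 11.0, 11.1, 10.5])
```

Because the exit test depends on the entry price set on an earlier bar, the state has to be carried from one iteration to the next, which is what makes a direct vectorized version awkward.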
Third, I have all my data stored in a MySQL database (both market data and performance data of my models). I found that I could quickly write a "Database" class in Python that wraps many of the MySQL query and data-loading functions I need for backtesting, whereas it felt a little more cumbersome to do the same in R (of course, again, if I had learned how to build an R package then I probably could have done this in R).
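A minimal sketch of what such a "Database" wrapper might look like. To keep it self-contained and runnable, it uses the standard-library `sqlite3` module with an in-memory database as a stand-in for MySQL. The table name `minute_bars`, its columns, the method names, and the sample rows are all assumptions made up for illustration, not the actual schema or class.

```python
import sqlite3


class Database:
    """Thin wrapper around a DB-API connection for backtest data loading."""

    def __init__(self, conn):
        self.conn = conn

    def query(self, sql, params=()):
        """Run a SELECT and return all rows as a list of tuples."""
        cur = self.conn.cursor()
        try:
            cur.execute(sql, params)
            return cur.fetchall()
        finally:
            cur.close()

    def load_bars(self, symbol, start, end):
        """Return (ts, close) rows for one symbol between two timestamps."""
        sql = ("SELECT ts, close FROM minute_bars "
               "WHERE symbol = ? AND ts BETWEEN ? AND ? ORDER BY ts")
        return self.query(sql, (symbol, start, end))


# In-memory SQLite stands in for the MySQL server; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE minute_bars (symbol TEXT, ts TEXT, close REAL)")
conn.executemany(
    "INSERT INTO minute_bars VALUES (?, ?, ?)",
    [("ES", "2012-01-03 09:31", 1270.25),
     ("ES", "2012-01-03 09:32", 1270.50),
     ("NQ", "2012-01-03 09:31", 2310.00)],
)
db = Database(conn)
bars = db.load_bars("ES", "2012-01-03 09:30", "2012-01-03 09:35")
```

With a real MySQL driver the same class shape should carry over, but the common Python MySQL drivers use `%s` rather than `?` as the parameter placeholder, so the SQL strings would need that change.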
Fourth, I have also built a web server to monitor my live / forward-test trading. I have the Java code running, and a web server that checks the log file of the running code to show me the PL, holdings, and even charts and graphs. I implemented all of my web server scripts in my WebServer Python class, which also uses many of the database query and load functions. I also wrapped a lot of the plotting functions using matplotlib.
I guess a fifth important reason is that I am very comfortable programming in Python, and I can do things extremely quickly in it. I am also comfortable with R except for the package part, which means I can write reusable code much better in Python.
At some point, before I was doing this in Python, I had hundreds of R scripts scattered about, and I had been rewriting very similar stuff over and over. That is when I decided I needed to start reusing code.
In practice, building a code base in Python is not a hugely organized and extensively planned endeavor for me either. All I do is this: I create a base class that I think is conceptually necessary, and I implement the minimally required methods in it. Then, in a particular analysis or backtest script, I instantiate that class and use its methods to implement what I need. After doing this a few times, I will recognize that there is a group of similar operations I repeatedly perform on the data in my base class. That is when I "elevate" that code to a member function of the class. So, slowly, as time goes by, I accumulate useful functions in the form of methods of that class. I also try to make it a habit to write a simple sentence in the triple-quoted docstring of each function so that later I can extract it into a set of web pages via pydoc or Sphinx.
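The workflow above can be sketched roughly as follows. The class name `BarSeries` and both methods are hypothetical. The idea is that `returns` is one of the minimal methods written up front, while `max_drawdown` stands for code that was first written inline in a few analysis scripts and later "elevated" into the class, with a one-sentence docstring.

```python
class BarSeries:
    """Minimal base class holding closing prices for one instrument."""

    def __init__(self, symbol, closes):
        self.symbol = symbol
        self.closes = list(closes)

    def returns(self):
        """Return simple bar-to-bar returns as a list of floats."""
        c = self.closes
        return [c[i] / c[i - 1] - 1.0 for i in range(1, len(c))]

    def max_drawdown(self):
        """Return the largest peak-to-trough decline as a positive fraction."""
        peak = self.closes[0]
        worst = 0.0
        for price in self.closes:
            peak = max(peak, price)
            worst = max(worst, 1.0 - price / peak)
        return worst


# An analysis script only needs to instantiate the class and call methods.
series = BarSeries("ES", [100.0, 110.0, 99.0, 120.0])
drawdown = series.max_drawdown()
```

If this lived in a module (say a hypothetical `barseries.py`), running `python -m pydoc -w barseries` would write an HTML page built from these docstrings.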