Anyone using Python on a large scale trading application?

Quote from 6yaNYCjm5m:

Well, it's always tricky to enter a conversation about performance, because it is very difficult to hold the many variables in a system constant while you measure the single one you are interested in. Usually I go with credible sources of information, where enough people report results to make them statistically relevant.



All things being equal, I am skeptical about a 10% difference being down to Python itself. It has to be something specific to the OS. As I cited, I came across the statement that Linux is much more efficient at forking a process when you use the multiprocessing module. It has also been stated that Linux handles opening files more efficiently, which would explain your better disk I/O result.
I ditched Winz entirely several years ago, so I don't have the reference anymore. I am firmly in the Linux camp, but I nevertheless think that a 10% difference in favor of either OS points to something being configured wrong in the OS.

Hi 6yaNYCjm5m,

Thanks!

Another question: which streaming quote provider are you using? Almost all "retail grade" providers require a Windows component to be installed, and I need a pure Linux streaming quote solution. Thanks!
 
Wow, what a rare treat. Among the literally THOUSANDS of worthless posts, a gem. Congratulations. Well written, informative, and it whetted my appetite to work a bit with Python and get my feet wet. I run a fully systematized platform in .NET (yes, that is correct) with an AMQP-based messaging bus, capable of running a bit more than 1 million ticks on a back test basis. I have whatever library support I need, and I wrap some of the older C++ libraries to use as well. I use IB's FIX gateway and a high-speed data feed (not IB's), and my engine seems to have no issues subscribing to several thousand symbols (essentially most developed Asian cash equity markets). I have heard a lot about Python, but then I have also heard a lot about how people back test in Matlab; I used it extensively but always ended up disappointed because of its inherent limitations. To be honest, I believe at this point only a capable C/C++ engine would be able to beat the specs my system has in terms of data processing power, both back test and real-time.

I am not writing to brag but to show that all this can be had with .NET, the platform all programmers decry and belittle for raw speed. I beg to disagree. I am not a programmer at heart; I traded discretionary derivatives (including market making) for years before getting into the pure quant space, and pretty much everything I know about programming and everything I have coded I learned on my own. It can be done.

One thing I fully agree with you on, without having used Python yet, and maybe I am way too opinionated (my nature): without being able to vectorize I would not even give Python a try. Thanks for the long post and the tips on libraries; they definitely help me get started.

Cheers


Quote from 6yaNYCjm5m:

All the way.
Best kept secret out there, although I am not sure one can call it a secret, because it's open source; most likely a bunch of people just don't go around evangelically preaching about it. By the way, after living completely in the open source world for several years, I can breathe again.

Back to the Python environment.
Beautiful simple syntax, easy to use, a gazillion libraries. Most important, excellent performance, contrary to most uninformed opinions out there. That is most obvious in backtesting, where you will most likely face some serious volume. In order to get there you have to properly grasp the issues surrounding large volume processing (keyword "vectorization") and get yourself some popular libs (SciPy/NumPy). As a result, you have a ridiculously easy environment to code in, with performance approaching C levels. If you really crave terminal speed, with some extra coding you can use Cython and get compiled C code (not really necessary in most cases). If it is not the most popular environment in the scientific community, it is definitely one of the most popular. Best combo of simplicity of use and performance.
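To make the "vectorization" keyword concrete, here is a minimal sketch, assuming NumPy is installed and using made-up prices. It writes the same simple-return calculation once as a Python loop and once as a single array expression, plus a rolling mean with no Python-level loop. The function names are illustrative only and do not come from the post or from any library.

```python
import numpy as np


def returns_loop(prices):
    """Simple returns computed one element at a time in pure Python."""
    out = []
    for i in range(1, len(prices)):
        out.append(prices[i] / prices[i - 1] - 1.0)
    return out


def returns_vectorized(prices):
    """Same calculation as one array expression; the looping happens in C."""
    p = np.asarray(prices, dtype=float)
    return p[1:] / p[:-1] - 1.0


def moving_average(prices, window):
    """Rolling mean via a cumulative sum, with no Python-level loop."""
    p = np.asarray(prices, dtype=float)
    csum = np.cumsum(np.insert(p, 0, 0.0))
    return (csum[window:] - csum[:-window]) / window


prices = [100.0, 101.0, 99.0, 102.0, 104.0]  # made-up sample data
```

On a handful of prices the two versions are indistinguishable; the gap only shows up at backtest volumes, so it is worth timing both on your own data rather than trusting a rule of thumb.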

Coming from 25+ years in the corporate world, I really, really can't stand what's going on within IT these days, namely and primarily with the Java crap, all those stupid frameworks and insane levels of OO-obsession. I don't care at all about closures, apparently a big deal for some people. That's exactly the thinking that derailed the whole thing after C++ became fashionable. OO has its place, of course, but instead of having it in moderation and in places where it is appropriate, along comes Java and now I cannot even fart without wrapping it into a class and letting my kids inherit it, instead of wafting it in a straightforward and efficient manner. Whatever, man ...
If you try using native Python code, build some objects, inherit from them 55 times and add a couple of loops, you will be waiting half an hour to sum up a single symbol's daily ticks. I do not suggest going through this painful discovery exercise. My comparison is with the database world: "normalize until it hurts, then de-normalize until it works", which in programming becomes "try objects until it hurts, then go back to functions until it works". Don't get me started on the web crap either; somebody is going to try to show me how much I am missing in XML "programming".

I am rather grumpy these days, because I am not getting the results I was hoping for out of some strategies, hence the negative tone, but I really mean it. At the same time I have no intention of convincing anybody, because I do not care, nor do I intend to respond or argue, because I do not care. Adults should use and do whatever they see fit.

Having said that, you asked about Python, so here are some practical steps. I don't have time to add proper links, so Google is your friend (but only as a search engine):

- Although it works perfectly on Windowz, try to move yourself to Linux; you will feel much better (I just can't resist recruiting), and every now and then there are things that are less awkward and patchy, hence more natural, in that environment. The best example is multi-processing (not talking about multi-threading), where Linux forks processes much more efficiently and the Python code is almost laughably simple, unlike its Java counterpart (I just can't resist).
- Python 3 is well advanced, but I still use 2.6.6 and do not feel any pain.
- The SciPy lib is a must, or at least NumPy, which SciPy is built on.
- As I said, if you crave terminal speed, get Cython.
- If you need Large Hadron Collider speed, get F2PY and use existing or write your own Fortran procedures. Hell, there is a PyCUDA library too.
- If you want/have to use database, Sqlite3 has a module that comes with standard library. If you need a big one, I recommend PostgreSQL and supporting module psycopg2.
- I incorporate a fair amount of AI and ML into my strategies, and in that area scikit-learn covers almost everything. PyBrain is well documented and versatile, very good for development and learning, but slow and sluggish once you hit some volume (due to heavy use of Python objects). FANN is a very good C library with multiple Python bindings. There are also smaller, isolated modules (Kohonen SOM, some clustering, etc.).
- For ad hoc charting/plotting get Matplotlib (see attached sample).
- For GUI development wxWidgets is king, and wxPython is the Python binding for it (see attached sample).
- For charts within the GUI app, I recommend ChartDirector, which is the only non open source software of all mentioned, but is available for unlimited trial, and it's dirt cheap, can't believe the quality for that price. Still goes against my open source principles, I guess you can't win them all. It is one of the best libraries I have ever worked with. I am not associated with the author and I ended up only prototyping some stuff.
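As an illustration of the multi-processing point in the list above, here is a minimal sketch using the standard library's `multiprocessing.Pool`. The tick data is made up and the function names are hypothetical; it is written for Python 3, whereas the post mentions 2.6.6.

```python
from multiprocessing import Pool


def daily_range(item):
    """Return (symbol, high - low) for one symbol's list of tick prices."""
    symbol, ticks = item
    return symbol, max(ticks) - min(ticks)


def ranges_in_parallel(ticks_by_symbol, processes=2):
    """Fan the per-symbol work out to a pool of worker processes."""
    with Pool(processes=processes) as pool:
        return dict(pool.map(daily_range, sorted(ticks_by_symbol.items())))


# Made-up tick data, for illustration only.
SAMPLE = {
    "AAA": [10.0, 10.5, 9.8, 10.2],
    "BBB": [55.0, 54.0, 56.5],
}

if __name__ == "__main__":
    # The guard matters on Windows, where workers re-import this module
    # instead of being forked from the parent process.
    print(ranges_in_parallel(SAMPLE))
```

The worker is a plain module-level function on purpose: the pool has to pickle a reference to it, which does not work for lambdas or nested functions.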

That will pretty much cover you from head to toe; however, there is a bunch of other stuff too. It's just too much to squeeze into a single list. As for myself, besides proprietary stuff, I've developed some API modules that I intend (eventually) to release into the public domain, but I am nasty busy right now. Also, they are not polished to perfection, but mostly in pretty good shape.

- IB TWS API client. I struggled for a while with that horrible piece of software. The protocol design is horrendous, and all the ongoing patches and extensions have made it into a Frankenstein creature. I have finished a working version and parked it, because I will not be using IB as a broker for the time being.

- FIX API client. As you probably know, that's the big boys' standard protocol. It is platform agnostic (as every other should be as well) and rather well conceived; however, it has (inevitably) sustained some silly/stupid extensions. It ended up somewhat bloated and not compatible between the parties involved, because everybody decided to pick and choose which parts of the standard to conform to and to do custom extensions for everything else. Hence it's very tough to have a version that is compatible across different brokers, unless of course you shove the whole protocol into the library, in which case it's not lean and mean any more. Initially I did a version for Deutsche Bank's retail FX brokerage, which is actually FXCM white label software. When I was almost done, I bailed out after realizing (my impression at least) how amateurish the whole department was (running the whole thing over an unsecured connection and wondering why I was asking). Next, I have a Dukascopy FIX client, and that worked pretty well; however, I decided to switch to their Java client (arrrggghhh) for specific reasons. I believe this FIX client code base can be adjusted with minimal effort to work with the OANDA FIX API as well as the IB FIX API.

- Dukascopy Java API client. Just to be clear, my client is Python and their client/trading app is Java. I had to write two "strategies" in Java that run a TCP server inside the trading app, which I connect to with my client.
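For readers who have not seen FIX on the wire: a message is a run of tag=value fields separated by the SOH byte, framed by BeginString (8), BodyLength (9) and a trailing CheckSum (10). A minimal sketch of that framing follows. The helper names are made up for illustration, the sample heartbeat omits fields a real session requires (SendingTime, for one), and none of the broker-specific extensions described above are handled.

```python
SOH = "\x01"  # FIX field delimiter


def checksum(data):
    """FIX CheckSum: sum of all bytes modulo 256, rendered as three digits."""
    return "%03d" % (sum(data.encode("ascii")) % 256)


def build_message(msg_type, fields, begin_string="FIX.4.2"):
    """Assemble header, body and trailer from a list of (tag, value) pairs."""
    body = "".join("%s=%s%s" % (tag, value, SOH)
                   for tag, value in [(35, msg_type)] + list(fields))
    # BodyLength counts everything after the 9= field up to the 10= field.
    head = "8=%s%s9=%d%s" % (begin_string, SOH, len(body), SOH)
    return head + body + "10=%s%s" % (checksum(head + body), SOH)


def parse_message(raw):
    """Split a raw message into (tag, value) pairs and verify the checksum."""
    pairs = [tuple(f.split("=", 1)) for f in raw.split(SOH) if f]
    cut = raw.rfind(SOH + "10=") + 1  # start of the CheckSum field
    if not pairs or pairs[-1][0] != "10" or pairs[-1][1] != checksum(raw[:cut]):
        raise ValueError("bad or missing CheckSum (tag 10)")
    return [(int(tag), value) for tag, value in pairs]


# Toy heartbeat (MsgType 0) with made-up comp IDs.
msg = build_message("0", [(49, "CLIENT"), (56, "BROKER"), (34, 1)])
```

The framing is the easy, portable part. The per-broker pain lies in which tags each counterparty expects inside the body, which a generic parser like this deliberately knows nothing about.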

All those API clients are scalable and multi-processing, use non-blocking TCP sockets, SSH included, write to a database or binary packed files, and other bla, bla... As I said, I have no intention of commercializing that software and will be looking to set up a development project in the public domain.

There you have it, I feel much better now. Going back to the cave to continue fighting my strategies. Those neural networks are not as clever as many people would like us to believe, trust me...
If you are serious about Python and need some help, contact me on python at grupadinar dot com, and time permitting I will try to help.
In the spirit of the open source community, anybody else as well...
 
Of course, but you still end up writing wrappers or converters for each broker or data service you interface with. That was, I think, what he (6y....) was complaining about, and I wholeheartedly agree. Unfortunately it does not mean much when a broker claims to have FIX connectivity: each version is different and requires code changes and adaptation, albeit less work than rewriting a complete API.

Quote from rosy2:

Every place I have been converts all external messages to an internal (in-house) format (e.g. XML, JSON, some binary format), so a market data message from IB looks like a market data message from FIX 4.3 or whatever.
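A minimal sketch of that pattern, and of why it still means one converter per source: everything downstream sees a single record type, but each feed needs its own adapter. The `Quote` record and the vendor JSON keys are hypothetical; tags 55 (Symbol), 132 (BidPx) and 133 (OfferPx) are standard FIX tags, though which message carries them varies by counterparty.

```python
from collections import namedtuple

# Hypothetical in-house quote record; the field names are illustrative.
Quote = namedtuple("Quote", "symbol bid ask source")


def from_fix(fields):
    """Map a dict of FIX tag -> string value onto the internal record."""
    return Quote(fields[55], float(fields[132]), float(fields[133]), "fix")


def from_vendor_json(payload):
    """Map a made-up vendor JSON payload onto the same record."""
    return Quote(payload["sym"], payload["b"], payload["a"], "vendor")


# One adapter per external source; adding a broker means adding an entry.
ADAPTERS = {"fix": from_fix, "vendor": from_vendor_json}


def normalize(source, message):
    """Dispatch to the adapter registered for this source."""
    return ADAPTERS[source](message)
```

Strategy code then only ever handles `Quote` objects, so a new broker costs one adapter function rather than changes throughout the system.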
 
Quote from mcdull:

Another question: which streaming quote provider are you using? Almost all "retail grade" providers require a Windows component to be installed, and I need a pure Linux streaming quote solution. Thanks!

There is actually an excellent provider, BarChart/DDF. I have tested their feed, and it looks really good, based on limited testing time. The instrument selection is comprehensive and the documentation is pretty good: simple and straightforward, no complicated bs. You have to check for yourself whether they have all the data components you are looking for, e.g. bid/ask size, bla, bla....
The feed is completely platform agnostic: you get historical quotes from a web service (I guess that's the fancy name) and real-time quotes through a plain TCP connection, which you can, by the way, test interactively using telnet. You can call and ask for a couple of weeks of demo; they were very accommodating. This is the guy I was talking to:
Mark Wator
Barchart.com, Inc.
Business Development & Sales
(312) 506-8729

To help you get started I am attaching 2 test scripts: a historical quotes client test and a real-time quotes client test. They should help you with some basic record structure as well. The code is geared toward testing/example/understanding rather than efficiency or production deployment. With historical quotes, you are effectively done in 2 lines of code, while real-time requires 4 steps before you hit the rt stream (connect, login, set version, request symbol). My tests were done in April 2011; I hope they haven't changed anything substantial.
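The four real-time steps can be sketched generically. This is not the attached script and not the vendor's actual protocol: the command strings are supplied by the caller and must come from the provider's documentation, newline-terminated text records are an assumption, and the function names are hypothetical.

```python
import socket


def open_stream(sock, handshake):
    """Send each handshake line in order, then yield incoming text lines.

    `handshake` is a list of command strings supplied by the caller
    (login, version, symbol request). Nothing is sent until the generator
    is first iterated, and iteration ends when the peer closes.
    """
    for line in handshake:
        sock.sendall((line + "\n").encode("ascii"))
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # peer closed; a trailing partial record is dropped
            return
        buf += chunk
        while b"\n" in buf:
            record, buf = buf.split(b"\n", 1)
            yield record.decode("ascii", "replace").rstrip("\r")


def connect(host, port, handshake, timeout=10.0):
    """Step 1 (connect); steps 2-4 are the caller's handshake lines."""
    sock = socket.create_connection((host, port), timeout=timeout)
    return open_stream(sock, handshake)
```

Usage would look like `for record in connect(host, port, [login_line, version_line, symbol_line]): ...`, with the three lines filled in from the feed documentation.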

Lately my focus is entirely on fx, so I have to go with the broker feed, but when I go back to futures those guys will be my first choice for data.

Hope this helps,
Dr. Sheldon Cooper
 

Thanks, Dr. Cooper. Really appreciate your help.

I seldom hear of people building everything in Python, because everyone says it is too slow. I thought I was alone! In fact, it is fast enough for day trading, but I guess nobody uses it for trading on sub-millisecond timeframes.

I code almost everything in Python except one component, which talks to a stupid Windows application to retrieve streaming quotes. My system is still in paper trading mode and does 8-10 trades daily.
 
Hey 6yaNYCjm5m,

Thanks for your posts, it's nice to see somebody using Python. I've built a simple charting app which displays OHLCV values on the screen, using ChartDirector too.
Does your app allow you to draw trendlines? That's something I'd like to do, but I'm not sure how to go about it yet. To be more precise, I'd like to draw a line, click on it, move it, etc. I'd also like to add comments to the chart, store them somewhere, and display them again when I view the same timeframe at a later stage.
Another thing I'd like to do is highlight a bar when I mouse over it. As ChartDir creates the whole image at once, I'm not sure if this is doable.

Regards,

Christophe
 
Python can be slow. If speed is a requirement, check out Cython. It is absolutely incredible.
 
Yes, I agree. :)
Also, Python is not a compiled language, so many errors won't be found until the code actually runs. But given how quickly it lets me test an idea, I use Python for ATS.

C/C++ is my mother tongue, Python is my secondary language. :)
 
For those doing batch backtests, this service is great. It has almost all the Python modules you need, and for those that are not there you can create a custom environment and install them (or anything else you need).

https://www.picloud.com
 