Pandas too slow for event-driven backtesting

Development speed and a massive choice of modules. There's a module for everything I can think of. Pandas and NumPy only use Python as a front end; the back end is C/C++.
Development speed applies if you are a Python dev and know the language, because you are used to those modules and know where they are. If you have to learn the language from scratch, that development speed does not apply.
I agree that it is a useful language for many things, especially for data science. But for trading it is just a blocker. I've seen so many posts from people trying to improve their code performance, and so many of the replies take the form: "you can't do that with Python", "for that, use C++". Why don't they use C++ in the first place? There are plenty of libraries available for anything they want to do. Back in the day it was a pain to set up anything in C++, and I think this is why people think the language is difficult, but nowadays with modern IDEs it is all done for you. It is way easier to develop and debug code.
 
Yes, I think a NumPy ndarray is the right direction.
I have timed NumPy and pandas indexing. NumPy indexing takes on the order of 100 nanoseconds to 1 microsecond, but the pandas label-based indexer .loc[] can take anywhere from 100 microseconds up to 10 milliseconds. That makes NumPy indexing about 1,000 times faster than pandas indexing!
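A rough way to reproduce this kind of comparison is sketched below. The data shape, the ticker names, and the repeat count are assumptions made for illustration, and the absolute timings will vary with the machine and the pandas version, so treat the output as indicative only.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 1,000 daily bars for 50 made-up tickers.
dates = pd.date_range("2020-01-01", periods=1_000, freq="D")
assets = [f"A{i:02d}" for i in range(50)]
values = np.random.default_rng(0).normal(size=(len(dates), len(assets)))
df = pd.DataFrame(values, index=dates, columns=assets)

# The same cell addressed by label (pandas) and by position (NumPy).
ts, name = dates[500], "A25"
row, col = 500, 25

n = 10_000  # number of repeated lookups per measurement
t_numpy = timeit.timeit(lambda: values[row, col], number=n) / n
t_loc = timeit.timeit(lambda: df.loc[ts, name], number=n) / n

print(f"numpy [i, j]: {t_numpy * 1e9:10.0f} ns per lookup")
print(f"pandas .loc : {t_loc * 1e9:10.0f} ns per lookup")
```

This only measures single scalar lookups, which is the access pattern an event-driven loop tends to hit hardest. Slices and vectorised selections would need their own measurement.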
But I am still not sure how to switch to a NumPy ndarray, as it only supports integer, position-based indexing. I don't know how to do label-based indexing in NumPy, such as selecting certain time periods and certain assets the way I did in pandas with .loc[].
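One possible approach, sketched here under assumptions rather than offered as the definitive answer, is to keep the labels in separate lookup structures and translate them to integer positions before touching the array. The example assumes the timestamps are sorted, uses a plain dict for asset names, and uses np.searchsorted for time ranges. The tickers, dates, and the helper names time_slice and select are all made up for illustration.

```python
import numpy as np

# Hypothetical labels: sorted daily timestamps and made-up tickers.
dates = np.arange("2020-01-01", "2020-04-10", dtype="datetime64[D]")
assets = ["AAA", "BBB", "CCC", "DDD"]

# Dummy price matrix: one row per date, one column per asset.
prices = np.arange(dates.size * len(assets), dtype=float)
prices = prices.reshape(dates.size, len(assets))

# Asset label -> column position, built once up front.
col_of = {name: j for j, name in enumerate(assets)}


def time_slice(start, end):
    """Row slice covering start <= date <= end (dates must be sorted)."""
    lo = np.searchsorted(dates, np.datetime64(start), side="left")
    hi = np.searchsorted(dates, np.datetime64(end), side="right")
    return slice(lo, hi)


def select(start, end, names):
    """Rough equivalent of df.loc[start:end, names] on the raw array."""
    rows = time_slice(start, end)
    cols = [col_of[n] for n in names]
    return prices[rows][:, cols]


window = select("2020-02-01", "2020-02-05", ["BBB", "DDD"])
print(window.shape)  # (5, 2)
```

In an event loop the translation can usually be done once, outside the hot path, so the inner loop only ever sees integer positions.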

You don't need to build a huge pandas DataFrame for this. Use only what you're operating on. A GUI is likely preferable in the long run. Build the DataFrame out of the instruments you need; it doesn't need a MultiIndex for this either.
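As a minimal sketch of that suggestion, the snippet below builds a flat frame holding only the instruments a strategy trades, with timestamps as rows and tickers as columns, then walks it bar by bar through the underlying array. The universe, the ticker names, the random-walk closes, and the toy "up bar" count are all assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical universe: made-up tickers with random-walk closes.
dates = pd.date_range("2020-01-01", periods=250, freq="B")
universe = [f"SYM{i:03d}" for i in range(500)]
rng = np.random.default_rng(0)
all_closes = {s: 100 + rng.normal(size=len(dates)).cumsum() for s in universe}

# Only the instruments the strategy actually trades.
needed = ["SYM007", "SYM042", "SYM300"]

# One flat frame: a plain DatetimeIndex for rows, tickers for columns.
closes = pd.DataFrame({s: all_closes[s] for s in needed}, index=dates)

# Event loop over the raw array, avoiding per-bar .loc lookups.
values = closes.to_numpy()
n_up = 0
for i in range(1, len(values)):
    # Toy signal: count bars where the first instrument closed higher.
    if values[i, 0] > values[i - 1, 0]:
        n_up += 1

print(closes.shape, n_up)
```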
 

For most trading, Python is plenty fast. Does C++ really have such a universe of modules? From what I know, Python has the widest selection. For anything I can imagine, there's already a module that does it better, which is a massive time saving.
For example the basics, having market calendars for most stock / futures markets? Does it exist in C++?
 

I would be surprised if someone hadn't already implemented a library for that. It's not really part of the language; what you are describing is a module that someone has implemented as a helper. If you were used to the C++ community, you would find that module as easily as you do with Python.
For date availability I normally use the endpoints provided by brokers. I just need to get the market details and I will know when the market is available. I don't see a big deal here.
 
Apples and oranges. The OP is about backtesting, not trading.
 
For example the basics, having market calendars for most stock / futures markets? Does it exist in C++?

Yes.
https://rkapl123.github.io/QLAnnotatedSource/da/d3e/class_quant_lib_1_1_calendar.html
calendar class

This class provides methods for determining whether a date is a business day or a holiday for a given market, and for incrementing/decrementing a date of a given number of business days.

However, speaking as a long-time C++ programmer, Python is probably a better choice for most people because it's very easy to mess things up with poorly written C++ code.
 