Large sorting in R

I tend to agree. Despite the often furious attacks by R package developers, who claim that R (with Rcpp's C++ integration) can handle anything under the sun, the fact remains that R is not a suitable environment for testing data sets of many millions of data points in a computationally efficient way. Such developers often show off benchmarks of how they can generate a 100-million-element time series or run a single series. What they conveniently omit is that testing a full-fledged trading strategy often involves 20-100 different method invocations and callbacks per tick/quote/bar iterated. R is great for investigating limited data series and/or extracting a single metric, but not much more. I have not seen a single R solution that can backtest a full-fledged trading strategy in an efficient manner.

The crux is that R is slow at everything except vectorized operations. So, if you're modeling any kind of backtest on an event-based/looping architecture, it's not the correct way to approach the problem in R. That said, vectorizing everything is definitely not convenient. In my opinion, if you're using R for anything other than an exploratory environment, chances are you know what you're doing and can offload any heavy computation to C++ (or pick your lang here), or you're just going to be miserable.
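
To make the loop-versus-vectorized point concrete, here is a minimal sketch in R. The price series is synthetic and the function names (`returns_loop`, `returns_vec`) are made up for illustration. How big the gap is depends on your R version and machine (the bytecode compiler in recent R versions narrows it for simple loops like this one), so time it yourself rather than taking anyone's benchmark on faith.

```r
# Synthetic price series, purely for illustration (not real market data).
set.seed(42)
n <- 1e6
prices <- 100 + cumsum(rnorm(n, sd = 0.01))

# Event-style version: visit one bar at a time.
returns_loop <- function(p) {
  out <- numeric(length(p) - 1)
  for (i in seq_along(out)) {
    out[i] <- p[i + 1] / p[i] - 1
  }
  out
}

# Vectorized version: one division over two shifted copies of the series.
returns_vec <- function(p) {
  p[-1] / p[-length(p)] - 1
}

# Both versions should agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(returns_loop(prices), returns_vec(prices))))

# Compare elapsed times on your own machine.
print(system.time(returns_loop(prices)))
print(system.time(returns_vec(prices)))
```

A per-bar loop that only does arithmetic is the friendly case. Once each iteration also calls several R functions or callbacks, as an event-driven backtest does, the per-call overhead adds up much faster.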
 
Absolutely agree. But be careful: some of the R core crowd are extremely hostile and aggressive toward the slightest criticism of R. You should read some of the discussions that arose about a comparison between Python and R on Stack Exchange's Quantitative Finance site.

The crux is that R is slow at everything except vectorized operations. So, if you're modeling any kind of backtest on an event-based/looping architecture, it's not the correct way to approach the problem in R.

Yes, good point. I came to the conclusion that to properly apply R I need to figure out how to vectorize my signal generator functions. This is the next thing I will be looking into.
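
For what it's worth, a vectorized signal generator might look something like the sketch below. It assumes a plain moving-average crossover, which is only a stand-in for whatever signal you actually use; `sma` and `ma_cross_signal` are hypothetical helper names, and the prices are synthetic. The only library call is `stats::filter` from base R.

```r
# Trailing simple moving average via stats::filter (base R).
# The first n - 1 values are NA because the window is not yet full.
sma <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

# Hypothetical crossover signal: +1 when the fast MA is above the slow MA,
# -1 when below, 0 when equal or while either MA is still warming up.
ma_cross_signal <- function(prices, fast = 10, slow = 50) {
  signal <- sign(sma(prices, fast) - sma(prices, slow))
  signal[is.na(signal)] <- 0
  signal
}

# Synthetic prices, for illustration only.
set.seed(1)
prices <- 100 + cumsum(rnorm(1000, sd = 0.5))

sig <- ma_cross_signal(prices)

# Trade on the NEXT bar to avoid look-ahead: lag the signal by one.
position <- c(0, head(sig, -1))
bar_returns <- c(0, diff(prices) / head(prices, -1))
strategy_returns <- position * bar_returns
```

The whole series is processed in a handful of vector operations with no per-bar loop. The catch is that anything path-dependent (stops, position sizing based on running equity, fills) does not fit this pattern so neatly, which is where vectorizing stops being convenient.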
 
Indeed. I was not suggesting posting language comparison questions there... not sure why you brought up this issue. I pointed out a place where there are plenty of R developers, and they apparently do not take criticism of R lightly. Almost in a cute way.

"What programming language should I use..." questions are explicitly discouraged at SE... but for better or worse ET's rules are looser.
 