I'm currently trying to decide which platform to use for both strategy testing and execution. Ilya does an excellent job evangelizing R for research, but I cannot find any information about using R for strategy execution.
It looks like Python is used for both. Since I have to learn something anyway, I could go either way, but I would rather not learn both.
I would appreciate any thoughts or pointers.
Btw, at this point I'm thinking about swing trading with EOD data across a large trade population (Russell 3000), so speed is not as critical as it is for tick-driven trading.
I'm a Linux guy and don't want to get into a vendor tool such as AmiBroker.
IMHO Python is easier to learn, and if speed is a problem you can easily prototype in Python and then rewrite only the part that needs to be as fast as possible in C or C++.
I found R harder to learn and use, but everyone is different and I'm biased by my long years coding in C and its cousins.
R has libraries to connect to Interactive Brokers as well as to send FIX messages. This allows you to execute automatically, making it end to end. You can also use Rcpp to connect to basically any broker or exchange API.
Python has similar capabilities as well. However, its financial and econometric ecosystem is very immature compared to R's. If you want to roll your own, that's no problem. Python's base execution speed is also faster; however, R becomes faster with the use of Rcpp, even compared with Python on PyPy. See my previous post link.
It's really up to you, since they are very comparable. I feel that R is more geared to people who think mathematically and like to do more complex statistical data analysis with fewer lines of code, as the syntax is very close to how you would write actual expressions and equations by hand. People who are programmers by background may find this syntax confusing, and using R for more general-purpose programming will be cumbersome.
You do not need to learn both; just pick what is more important to you. If it is data analysis, backtesting, and financial and econometric analysis, use R. If it is faster base processing, a traditional programming syntax, and better general-purpose programming, use Python.
If writing very optimized programs is not a priority for you and you're a one-man shop, then I would strongly recommend working with an expressive language such as OCaml and not work with all that baggage in R/Python/MATLAB.
Pretty much the entirety of the OCaml language in Backus-Naur form:
Most people prefer MATLAB or R for their read-eval-print-loop environments. OCaml has utop for REPL but nevertheless comes with compilers and production quality tools, e.g. ocamldep (dependency generator), ocamlopt (optimizing native code compiler), ocamldebug (debugger).
Very good summary!
How about supported types of backtesting and flexibility to extend these:
- which one supports walk-forward testing? I've been told quantstrat has somewhat limited WFA capabilities.
- how flexible is it to build on top of an existing R / Python module?
The problem with Python's zipline engine is that it was built for a web interface, and it barely gets through a single backtest iteration. Make it do simple optimizations on multiple parameters, and you'll grow old waiting for the "zip" finish line. And I fully agree with Python being best for "faster base processing, traditional program syntax friendly, and better generalized programming".
However, should we also vote which goals / problems we try to fix are more important?
For me, FAST idea research, optimization and results analysis are more important than flexible interactive graphs, integration with the thousand language flavors out there in the wild, and IDE perfection (through Visual Studio), which are always good to have but would be second priority.
So, the poll is great, but relative to what? Looking at the pros/cons listed for the two main competitors, it might show two kinds of dominant voters:
- the ones who voted with the (trading-related) problems we're trying to solve as their priority
- the ones who voted out of their passion for one or the other, without much consideration of the problems to be solved
I think quantstrat's WFA is fine. It's not great, but it is workable. I have used it myself on several occasions, and it does support multi-core processing, which is definitely a nice feature given that R is usually single-threaded.
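For readers who have not met walk-forward analysis, here is a minimal Python sketch of the rolling in-sample / out-of-sample split it relies on. This is not quantstrat's implementation; the function name `walk_forward_windows` and its parameters are made up for illustration, and it assumes fixed-size windows that slide forward by one test block at a time.

```python
from typing import Iterator, Tuple


def walk_forward_windows(
    n_bars: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) bar-index ranges that roll forward through the data.

    Hypothetical helper: parameters are optimized on `train`, then the chosen
    parameters are evaluated on the `test` block that immediately follows.
    """
    start = 0
    while start + train_size + test_size <= n_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide forward by one out-of-sample block


windows = list(walk_forward_windows(n_bars=10, train_size=4, test_size=2))
for train, test in windows:
    print(list(train), "->", list(test))
```

Each window is independent of the others, which is why this kind of test spreads across multiple cores so naturally.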
I looked at zipline one time and saw immediately that it did not fit my needs. It is equity focused, so if you require anything that has a multiplier (futures, options, swaps or basically any derivative), you're screwed. So I never even tried it.
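To make the multiplier point concrete, here is a small Python sketch of a P&L calculation that carries a contract multiplier. The `Instrument` class, the symbols and the multiplier of 50 are hypothetical and not taken from zipline or any other library; the only point is that an engine which implicitly assumes a multiplier of 1 will misstate P&L for derivatives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instrument:
    symbol: str
    multiplier: float = 1.0  # 1.0 for plain equities


def pnl(instrument: Instrument, quantity: int, entry: float, exit_price: float) -> float:
    """P&L in currency units: quantity * price change * contract multiplier."""
    return quantity * (exit_price - entry) * instrument.multiplier


stock = Instrument("XYZ")                    # hypothetical equity
future = Instrument("FUT", multiplier=50.0)  # hypothetical futures contract

stock_pnl = pnl(stock, 100, 20.0, 21.5)
future_pnl = pnl(future, 2, 4000.0, 4010.0)
print(stock_pnl, future_pnl)
```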
If you do not feel like building your own backtesting infrastructure, I would say that quantstrat is the way to go. It is the closest thing to a professional framework you can get in either language. It can handle cash and derivative products, and it can do WFA and optimization out of the box. The structure also makes sense, breaking things down into indicator, signal, rule, etc.
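As a rough illustration of that indicator / signal / rule breakdown, here is a Python sketch of the same three layers. It is only an analogy: quantstrat itself is an R package with its own API, and the function names here (`sma`, `cross_above`, `entry_rule`) and the sample prices are invented for the example.

```python
from typing import List, Optional


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Indicator: simple moving average, None until enough bars exist."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out


def cross_above(prices: List[float], indicator: List[Optional[float]]) -> List[bool]:
    """Signal: True on the bar where price moves from at-or-below to above the indicator."""
    signals = [False] * len(prices)
    for i in range(1, len(prices)):
        prev, cur = indicator[i - 1], indicator[i]
        if prev is None or cur is None:
            continue
        signals[i] = prices[i - 1] <= prev and prices[i] > cur
    return signals


def entry_rule(signals: List[bool], quantity: int) -> List[int]:
    """Rule: turn each signal into an order quantity (0 means no order)."""
    return [quantity if s else 0 for s in signals]


closes = [10.0, 9.0, 8.0, 9.0, 11.0, 12.0]  # made-up closing prices
orders = entry_rule(cross_above(closes, sma(closes, 3)), quantity=100)
print(orders)
```

Keeping the layers separate means an indicator or a rule can be swapped without touching the rest of the strategy.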
But I found limitations with quantstrat, mostly when designing portfolios of strategies, or when using advanced capital-aware position sizing or volatility adjustment for the portfolio. This was a wall I could not find my way around. Most people probably will not focus on this area much until they have several profitable strategies. Also, quantstrat is meant to be used with tick data, so there are a few artifacts if you only have OHLC data. This means that some calculations on trailing stops and other orders may come out different than expected.
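For anyone unsure what capital-aware, volatility-adjusted sizing means, here is one simple Python sketch of the idea. It is just one common formulation and not necessarily the one used in my framework; the function name, the 1% risk fraction and the volatility figures are all assumptions chosen for illustration.

```python
import math


def volatility_adjusted_size(
    equity: float,
    risk_fraction: float,
    unit_volatility: float,
    multiplier: float = 1.0,
) -> int:
    """Units to trade so that one 'volatility move' risks a fixed share of equity.

    equity          -- current account equity (the capital-aware part)
    risk_fraction   -- share of equity to risk per position, e.g. 0.01
    unit_volatility -- volatility in price points per unit (e.g. an ATR value)
    multiplier      -- contract multiplier, 1.0 for equities
    """
    if unit_volatility <= 0 or multiplier <= 0:
        raise ValueError("volatility and multiplier must be positive")
    risk_budget = equity * risk_fraction
    return math.floor(risk_budget / (unit_volatility * multiplier))


# Made-up numbers: 100k account risking 1% per position.
shares = volatility_adjusted_size(100_000.0, 0.01, unit_volatility=3.0)
contracts = volatility_adjusted_size(100_000.0, 0.01, unit_volatility=8.0, multiplier=50.0)
print(shares, contracts)
```

Because the size depends on current equity, every strategy in a portfolio has to see the same shared account state, which is the part that is hard to bolt onto a per-strategy backtester.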
Overall I ended up designing my own framework in Python. Why did I choose Python after being such a staunch R supporter? Two reasons. First, it's easier to find developers for Python than for R, and I may need to hire people in the future. Second, translating from R to C++ is a pain in the ass. It seriously takes massive amounts of time for the IT team to translate the logic, debug it and simulate it to make sure we get identical results. Avoiding that saves an immense amount of time, and the IT team and the quant strategists can now interface and "speak" the same programming language. Coding logic, ideas and paradigms can all be easily shared across departments, even if the end result will be Python, C++ or a mix of the two. The basic logic, structure and ideas can all be blocked out in Python code and seen working at that level, then moved into C++ if they require API access or speed optimization. Or we just refactor the Python and the job is done.
For us, right now, having that added latency is not that big of a deal, as our time horizon for trades is fairly long. The additional slippage or latency is not that important. Also, we can interface C++ with Python much more easily than with R, although both languages do support it. Our execution API is C++, which is why I mention it.
It does point again to Python, as a developer will be more familiar with the syntax of Python than that of R. Personally, I like R's syntax better because I think more like a mathematician than a coder. But I am not the only person working on this project, so in the end it was better to switch.