During development, when do you decide to quit or keep coding a Strategy? See Equity
Again, I would like to hear some opinions about the strategy development process.
I started with an idea and chose the 2016-2017 years to code and test the first raw strategy for the FDAX with MultiCharts, as usual. I left 2018 out because I want untouched data in case I perform a walk-forward optimization later and need out-of-sample data.
OK, the idea was built on Bollinger Bands using an unusual 90-minute timeframe on a regular bar chart. In 2016/17 it performed very well in the backtest, so I tried adding more years going backwards. I added 2014/2015 and the system was still "good enough" to keep developing. Then, on the fly, I added another three years, 2011-2013, and in those years the results were horrible.
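For readers who want to experiment with the general shape of such a system, here is a minimal sketch of a Bollinger-band mean-reversion backtest. Everything in it is an assumption for illustration: the 20-period window, the 2-standard-deviation bands, the fade-the-band entry rules, and the synthetic random-walk prices standing in for FDAX 90-minute bars. It is not the OP's actual system.

```python
import random

def bollinger_signals(closes, window=20, num_std=2.0):
    """Return +1 (long), -1 (short), or 0 for each bar."""
    signals = [0] * len(closes)
    for i in range(window, len(closes)):
        sample = closes[i - window:i]
        mean = sum(sample) / window
        std = (sum((c - mean) ** 2 for c in sample) / window) ** 0.5
        if closes[i] < mean - num_std * std:
            signals[i] = 1       # below lower band: fade the move, go long
        elif closes[i] > mean + num_std * std:
            signals[i] = -1      # above upper band: go short
    return signals

def equity_curve(closes, signals):
    """Mark-to-market P&L, holding each signal until the next one."""
    curve, position = [0.0], 0
    for i in range(1, len(closes)):
        curve.append(curve[-1] + position * (closes[i] - closes[i - 1]))
        if signals[i] != 0:
            position = signals[i]
    return curve

random.seed(1)
closes = [10000.0]
for _ in range(400):                       # synthetic random-walk "prices"
    closes.append(closes[-1] + random.gauss(0, 10))
curve = equity_curve(closes, bollinger_signals(closes))
print(f"final P&L: {curve[-1]:+.1f} points over {len(curve)} bars")
```

Swapping in real bar data and the OP's actual entry/exit logic is the point where the year-by-year behavior described above would start to show up in the curve.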
So here is the equity line
Please keep in mind that I've just started to develop this strategy, and after a day and a half of coding and testing I came up with this "mid-development" result. There's no way to fix those three horrible years just by optimizing the strategy, but from 2014 to the end of 2017 it looks promising.
I'm sure most of you (automated system traders) have seen a lot of situations like this. I'm curious to know when you reach the moment where you say, "OK, I'll abandon this idea."
Or, if you liked that 2014-2017 period enough, would you spend more time on coding at the RISK of overfitting the system so it looks good across all the years?
I'm interested in hearing some workflow suggestions from experts for CASES LIKE THIS.
If you can, view the 2011-2013 data on a chart. It's possible the data is simply incomplete. Often, the further back you go, the less reliable the historical data becomes. You may find periods where multiple candles are missing.
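A quick programmatic check can complement the visual one. This sketch flags gaps in a series of bar timestamps; it assumes bars should arrive every 90 minutes, and a real check would also need the exchange calendar (session breaks, holidays), which is ignored here.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, expected=timedelta(minutes=90)):
    """Return (prev, curr) pairs where spacing exceeds the bar interval."""
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        if curr - prev > expected:
            gaps.append((prev, curr))
    return gaps

# Build ten consecutive 90-minute bars, then delete two to simulate
# the kind of missing candles older historical data often has.
bars = [datetime(2012, 3, 5, 8, 0) + timedelta(minutes=90 * i) for i in range(10)]
del bars[4:6]
print(find_gaps(bars))   # one gap spanning the two deleted bars
```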
Most systems go through periods where they just don't perform well. Markets change but the system doesn't; it's an unfortunate fact of automated systems. You can either ride through the rough patches or figure out a way to recognize when one is happening, and then either switch strategies or switch the system off and trade manually. Although I think it's possible to build a strategy flexible enough to adjust to changing conditions, I also think getting there is a long and tough journey. I have yet to create one adaptive enough to handle the dynamics of trading. That doesn't mean I'm giving up.
I only stop working with a strategy when I can see that it has a high frequency of bad trades.
The way I do things is different from what the OP described, but basically it is almost never a good idea to take a system tested on all the data and then try to make it better, or to engineer out the bad trades/days/months/years. You'll end up with better backtest results, for sure, but that usually does not translate into better live results.
Personally, if I had a walk-forward equity curve that looked like that (I believe some or most of yours is optimized, so we are talking apples and oranges here), I would evaluate the curve as-is, with no rule changes, and if it passed my criteria I'd go to my next step. If it failed because of that drawdown period, I'd move on to the next strategy idea.
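For readers unfamiliar with how a walk-forward equity curve differs from an optimized backtest, here is a toy sketch of the loop: optimize on a rolling in-sample window, then record only the following out-of-sample window's result. The `optimize` and `evaluate` functions are deliberately trivial placeholders, not a real trading objective.

```python
def evaluate(data, param):
    """Toy objective: sum of values above a threshold `param`."""
    return sum(d for d in data if d > param)

def optimize(data):
    """Pick the candidate parameter with the best in-sample score."""
    return max((5, 10, 20), key=lambda p: evaluate(data, p))

def walk_forward(data, in_sample=100, out_sample=25):
    """Stitch together out-of-sample results from rolling windows."""
    oos_results, start = [], 0
    while start + in_sample + out_sample <= len(data):
        train = data[start:start + in_sample]
        test = data[start + in_sample:start + in_sample + out_sample]
        best = optimize(train)                 # fit on in-sample only
        oos_results.append(evaluate(test, best))  # keep OOS result only
        start += out_sample                    # roll the window forward
    return oos_results

data = [(i * 7919) % 30 for i in range(300)]   # deterministic toy series
print(walk_forward(data))
```

The concatenated `oos_results` are what the walk-forward equity curve is built from; none of those segments were ever optimized over, which is why the poster treats that curve differently from an in-sample one.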
There are three basic ways to approach a question like this: (1) a test, i.e. rejecting a null hypothesis; (2) data discovery, i.e. letting study of the data provide insight; and (3) model-driven, i.e. building a causal model. I might recommend looking into Judea Pearl's "The Book of Why", or perhaps watching this video of a lecture he gave, as a way of starting to think about these sorts of questions. There is no simple answer to these sorts of problems.
However, a more practical set of thoughts might be to consider the reason certain developers follow certain methodologies and their underlying assumptions and when they might work or might not work:
Backtest over a long history. Why? Because if a system works over a long period of time, it suggests you may have found a persistent or stable edge.
Validate with walk-forward optimization. Why? Because if your system can adapt to changing dynamics, it suggests the general principle behind the system may be valid, a form of generalization, even if the specifics change.
Backtest over the most recent data only. Why? Because it suggests you may be able to take advantage of the current market dynamics or unfolding uncertainties. You are not seeking a stable, persistent edge but trying to exploit a dynamic, unfolding situation. Ask what changed.
Use a three-split approach. A reasonable compromise is a three-way data split. One part you are allowed to optimize on and test against. The second split is your active validation: you evaluate against it but do not optimize over it. The third split is your final verification, or hold-out, which you do not look at until the very end.
I honestly think your general approach is natural but flawed. It is a natural thing to do, and I am sure I have done such things myself while working on systems. But unless you set up some requirements in advance, or unless you are developing in order to learn from the data, it looks more like you might be playing mental games rather than actually testing an idea. As an aside, I do not always agree with the "data blind" approach. I think a data-mining approach has value when approached with the proper orientation. Basically, you have to understand the orientation or methodology you are following, and why or how it makes sense. If you are trying to understand characteristics of the data that will help you discover an edge, this exploratory approach makes sense. On the other hand, if you are trying to test an idea in the form of a null-hypothesis rejection, you need to set your requirements up front. That is what I mean by playing games: you could go back another few years, and if the system did well you might change your mind again, while if it did poorly you might reject it, with no pre-committed criterion either way.
Well, your third split is your validation split. Right, the three-split approach (train, test, validate) is useful because it lets you learn from the data while still holding out a split to minimize data leakage. So it is a compromise that allows you to learn from the data, which I think has a lot of value, while still keeping an out-of-sample "ground truth" set.
For example, you build your strategy and optimize it over your training set (1). You then verify it against your test set (2). It looks pretty good, but you learn that you had steep losses on Mondays, so you add a rule to prevent trading on Mondays. Now everything looks good, but there is no ground truth as to whether your results will generalize or hold. So your final "ground truth" check is your verification set (3).
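The Monday-rule example can be made concrete with a toy sketch: a filter discovered on the test split is only trusted after it also holds on the untouched final split. The trade records here are invented (weekday, P&L) tuples purely for illustration.

```python
def pnl_without_weekday(trades, skip_day):
    """Total P&L of (weekday, pnl) trades, skipping one weekday."""
    return sum(pnl for day, pnl in trades if day != skip_day)

# Toy trade logs for the two splits (invented numbers).
test_trades = [("Mon", -50), ("Tue", 30), ("Mon", -20), ("Fri", 40)]
final_trades = [("Mon", -10), ("Wed", 25), ("Thu", 15)]

# Rule discovered on the test split: skip Mondays.
print(pnl_without_weekday(test_trades, "Mon"))   # 70, vs. 0 with Mondays
# Ground-truth check on the final split, done exactly once:
print(pnl_without_weekday(final_trades, "Mon"))  # 40, vs. 30 with Mondays
```

If the rule had helped on the test split but hurt on the final split, that would be the "does not generalize" signal the poster describes, and the honest move is to stop there rather than re-tune against the final split.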
But keep in mind that all of these are forms of games, which probably hinge as much on the quality of the original idea (or on the quality of the insights gained from data exploration, if using that approach); the actual value of any method is in its results.
To answer the original question, I use backtesting mostly for debugging code and for some optimization. I'm a proponent of out-of-sample tests, but the optimizations have to be logical as well.
I put a lot of emphasis on small-volume live testing, and will discard a losing robot at that point. It costs me money, but that's the only way I know how to estimate slippage. Of course, if optimizations won't yield paper profits, I wouldn't venture a live test.