An experiment on curve fitting

May 29th, 2010, 11:29 AM

Hypothesis:
An automated strategy can be profitable if curve-fitted to recent market conditions.

Prelude:
Early automated trading I had done in Sept to Oct 2009 gave me profitable results with weekly optimizations of a strategy I was running with real money. The actual results were break-even to slightly unprofitable, but some of the losing trades were caused by human error in operating and administering the strategy, or by technical glitches, rather than by the strategy itself. Without those problems, the results would have been profitable.

I stopped trading algorithmically at the time, though, because I realized there were a lot of market dynamics that I was naively ignoring, such as the best time to trade, multiple time frame analysis, better money management, etc., and that I should become a better discretionary trader before trying to automate anything. So I took a break from trading automatically with real money and traded manually with real money for about 6 months, experimenting with various setups, with varying degrees of success, resulting in an overall loss. Sometimes the losses were due to lack of discipline, lack of confidence in my trading plan, psychological challenges, poor execution in a fast-moving market (CL!), or a combination of all of the above. After 6 months of real money trading, I decided to refocus my efforts on automated trading so I can develop a trading plan with setups and risk/reward expectancies that I can be confident in and that can be executed automatically without my fat fingers, psychological disorders and sleeping habits getting in the way.

Purpose:
The debate about whether this can be done, and whether or not one should trade automatically, rages in other threads on nexusfi.com (formerly BMT), so I won't inflame that debate here. My purpose here is just to show some research about curve fitting and see whether curve fitting has a useful place in the trader's tool belt. "Curve fitting" is usually depicted as bad, evil, and perilous to one's account. My purpose is to solidify or weaken that perception with a simple case study.

May 29th, 2010, 12:08 PM

Experiment:
I have an engulfing reversal strategy that I wrote back in December 2009 that has provided some interesting but inconsistent results. I like reversal strategies because they perform well with expectancies larger than your risks. I've been backtesting and optimizing this thing off and on for a while, and for each period I test I get different results with different settings, so I'm inclined to throw it away. However, I thought that if I could curve-fit it to recent market conditions on a weekly basis, I could maybe find some value in it.

Optimization Scenarios:
One important thing to note is that the parameters that I curve-fitted have nothing to do with entry signals; they only determine risk and reward and whether or not there is enough volatility to trade.

The strategy is using price action and there is an EMA(20) hard-coded. The reversal has to happen below the EMA for a long and above the EMA for a short. There's a 15-bar look back period for determining HH's and LL's but that is also not optimized. All optimizations and backtests are on ES on a 5min chart from 8:00am EST (5am my time) to 3:30pm EST (12:30pm my time) for deciding whether to enter a trade or not.

The results of back tests are fairly realistic because I'm using minute bars, signaling on bar close, and commissions are included. In reality, I may get a little slippage here or there, but the ES is so liquid the amount of slippage is assumed to be negligible.

Optimization and Backtesting Methodology:
The approach I will use to curve fitting is to optimize on the last 3 weeks of trading to determine how I should trade the next week. The "next week" trading is accomplished with a backtest on that week using the optimized results from the 3 previous weeks. The optimizations are performed for net profit. There are only 3 parameters which are being optimized:

Min ATR - The minimum size of ATR(5) to take a trade. Any setups that occur with an ATR(5) smaller than this number will be ignored.

ATR Profit Factor and ATR Stop Factor - These are factors to be used with ATR to determine how large my profit target and stop loss levels should be relative to the current volatility.

Let apf = ATR Profit Factor and asf = ATR Stop Factor

So, my profit target of a trade is apf * ATR(5) and my stop loss level is asf * ATR(5).

The strategy requires that apf > 2 * asf, otherwise it doesn't even consider trading. So this means that any optimization scenario where apf <= 2 * asf is ignored. Unfortunately, it still goes through each bar and decides not to trade, wasting time. I've raised this weakness with Ninjatrader and hope they can add a feature where during optimization we can say "don't trade with these parameter settings, go to the next setting" and avoid going through each bar of the test not trading, but I digress.
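The ATR-based bracket and the apf > 2 * asf rule above can be sketched outside the platform. This is only an illustration in Python, not Ninjatrader code: the function names and the sample parameter values are made up for the example, and the idea is simply to drop invalid combinations before any bar-by-bar test is run.

```python
from itertools import product


def valid_parameter_grid(min_atrs, profit_factors, stop_factors):
    """Yield only (min_atr, apf, asf) combinations where apf > 2 * asf."""
    for min_atr, apf, asf in product(min_atrs, profit_factors, stop_factors):
        if apf > 2 * asf:
            yield (min_atr, apf, asf)


def bracket_prices(entry, atr, apf, asf, is_long):
    """Return (target, stop) prices sized from the current ATR(5) value."""
    direction = 1 if is_long else -1
    target = entry + direction * apf * atr
    stop = entry - direction * asf * atr
    return target, stop


# Hypothetical parameter ranges, not the ones used in the experiment.
grid = list(valid_parameter_grid([1.0, 1.5], [2.0, 3.0, 4.0], [1.0, 1.5]))

# A long entry at 1100.00 with ATR(5) = 2.0, apf = 3.0, asf = 1.0.
long_target, long_stop = bracket_prices(1100.0, 2.0, 3.0, 1.0, is_long=True)
```

With these sample ranges, half of the 12 combinations are rejected up front, which is the time saving the feature request is after.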

More Strategy Details:
Stops are tightened at various levels of profitability using a custom trailing stop technique, eventually reaching break even at some point. Lines are drawn on the chart for the target and the trailing stop to help me debug my target calculations and money management adjustments.

Because the strategy has a trailing stop, and a large expectancy relative to risk, it can still be profitable with win rates of 20-30% if the trader can psychologically handle being wrong 70-80% of the time.
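To put rough numbers on the win-rate claim, here is a small sketch of the break-even arithmetic. It assumes every winner hits the full target and every loser hits the full stop, and it ignores commissions and the trailing stop, so it is a simplification of the actual strategy. The function names are made up for the example.

```python
def breakeven_win_rate(reward_to_risk):
    """Win rate at which expectancy is zero for a given reward:risk ratio."""
    return 1.0 / (1.0 + reward_to_risk)


def expectancy_in_r(win_rate, reward_to_risk):
    """Average result per trade, measured in multiples of the amount risked."""
    return win_rate * reward_to_risk - (1.0 - win_rate)


# At the minimum allowed ratio (apf just above 2 * asf) break-even is about 33%.
at_minimum_ratio = breakeven_win_rate(2.0)

# A 4:1 ratio breaks even at a 20% win rate.
at_four_to_one = breakeven_win_rate(4.0)

# Winning 25% of the time at 4:1 earns a quarter of the risk per trade.
edge = expectancy_in_r(0.25, 4.0)
```

Under these assumptions, a 20-30% win rate needs a reward:risk ratio of roughly 2.3:1 to 4:1, or losses that are cut smaller than the full stop by the trailing stop.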

May 29th, 2010, 12:21 PM

Results:
I picked the current ES contract (ES-06-10) and started my 3 week window at the first week in which the contract traded the predominant volume. This is a fairly small sample size but illustrative nonetheless.

Columns A-B define the 3 week optimization range. Columns C-E are the optimal settings of the 3 parameters we optimize on. Columns F-I are more data about the optimized results.

A backtest is then performed for the following week, defined in columns K-L, using the optimized parameters in C-E. The results are in columns M-P.
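The window layout described above (3 weeks of optimization, then 1 week of backtest, rolled forward a week at a time) can be sketched like this. The start date is an assumption: it was picked so that one of the windows lines up with a 4/25 to 5/15 optimization range and the 8th test week ends on 5/29, and Sunday-to-Saturday weeks are assumed as well.

```python
from datetime import date, timedelta


def walk_forward_windows(first_week_start, n_tests, train_weeks=3):
    """Return (train_start, train_end, test_start, test_end) per test week.

    Weeks are 7-day blocks beginning on first_week_start.
    """
    windows = []
    for i in range(n_tests):
        train_start = first_week_start + timedelta(weeks=i)
        train_end = train_start + timedelta(weeks=train_weeks) - timedelta(days=1)
        test_start = train_end + timedelta(days=1)
        test_end = test_start + timedelta(days=6)
        windows.append((train_start, train_end, test_start, test_end))
    return windows


# Assumed start date (a Sunday); 8 test weeks as in the results.
windows = walk_forward_windows(date(2010, 3, 14), n_tests=8)

for train_start, train_end, test_start, test_end in windows:
    print(train_start, train_end, "->", test_start, test_end)
```

Each test week is only ever traded with settings found on the 3 weeks before it, so the test weeks are out of sample with respect to their own optimization.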

As you can see, of the 8 weeks of trading like this only 1 week was profitable. Actual trading results varied miserably from the optimized results.

The window from 4/25 to 5/15 produced 0 trades when I optimized the entire 3 week period for some reason, but optimizing just the 1st week of that 3 week period did produce trades, so I just used that first week's data for the back test. I think it has something to do with ZenFire's handling of the "flash crash" on May 6th, and/or I may have some gaps in my historical data that Ninjatrader's strategy analyzer can't handle. I'm not sure what's happening there, so I highlighted the results in red as a potential outlier to discard.

May 29th, 2010, 12:26 PM

Conclusions:
While the sample size is small and by no means conclusive, this example supports the widely held belief that curve fitting is bad. Proponents of curve fitting could say that my approach was wrong, or that it's a poor trading strategy. Out of curiosity I'm inclined to keep looking for ways where curve fitting can work, so I will continue this effort here in this thread.

Action Items:
Improve the strategy: The strategy just uses the 1 time frame, which I believe is a weakness. Further experiments should add at least one higher time frame and only consider taking reversals in the direction of the higher time frame trend, which I believe would provide better results. Also, more aggressive money management techniques could be used to raise stops more aggressively as levels of profit are reached instead of only working towards break-even. Also, different ways of determining volatility-based risk/reward measurements can be used.

Use more sophisticated curve fitting approaches: Instead of a simple 3 week window, we could, for example, take a weighted average of settings from the last 2 weeks, the 4 weeks prior to that, and the 12 weeks prior to that, giving more weight to the more recent weeks but not totally ignoring performance over longer historical periods.
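One way the weighted-average idea could look is sketched below. The weights (0.5 / 0.3 / 0.2) and all parameter values are invented for illustration; nothing here has been tested against real optimizations.

```python
def blend_settings(settings_by_window, weights):
    """Weighted average of each parameter across optimization windows."""
    total_weight = sum(weights[name] for name in settings_by_window)
    parameter_names = next(iter(settings_by_window.values())).keys()
    return {
        param: sum(
            weights[name] * settings[param]
            for name, settings in settings_by_window.items()
        ) / total_weight
        for param in parameter_names
    }


# Hypothetical optimal settings found on each look-back window.
settings_by_window = {
    "last_2_weeks": {"min_atr": 1.0, "apf": 3.0, "asf": 1.0},
    "prior_4_weeks": {"min_atr": 1.5, "apf": 4.0, "asf": 1.0},
    "prior_12_weeks": {"min_atr": 2.0, "apf": 5.0, "asf": 1.5},
}

# Hypothetical weights: recent weeks count most, older periods still count.
weights = {"last_2_weeks": 0.5, "prior_4_weeks": 0.3, "prior_12_weeks": 0.2}

blended = blend_settings(settings_by_window, weights)
```

Averaging can produce a combination that was never optimal on any single window, so the blended apf and asf should still be checked against the apf > 2 * asf rule before use.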

Different Optimization Techniques: We may be better off optimizing on profit factor instead of net profit. Using settings that provide better risk/rewards often results in fewer trades, but it also provides a less volatile portfolio if the profit factor stays intact. In the past I've noticed that going for larger profit factor setups also results in fewer trades, which can be undesirable to a trader who is impatient and wants to trade more frequently.
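For reference, the difference between the two optimization targets can be shown with a few lines. Profit factor here is the usual gross profit divided by gross loss; the two trade lists are made-up numbers, not results from the experiment.

```python
def net_profit(trades):
    """Sum of all trade results."""
    return sum(trades)


def profit_factor(trades):
    """Gross profit divided by gross loss (infinite if there are no losses)."""
    gross_profit = sum(t for t in trades if t > 0)
    gross_loss = -sum(t for t in trades if t < 0)
    if gross_loss == 0:
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss


# Hypothetical per-trade results in dollars for two parameter settings.
frequent = [100, -50, 100, -50, 100, -50, 100, -50]
selective = [150, -30, 30]

best_by_net_profit = max([frequent, selective], key=net_profit)
best_by_profit_factor = max([frequent, selective], key=profit_factor)
```

In this example, optimizing on net profit picks the setting that trades more often, while optimizing on profit factor picks the one with fewer trades and smaller losses.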

I have not failed. I've just found 10,000 ways that won't work. - Thomas Edison

May 29th, 2010, 01:39 PM

Hi shodson,

first of all, thanks for bringing the subject up. I have not traded automated systems and I do not intend to do so during the next year, as this is much more demanding than discretionary trading and requires a larger variety of skills.

However, the question of curve fitting also applies to the discretionary trader. After all, even as a discretionary trader I follow a method that supposedly provides an edge in the markets. To be sure that this edge exists, I need to backtest this method over a large number of trades.

Overfitting a curve describes the risk

(1) that you are optimizing your trading system by taking a sample which is too small
(2) that the conditions that prevailed for your in-sample period will change in future
(3) that you are using too many parameters to tweak the results
(4) that the parameters used are not independent from each other

These few points already allow us to draw a number of conclusions:

If you develop a system which uses daily data, and you want to have a number of in-sample trades which is statistically significant, you will need quite a long in-sample period, and it becomes likely that the market conditions have changed in a way that the test results will not be predictive of any future results. So for daily trading, points (1) and (2) are somewhat contradictory. To cope with this problem, a method called walk-forward analysis cuts the sample period into bits and pieces to use the same data several times, both as in-sample and out-of-sample data. I am not really convinced that this solves the underlying problem. Also I do not think that you could easily trade a breakout system like that of the Turtles today.

For systems that use intraday data and that use smaller timeframes to trade, it is technically easier to backtest ideas. If I have a system that trades 5-minute bars, and I want a lookback period of 5000 bars, this requires less than one month of data. And yes, I do believe that most of the species of feedback loops that could be observed in April can still be found in May. However for daily data, I would need 20 years of data to get my 5000 bars for the backtest. And no, I do not believe that market conditions back in 1990 were the same as today. So I agree that some curve-fitting might be beneficial, as long as the circumstances that prevailed during the in-sample period are not likely to change for the production period of the system.
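The bar-count arithmetic above can be checked with a quick sketch. The session lengths and the figure of 252 trading days per year are assumptions: a near-24-hour session (about 23 hours) gives the "less than one month" result, while restricting to an 8:00am to 3:30pm session like the one in the first post would need considerably longer.

```python
import math


def trading_days_needed(bars, bar_minutes, session_minutes):
    """Trading days required to collect a number of intraday bars."""
    bars_per_day = session_minutes // bar_minutes
    return math.ceil(bars / bars_per_day)


# 5000 five-minute bars with an assumed 23-hour session (1380 minutes).
days_full_session = trading_days_needed(5000, 5, 23 * 60)

# The same 5000 bars with an assumed 7.5-hour session (450 minutes).
days_day_session = trading_days_needed(5000, 5, 450)

# 5000 daily bars at an assumed 252 trading days per year.
years_of_daily_data = 5000 / 252
```

So the intraday lookback spans weeks to a few months depending on the session, against roughly two decades for the same number of daily bars.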

One key is human behaviour. There is a natural rhythm of market activity that repeats itself every day: pre-market news -> market open -> initial balance -> late morning -> noon session -> afternoon session -> approaching the close. Each of these periods has its own logic and function, so an intraday system (with the exception of HFT) would need to take this into account. If curve-fitting means translating such repetitive behavior into a period of a moving average, then curve-fitting will work as long as humans take a break at noon.

The second key is independence of the parameters. Most of the systems I have seen use functions f(1), f(2), f(3) all derived from price and then try to optimize parameters for those functions. This is like a dog that tries to catch its tail; it simply cannot work. However, if you look at

- price
- time as a second dimension of price
- volume
- market breadth
- intermarket correlations

there is information independent from price. I would think that a system which tweaks 4 parameters for different inputs is more reliable than a system that only uses price and then optimizes 4 parameters for price based indicators. Also note that too-simple relationships do not exist:

- Expanding price range may provoke continuation (breakaway or continuation gap) or reversal
- High volume may be breakout or stopping volume
- Intermarket correlation can change from positive to negative and back

so you would need additional filters such as a fear index (flight to safety) or a time-of-day filter that allows you to adjust for volatility and volume.

So in my opinion, yes, you need some curve fitting, but

- you need to fit the curve to a substance that will not evaporate before it can be exploited
- and you need to apply additional filters that qualify market conditions for the in-sample, out-of-sample and production period of the system

Edit: Written as a response to #1, before reading #3, #4 and #5

May 31st, 2010, 06:34 AM

The book "Evidence Based Technical Analysis" has some great methodology describing how to backtest properly and a lot of detail on data mining vs. curve fitting, and also a lot of detail on in sample and out of sample testing.

I always use a walk-forward analysis in strategies I write, and I try to initially exclude a good amount (months) of out of sample data. The reason is, let's say a strategy seems to perform well, and then I decide to do a walk-forward analysis. In the first run, I may have excluded the last three months of data (let's call it March, April and May 2010).

The first run I may do an extensive walk-forward and then for the OOS test I will include March 2010. If the results are bad, and I make further changes to the strategy, I can no longer designate March 2010 as out of sample data. Regardless of whether or not in the strategy backtest I tell MultiCharts that March 2010 is to be treated as out of sample, the fact is that it isn't any longer.

So the second strategy test would need to be tested again but this time using April 2010 as the out of sample data. Again, if I decide to make changes to the strategy, I can no longer compare the before and after of those changes against the April 2010 results. I cannot say, ok, after these changes the strategy is now performing better -- because, in fact, the April 2010 results are now in sample. My brain has placed them in sample, because I am using April 2010 to make a determination if the strategy is better or worse after the latest change.

So then on the third run, the third tweak of the strategy, I now include May 2010 which is the last month I have from my original "held back" out of sample data.

I've found this to be the best approach. I generally backtest for 2 years using tick level data. I find holding back the most recent three months for OOS works well in this manner. But, I also find that sometimes the strategy ends up not performing as expected and I exhaust my three months of OOS data and have no more OOS data to try, except for a live market, which can take months to test against.
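The bookkeeping behind this approach can be sketched as a simple "budget" of held-back months, where each month may be looked at once and then counts as in sample. The class and method names are made up for this illustration and are not part of MultiCharts or any other platform.

```python
class HoldoutBudget:
    """Tracks held-back months so each is used as out-of-sample data only once."""

    def __init__(self, months):
        self._unused = list(months)
        self.used = []

    @property
    def remaining(self):
        return len(self._unused)

    def next_oos_month(self):
        """Hand out the next untouched month; afterwards it is in sample."""
        if not self._unused:
            raise RuntimeError("no untouched out-of-sample data left")
        month = self._unused.pop(0)
        self.used.append(month)
        return month


budget = HoldoutBudget(["2010-03", "2010-04", "2010-05"])

first_run = budget.next_oos_month()   # test the first version on March
second_run = budget.next_oos_month()  # after a tweak, March is spent; use April
third_run = budget.next_oos_month()   # after another tweak, use May
```

Once the budget is empty, a further call raises an error, which corresponds to the situation where the only remaining out-of-sample test is the live market.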

It is for the above reason that I am strongly against re-optimizing or re-curve fitting a strategy after the fact (after the strategy is "live"), because you are optimizing it to try to improve the out of sample data, but in reality all you are doing is including more and more in sample data.

Mike


Discussion in Traders Hideout
