optimization/overfitting and mess with degrees of freedom - futures io


Discussion in Traders Hideout





  #1 (permalink)
krzysiaczek99
stockholm sweden
 
 
Posts: 9 since Sep 2012
Thanks: 1 given, 2 received

The purpose of this topic is to establish some facts about degrees of freedom and their impact on overfitting.

The first definition I found is in Pardo's book 'The Evaluation and Optimization of Trading Strategies'.

Quite different definitions appear in the help for MSA and Adaptrade Builder, other tools that deal with optimization and overfitting.

So the main question arises: what is the correct approach?

In the case of Adaptrade Builder/MSA, where the approach is based on the number of trades and rules, some questions immediately come to mind:

1) What exact proportion of trades to rules would guarantee no overfitting?
2) It's easy to manipulate the number of trades to inflate the result, e.g. for a crossover strategy, by opening a new trade on every bar until the opposite crossover takes place.
3) The definition also uses the number of inputs. Does an input with e.g. 10 values (range 1-10, step 1) count the same as an input with 100,000 values?
4) What is the origin of this type of approach?
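Questions 2 and 3 can be made concrete with a small sketch. It assumes the "trades minus inputs" definition quoted from the Adaptrade Builder help further down in this post; the function names and all the numbers (100 trades, 20 signals, 15 bars per signal, 10 inputs) are made up for illustration only.

```python
from math import prod

def dof_trades_minus_inputs(n_trades, n_inputs):
    """Degrees of freedom as number of trades minus number of inputs."""
    return n_trades - n_inputs

def grid_size(values_per_input):
    """Number of parameter combinations an exhaustive optimizer would test."""
    return prod(values_per_input)

# Question 3: one input with 10 candidate values vs. 100,000 candidate values.
small_grid = grid_size([10])
large_grid = grid_size([100_000])
dof_small = dof_trades_minus_inputs(100, 1)
dof_large = dof_trades_minus_inputs(100, 1)
print(small_grid, large_grid, dof_small, dof_large)

# Question 2: the same crossover signals, booked either as one trade per
# signal or as one trade per bar held (hypothetical: 20 signals, 15 bars each).
signals, bars_per_signal, n_inputs = 20, 15, 10
dof_per_signal = dof_trades_minus_inputs(signals, n_inputs)
dof_per_bar = dof_trades_minus_inputs(signals * bars_per_signal, n_inputs)
print(dof_per_signal, dof_per_bar)
```

Under this definition the two inputs in question 3 get the same count even though the optimizer searches very different numbers of combinations, and the count in question 2 grows simply because the same exposure is split into more trades.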

Hopefully somebody can answer this. I'm opening a similar thread on the Adaptrade Builder Google group, hoping to get some answers from the Adaptrade/MSA author (M. Bryant).

Regards, Krzysztof

Excerpt from the Pardo book:

Consider degrees of freedom as the simulation sample size adjusted
for the number of conditions and rules placed upon it. The simulation test
space is reduced in proportion to the number of degrees of freedom that
are consumed by the rules and variables of the trading strategy.

Consider the following simple example as an illustration. Assume a
trading strategy uses a single moving average with a length of 30 days and
a test window size of 100 days. It is said that one degree of freedom is consumed
by each data point that is used in its calculation. Our 30-day moving
average then uses 30 degrees of freedom. To understand the relevance of
this in a more practical way, we determine the remaining degrees of freedom
according to the following formula.

DF = Degrees of Freedom
Udf = Used Degrees of Freedom
Odf = Original Degrees of Freedom
Rdf% = Remaining Percentage of Degrees of Freedom

Rdf% = 100 × [1 − (Udf/Odf)]
Rdf% = 100 × [1 − (30/100)]
Rdf% = 70%

The degrees of freedom left in our example are 70 percent, which
is not that attractive. As a rule of thumb, we would prefer that the
remaining degrees of freedom exceed 90 percent. Therefore, there is no
point in performing this simulation as stipulated.
On the other hand, consider a trading system that again uses a single
moving average with a period of 10 days and a test window size of
1,000 days.
Rdf% = 100 × [1 − (10/1,000)]
Rdf% = 99%
This simulation can proceed with 99 percent degrees of freedom. This
will produce a more statistically reliable historical simulation.
Consider two other applications of this principle. In the first, a trading
strategy has 100 rules and a simulation with 100 days of data
is considered. Applying our formula, we see that this leaves us with no
degrees of freedom. It is easy to see that this test is absurd. In contrast,
consider a simulation of a trading strategy with 1 rule and 100 days of data.
Even though it is a small data sample, the 99 percent degrees of freedom is
acceptable.
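
The formula in the excerpt above is easy to check numerically. Here is a minimal sketch; the function name `remaining_dof_percent` is my own and not from the book, and the four cases are the ones given in the excerpt.

```python
def remaining_dof_percent(used_dof, original_dof):
    """Rdf% = 100 * (1 - Udf / Odf), following the formula in the excerpt."""
    if original_dof <= 0:
        raise ValueError("original_dof must be positive")
    return 100.0 * (1.0 - used_dof / original_dof)

# (used DoF, original DoF) for the four cases discussed in the excerpt.
cases = {
    "30-day MA, 100 days": (30, 100),
    "10-day MA, 1,000 days": (10, 1000),
    "100 rules, 100 days": (100, 100),
    "1 rule, 100 days": (1, 100),
}

for label, (udf, odf) in cases.items():
    print(f"{label}: {remaining_dof_percent(udf, odf):.0f}% remaining")
```

Only the first and third cases fall below the 90 percent rule of thumb mentioned in the excerpt.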

THE CAUSES OF OVERFITTING
Overfitting is a direct result of the violation of some or all of the rules
of evaluation and optimization. These violations generally fall into five
categories:
1. Insufficient degrees of freedom
2. Inadequate data and trade sample
3. Incorrect optimization methods
4. A big win in a small trade sample
5. Absence of a Walk-Forward Analysis

Degrees of Freedom
It is a cardinal rule of statistical analysis that too many constraints—or
too few degrees of freedom—on a data sample will lead to untrustworthy
results. In other words, if the calculation of the formulae of a trading strategy
consumes too large a proportion of the data sample, the results of the
optimization will lack sufficient statistical validity and hence become unreliable.
Degrees of freedom and sample size are inextricably intertwined.
Insufficient degrees of freedom are still a major cause of overfitting.
To a large extent, degrees of freedom are simply a way to determine
whether there are enough data remaining to produce a valid trade sample
after all deductions have been made for the price data that are used to
calculate the trading rules, indicators, and so forth.

Measuring Degrees of Freedom
It is simple to measure degrees of freedom. To begin, it can be thought that
each data point in the sample represents “one degree of freedom.” If the
sample size is one thousand data points, then it begins with one thousand
degrees of freedom, that is, all of the data are unconstrained.
A degree of freedom then is said to be consumed or used by each trading
rule and by every data point necessary to calculate indicators.
To illustrate, consider two examples. Both use the same data sample,
which is a four data-point, two year, price history composed of opens,
highs, lows, and closes, or a total of 2,080 data points.
Example one is a trading strategy that uses a 10-day average of highs
and a 50-day average of lows. Average one uses 11 degrees of freedom:
10 highs plus 1 more as a rule. Average two uses 51 degrees of freedom:
50 lows plus 1 as a rule. The total is 62 degrees of freedom used. To convert
that to a percentage, divide degrees of freedom used by total available
degrees of freedom. The result is 3 percent. This is perfectly acceptable.
Example two is a trading strategy that uses a 50-day average of closes
and a 150-day average of closes. Average one uses 51 degrees of freedom:
50 closes plus 1 as a rule. Average two uses only 101 degrees of freedom:
100 additional closes plus 1 as a rule. The total degrees of freedom used
are 152. Converting to a percentage, we get 7.3 percent.
While this is still acceptable, from these examples, it is easy to see
how adding more indicators and rules or decreasing sample size can easily
lead to decreased confidence in the results. This will be made clear in the
examples in the next section.
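
The counting in the two examples above can be reproduced with a short sketch. It rests on my reading of the excerpt, which is an assumption: data on the same price field is counted once (so the longest lookback per field is what matters), and each average adds one more degree of freedom as a rule. The name `used_dof` and the 260 trading days per year are also assumptions, the latter chosen because it reproduces the excerpt's 2,080 data points.

```python
def used_dof(averages):
    """Count used degrees of freedom for a list of (price_field, length) averages.

    Overlapping data on the same price field is counted once, so only the
    longest lookback per field contributes; each average adds 1 as a rule.
    """
    longest = {}
    for field, length in averages:
        longest[field] = max(longest.get(field, 0), length)
    return sum(longest.values()) + len(averages)

# Assumption: about 260 trading days per year, 2 years, 4 prices per day.
TOTAL_POINTS = 2 * 260 * 4

ex1 = used_dof([("high", 10), ("low", 50)])
ex2 = used_dof([("close", 50), ("close", 150)])

print(ex1, round(100 * ex1 / TOTAL_POINTS, 1))
print(ex2, round(100 * ex2 / TOTAL_POINTS, 1))
```

This gives 62 and 152 used degrees of freedom, matching the totals in the excerpt.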


Then, from the Adaptrade Builder help:

Poor out-of-sample performance is usually caused by one of several factors.
One important factor is the so-called number of degrees-of-freedom in the in-sample segment.
The number of degrees-of-freedom, which is equal to the number of trades minus the number of rules and conditions
of the strategy, determines how tightly the strategy fits the data. Provided inputs are added for each parameter
in the strategy, the number of strategy inputs can be used as a proxy for the number of rules and conditions.
For example, if a strategy has 100 trades and 10 inputs, it has 90 degrees-of-freedom.
The more degrees-of-freedom, the less likely it is that the strategy will be over-fit to the market and the more likely
it is that it will have good out-of-sample performance.

The number of degrees-of-freedom can be increased during the build process by adjusting the weights for the number
of trades and/or the strategy complexity. All other things being equal, increasing the performance weighting for the number of trades will result in strategies with more trades and therefore more degrees-of-freedom.
Likewise, increasing the performance weighting for the complexity metric will result in strategies with fewer inputs, which will also increase the number of degrees-of-freedom.

Builder also incorporates the degrees-of-freedom into the build process via the significance performance metric.
In Builder, “significance” is based on the Student’s t test applied to the average trade.
It measures the statistical significance of the average trade; that is, the probability that the average trade will be greater than zero. The t test is based on the number of degrees-of-freedom but is a more complete measure of whether a strategy is over-fit than the number of degrees-of-freedom alone. One way, then, to improve out-of-sample performance is to use the significance metric to generate strategies that have a high statistical significance.
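
To make the significance idea above concrete, here is a rough sketch of a t statistic on the average trade. The help text does not give Builder's exact formula, so this is the textbook one-sample t statistic and not necessarily what Builder computes. Using "trades minus inputs" as the degrees of freedom follows the definition quoted above. The function name and the trade list are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def average_trade_t_stat(trade_profits, n_inputs):
    """Return (t statistic of the average trade, degrees of freedom).

    t is the mean trade divided by its standard error; the degrees of
    freedom are taken as number of trades minus number of inputs.
    """
    n = len(trade_profits)
    dof = n - n_inputs
    if n < 2 or dof < 1:
        raise ValueError("not enough trades for this number of inputs")
    spread = stdev(trade_profits)
    if spread == 0:
        raise ValueError("all trades are identical; t statistic is undefined")
    return mean(trade_profits) / (spread / sqrt(n)), dof

# Hypothetical per-trade profits, for illustration only.
trades = [120.0, -80.0, 45.0, 200.0, -60.0, 15.0,
          90.0, -30.0, 150.0, -10.0, 75.0, 40.0]

t, dof = average_trade_t_stat(trades, n_inputs=2)
print(f"t = {t:.2f} with {dof} degrees of freedom")
```

Turning t into a probability needs the Student's t distribution, for example `scipy.stats.t.sf(t, dof)` if SciPy is available. Note that adding inputs lowers the degrees of freedom and so raises the t value needed for the same significance level, which is how the trade count and the input count both enter the measure.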

  #3 (permalink)
krzysiaczek99
stockholm sweden
 
 
Posts: 9 since Sep 2012
Thanks: 1 given, 2 received


So I made a simple test. Following Pardo's calculation, the strategy below consumes the following DFs:

Length 25
Gain Limit 25
Thresh 100
BaseProf 7
ptStop 3
Reverse 1

Total: 161, plus a few for the rules.

The strategy is optimized on 75,000 1-minute bars, with an OOS period of 1,440 bars. The anchored WF test clearly shows that the strategy is overfitted: all in-sample net profits are positive, but the OOS profits are mostly negative.

Yet by Pardo's calculation more than 99% of the DFs are left free. So it looks like it doesn't work.
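
For reference, the arithmetic behind that figure, as a quick sketch. It follows this post's tally, where each input value is counted as used DFs, and applies Pardo's remaining-percentage formula to the 75,000 in-sample bars. The allowance of 10 for rules is my own placeholder for "a few".

```python
# Input values from the strategy above, counted as used DFs per this post.
inputs = {
    "Length": 25,
    "Gain Limit": 25,
    "Thresh": 100,
    "BaseProf": 7,
    "ptStop": 3,
    "Reverse": 1,
}
in_sample_bars = 75_000
rules_allowance = 10  # placeholder for "a few" rules, not from the post

used = sum(inputs.values())
remaining_pct = 100.0 * (1.0 - used / in_sample_bars)
remaining_pct_with_rules = 100.0 * (1.0 - (used + rules_allowance) / in_sample_bars)

print(used, round(remaining_pct, 2), round(remaining_pct_with_rules, 2))
```

Either way the remaining percentage stays well above the 90 percent rule of thumb from the book.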

See the attached screenshots.
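
For anyone unfamiliar with the anchored walk-forward layout, here is a minimal sketch of how the windows are laid out. Only the 75,000-bar in-sample size and the 1,440-bar OOS length come from the test above. The number of steps (5), the total data length and the function name are assumptions for illustration, and the optimizer itself is not shown.

```python
def anchored_walk_forward(n_bars, first_in_sample, oos_len):
    """Yield (in_sample, oos) index ranges for an anchored walk-forward test.

    The in-sample window always starts at bar 0 and grows by oos_len each
    step; the OOS window is the oos_len bars right after it.
    """
    end = first_in_sample
    while end + oos_len <= n_bars:
        yield range(0, end), range(end, end + oos_len)
        end += oos_len

# Assumption: enough data after the first in-sample window for 5 OOS steps.
splits = list(anchored_walk_forward(n_bars=75_000 + 5 * 1_440,
                                    first_in_sample=75_000,
                                    oos_len=1_440))

for in_sample, oos in splits:
    print(f"in-sample bars 0..{in_sample.stop - 1}, "
          f"OOS bars {oos.start}..{oos.stop - 1}")
```

Each step re-optimizes on the in-sample range and then records the profit on the following OOS range, which is where the negative results above show up.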

I tried to clone this thread on the Adaptrade Google group, but it didn't show up on the list. Most likely M. Bryant (the Adaptrade Builder/MSA seller), who is an administrator of that group, just blocked it. LOVE IT!

Another post of mine, saying that one of the most important features of this tool, the use of custom indicators, is useless, was also blocked.

Any comments?

Krzysztof

Attachments: df.jpg (58.0 KB), wfo.jpg (381.8 KB)
  #4 (permalink)
 T993 
Sweden
 
Experience: Advanced
Platform: Ninja trader
Trading: Cl
 
Posts: 2 since Jan 2011
Thanks: 0 given, 0 received

(Translated from Swedish)

1. Why add up the values the indicators use? Shouldn't it be enough to take into account the indicator with the largest value?

2. What if one trades only chart patterns? Should their size be taken into account, i.e. the number of periods they span? These can also vary, just like the number of periods for those who use indicators. What applies in that case?

Regards, Stefan

  #5 (permalink)
krzysiaczek99
stockholm sweden
 
 
Posts: 9 since Sep 2012
Thanks: 1 given, 2 received


T993 wrote (translated from Swedish):
1. Why add up the values the indicators use? Shouldn't it be enough to take into account the indicator with the largest value?

2. What if one trades only chart patterns? Should their size be taken into account, i.e. the number of periods they span? These can also vary, just like the number of periods for those who use indicators. What applies in that case?

Regards, Stefan

Can you answer in English? I don't know Swedish. Besides, this is an English-speaking forum.

Krzysztof



Last Updated on January 1, 2013

