how much data do I need


Discussion in Traders Hideout


  #1 (permalink)
 mcteague 
New York NY USA
 
Experience: Intermediate
Platform: esignal, thinkorswim,
Trading: Stocks
Posts: 122 since Oct 2012
Thanks Given: 63
Thanks Received: 35

How much data do I need for reliable statistical or probabilistic conclusions regarding strategies?
I did some testing on a strategy and was kind of shocked at how much the results varied when I changed bar times. So I wanted to know how much data I need to have reliable results. I thought a year would be enough. Or is there just too much variance for that to ever be really useful?
In addition to testing strategies, there are other more basic questions I would like to answer. For example, I trade the ES. If I am up 1 point, should I cash in or try for more? This should be something that can be known from the data. Maybe by not taking the correct statistical action I am just throwing money away. How much data do I need to be sure?

I use ninjatrader, which may not really be appropriate for the kind of analysis I am looking to do. If I need something else, please let me know. Also, if most of the statistical work has already been done and is known, or there are templates for spreadsheets, I am more than willing to use them. I don't need to reinvent the wheel.

Thanks

  #3 (permalink)
 
 mokodo 
Bridgwater, UK
 
Experience: Beginner
Platform: Ninjatrader
Broker: MB Trading
Trading: Forex
Posts: 385 since Jun 2011
Thanks Given: 525
Thanks Received: 348




"Reliable statistical or probabilistic conclusions" do not exist. Reliable suggests that nothing in the future will be outside the extremes of your test data, and of course you cannot know that. You can counteract this to a degree by using as much data as possible, but it still does not overcome the problem.

Take a look at this webinar for a pro's insights into strategy/backtesting development.

Webinar: Creating an Algorithmic Trading System

Happy trading to you

know thyself
  #4 (permalink)
 kevinkdog   is a Vendor
 
Posts: 3,664 since Jul 2012
Thanks Given: 1,892
Thanks Received: 7,359

I personally recommend using as much good data as you can (10 years has worked well for me). This will also allow you to test your system in multiple volatile markets, quiet markets, bull, bear and flat markets, etc.

Some people will disagree, and say "use only 1 year or 2 of data, since markets are much different now than 10 years ago." True, but markets will be different 2 years from now, also. A robust approach, built on lots of historical data, will be more likely to handle these future different markets, hopefully.

Also, more data means more trades. More trades in your sample reduce the chance that your results were due to pure luck.
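To put a rough number on the "pure luck" point, here is a minimal Python sketch that computes a Wilson score interval for an observed win rate. It is only an illustration: the function name `win_rate_interval` and the 60% win rate are made up for the example, and the calculation assumes trades are independent of each other, which real trades often are not.

```python
import math

def win_rate_interval(wins, trades, z=1.96):
    """Wilson score interval for an observed win rate (z=1.96 is roughly 95%).

    Assumes each trade is an independent win/loss outcome.
    """
    if trades <= 0:
        raise ValueError("trades must be positive")
    p = wins / trades
    denom = 1 + z * z / trades
    center = (p + z * z / (2 * trades)) / denom
    half = z * math.sqrt(p * (1 - p) / trades + z * z / (4 * trades * trades)) / denom
    return center - half, center + half

# Hypothetical example: the same 60% observed win rate at different sample sizes.
for n in (30, 100, 1000):
    low, high = win_rate_interval(round(0.6 * n), n)
    print(f"{n:5d} trades: 60% observed, plausible range {low:.1%} to {high:.1%}")
```

With 30 trades the range still includes a 50% win rate, so the result cannot be told apart from a coin flip. With 1,000 trades the range is only a few percentage points wide.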

The biggest disadvantage to using a lot of data is that most strategies can't survive the test. Being good over a year is a lot easier to find than being good over 10 years. So, many people take the easy road, and end up losing.

As for "should I cash in after 1 pt," that is something you should test. For your approach it might improve things, but it might not.


Good Luck!

  #5 (permalink)
 mcteague 
New York NY USA
 
Experience: Intermediate
Platform: esignal, thinkorswim,
Trading: Stocks
Posts: 122 since Oct 2012
Thanks Given: 63
Thanks Received: 35

Thanks for the link.
However, I would question your definition of reliable. It does not mean infallible. Reliable results (or "useful" results, perhaps a word you would prefer) should fall within some standard range. When they do not, either an unrecognized important condition has changed or the original conclusions were based on insufficient data.
In the case of the results differing greatly when I move from 5 minute to 30 minute bars, I assume it is the latter because the other conditions are basically the same.

I am trying to take what we might call the sabermetric approach to trading, although my math is hardly sufficient for the task. But I do want to see what the data can tell me.

Cheers




  #6 (permalink)
 
 mokodo 
Bridgwater, UK
 
Experience: Beginner
Platform: Ninjatrader
Broker: MB Trading
Trading: Forex
Posts: 385 since Jun 2011
Thanks Given: 525
Thanks Received: 348



I'm not knocking the process of coding/backtesting, etc. I do it myself and it is a long, painful process with a great many dead ends. The question is how useful it all turns out to be in reality. I have found that Monte Carlo testing is a good way to explore the worst case scenarios that a system may throw up, once you have something showing promise. That way you can at least sketch a picture of the risks. This is a free one which I found very helpful.

Equity Monaco | TickQuest Inc.
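For anyone who wants to see the idea in code, below is a minimal Python sketch of one common form of Monte Carlo testing: resampling a list of trade results with replacement and looking at the spread of maximum drawdowns. It is not a description of how Equity Monaco works internally. The function names and the trade list are hypothetical, and resampling this way assumes trades are independent, so it ignores streaks caused by market regime.

```python
import random

def max_drawdown(pnls):
    """Largest peak-to-trough drop of the cumulative P&L curve."""
    equity = peak = worst = 0.0
    for pnl in pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst

def monte_carlo_drawdowns(trade_pnls, runs=2000, seed=0):
    """Resample trades with replacement; return each run's max drawdown, sorted."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(trade_pnls)
    results = []
    for _ in range(runs):
        sample = rng.choices(trade_pnls, k=n)
        results.append(max_drawdown(sample))
    return sorted(results)

# Hypothetical per-trade results in dollars, made up for illustration.
trades = [50, -25, 50, 50, -25, -25, 50, -100, 50, 50, -25, 50]
dds = monte_carlo_drawdowns(trades)
print("historical max drawdown:", max_drawdown(trades))
print("median simulated drawdown:", dds[len(dds) // 2])
print("95th percentile simulated drawdown:", dds[int(0.95 * len(dds))])
```

The historical drawdown is a single sample. The upper percentiles of the simulated runs give a rough sense of how much worse the same set of trades could have looked in a different order or mix.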

I suppose my point about past data and how useful it turns out to be is really about expectations. There is a common assumption that what happens in the past creates the boundaries for what can happen in the future and many traders fall into the trap of optimising on that basis. They optimise their capital and expectations - with little or no redundancy allowed to accommodate things that did not happen in the past. This is like saying the tallest mountain in the world can be no bigger than the tallest mountain you have seen. Ask yourself, is it better to have a potentially wrong or misleading map, or no map at all?

My experience with strategy development is that it ended up being very helpful for gaining an understanding of the 'character' of a particular strategy, and that gave me the confidence to take the approach live as a discretionary trader.

know thyself
  #7 (permalink)
 mcteague 
New York NY USA
 
Experience: Intermediate
Platform: esignal, thinkorswim,
Trading: Stocks
Posts: 122 since Oct 2012
Thanks Given: 63
Thanks Received: 35

I just wanted to follow up by saying that I watched the video and learned a lot. So thanks again.
I am not yet doing this to develop automated trading systems. Although I think like a lot of people I look at some of the overnight moves in the ES and think it would be nice to grab a little slice while sleeping.

Mostly I am trying to test what I do manually, or things I have heard or read. In the computer age it seems inexcusable not to test your setups statistically. I am still rather shocked, though, when making what I think is a small change in a stop strategy changes the success rate from 78% to 43%. Or when I tested one that had a success rate over 90% and still lost money.
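The 90% case is less surprising once the average win and average loss are put next to the win rate. Here is a minimal Python sketch of the arithmetic. The 1-point target and 10-point stop are hypothetical numbers chosen for illustration, not figures from the strategy described above.

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Average result per trade; avg_loss is passed as a positive number."""
    return win_rate * avg_win - (1 - win_rate) * avg_loss

def breakeven_win_rate(avg_win, avg_loss):
    """Win rate at which expectancy is exactly zero."""
    return avg_loss / (avg_win + avg_loss)

# Hypothetical setup: 1-point average win, 10-point average loss.
print(f"expectancy at 90% wins: {expectancy(0.90, 1.0, 10.0):+.2f} points per trade")
print(f"breakeven win rate: {breakeven_win_rate(1.0, 10.0):.1%}")
```

With these made-up numbers a 90% win rate loses about a tenth of a point per trade, and the setup needs to win roughly 91% of the time just to break even, before commissions and slippage.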

I am going to use larger data sets going forward. But a year was not a small amount of data. Especially for 5 minute bars. Even if the larger set is better, it suggests to me that the variance is always going to be extremely high. Well I am not really experienced enough to conclude that. But trying to find lower variance needs to be a major part of system design whether automated or manual.

Cheers





Last Updated on January 12, 2013

