NexusFi: Find Your Edge



 





Statistical significance


Discussion in Traders Hideout





 

  #1 (permalink)
 
 Zwaen 
Netherlands, Blaricum
 
Experience: Intermediate
Platform: Excel, Python, R
Broker: IB
Trading: Options
Posts: 250 since Dec 2010
Thanks Given: 848
Thanks Received: 238

Suppose you have a database of trades A.

Suppose you want to use a simple "filter" to improve the statistics. Let's say this is a simple filter which in your opinion is valid, e.g. don't take trades after time y. The reason is that after time y, price does not have enough time/opportunity to travel to your desired target (you don't hold overnight, and there is no exceptional volatility). Another example would be when you want to analyze whether your stop loss and profit target can be made dependent on current volatility. Of course, the reason could be anything, as long as the trader can suspect there is some causality or logical thinking behind the rationale for using the filter.

We don't want to use too many filters, since more filters means more chance of curve fitting.
What number of trades B in dataset A is sufficient to be statistically significant? And if this depends on the number of trades in database A, how would you work out that relationship?

At the moment I use a rule of thumb of around 30 observations (the law of diminishing returns from statistics). So there needs to be a minimum of around 30 (= B) observations for the filter to be of any significant value. I also look at scatterplots/histograms of the data and use common sense (assuming I have some) to ascertain whether the difference is statistically significant. But I wonder whether others use a more methodical approach to this?
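
To put the 30-observation rule of thumb in perspective, here is a minimal Python sketch. It assumes each trade is an independent win/lose outcome and uses the normal (Wald) approximation for a 95% interval around an illustrative 40% win rate. The helper name `win_rate_interval` and the sample sizes are made up for illustration, not taken from any real trade database.

```python
import math

def win_rate_interval(wins, trades, z=1.96):
    """Approximate 95% normal (Wald) interval for a win rate.

    Assumes independent win/lose trades; a rough guide only.
    """
    p = wins / trades
    se = math.sqrt(p * (1 - p) / trades)  # standard error of a proportion
    return p - z * se, p + z * se

# Same 40% win rate observed on samples of different size (illustrative)
samples = [(12, 30), (40, 100), (160, 400)]
intervals = {n: win_rate_interval(w, n) for w, n in samples}

for n, (low, high) in intervals.items():
    print(f"n={n:4d}  win rate 0.40  95% interval: {low:.3f} to {high:.3f}")
```

The interval narrows only with the square root of the number of trades, so 30 trades still leaves a wide band around the observed win rate. Whether that band is tight enough depends on how large a difference the filter is supposed to detect.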

One of my worst enemies are my own false assumptions
  #3 (permalink)
 
 ericbrown 
Tulsa, OK
 
Experience: Advanced
Platform: Tradestation, TOS, Python
Broker: IQFeed, Tradestation, TOS
Trading: ES, SPY, Options
Posts: 201 since Jan 2011
Thanks Given: 339
Thanks Received: 258


A rule of thumb of 30 observations helps with the Central Limit Theorem issues, but it doesn't prove statistical significance.

I would suggest starting here: https://www.csulb.edu/~msaintg/ppa696/696stsig.htm

From that page, you need to answer these two questions:


Quoting 
1) what is the probability that the relationship exists;
2) if it does, how strong is the relationship

To answer those questions, you'll need to determine how best to study the relationship between the number of trades "B" and the number of trades in your database "A".

  #4 (permalink)
 
 Zwaen 



Interesting link, thanks! I will certainly read this.

  #5 (permalink)
 
 ericbrown 



Welcome.

There's a lot on that page to take in. I use quite a few of those analysis techniques in my own work. I'm not that great at stats, but I know some of the basics, so feel free to ask questions.

  #6 (permalink)
 
 Zwaen 





Hi Eric,

Thanks! I did the following calculation and wondered whether my assumptions and calculations are correct, or whether I am making some mistakes.

I will use simple numbers. Reality is of course not so simple, but to illustrate the idea, I either win or lose a fixed amount (the target is always t and the loss is always l, but these are not relevant for the calculations).

Suppose you have a set of 200 trades (set A) which has positive EV. You want to evaluate whether filter B is relevant. Filter B contains 25 trades and has negative EV. We want to know whether the 25 trades are significant, so we compare the two distributions.

Set A:
200 trades
80 trades are closed at target.
120 trades are stopped out.
P(closed at target) = 80/200 = 0.40
P(stopped out) = 120/200 = 0.60

Set B (filter):
25 trades
5 trades are closed at target.
20 trades are stopped out.
P(closed at target) = 5/25 = 0.20
P(stopped out) = 20/25 = 0.80

If set B had the same statistics as set A, the expected distribution of B would be:
Number of trades closed at target = 0.40 * 25 = 10
Number of trades stopped out = 0.60 * 25 = 15

Then
(5-10)^2/5 = 5.0
(20-15)^2/20 =1.25
Sum= 6.25

Degrees of freedom =(2-1)*(2-1)=1

Then I see in the table at https://sites.stat.psu.edu/~mga/401/tables/Chi-square-table.pdf
that 6.25 lies between the critical values for 2.5% and 1%, so can I say, with a 97.5-99% chance of being right, that these distributions are significantly different?!
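
As a cross-check on the arithmetic above, here is a minimal Python sketch of the goodness-of-fit calculation. One thing to note: the textbook Pearson statistic divides each squared difference by the expected count (10 and 15 here), whereas the sums above divide by the observed counts (5 and 20). The function names `chi_square_gof` and `p_value_df1` are my own, and the p-value shortcut via `math.erfc` only holds for 1 degree of freedom.

```python
import math

def chi_square_gof(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with df = 1.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)), valid for df = 1 only.
    """
    return math.erfc(math.sqrt(stat / 2))

observed = [5, 20]                  # set B: at target, stopped out
expected = [0.40 * 25, 0.60 * 25]   # set A's proportions applied to 25 trades

stat = chi_square_gof(observed, expected)
p = p_value_df1(stat)
print(f"chi-square = {stat:.3f}, p = {p:.4f}")

# The figure from the post (dividing by observed counts), for comparison
p_post = p_value_df1(6.25)
print(f"p for 6.25 with df=1 = {p_post:.4f}")
```

With the expected counts in the denominator the statistic comes out near 4.17 rather than 6.25, which falls between the 5% and 2.5% critical values for df = 1, so the result is somewhat weaker than the table lookup above suggests.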

  #7 (permalink)
 
 ericbrown 



Not having done the calculations myself with your data, I can't say with certainty that this is correct...but at a quick glance I can't see anything wrong.

Regarding interpretation, you are testing that the distributions are different. Your null hypothesis is that they are the same or similar.

With your data, the Chi-Square of 6.25 is greater than the critical value at p = .025 for df = 1, therefore you can reject the null hypothesis that the distributions are the same (with a 2.5% probability of error). You can't really say they are significantly different...you can only say that the null hypothesis is rejected, which in your case is what you want to see.
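
An alternative way to frame the same test, sketched in Python under one loud assumption: the 25 filtered trades are a subset of the 200 trades in set A, so the comparison is filtered trades versus the remaining 175 (75 at target, 100 stopped out) in a 2x2 table. The function name `chi_square_2x2` is my own, no continuity correction is applied, and the phi coefficient is included as one rough answer to the "how strong is the relationship" question.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Assumption: set B is contained in set A, so "rest" = A minus B
wins_b, losses_b = 5, 20
wins_rest, losses_rest = 80 - 5, 120 - 20

n_total = wins_b + losses_b + wins_rest + losses_rest
stat_2x2 = chi_square_2x2(wins_b, losses_b, wins_rest, losses_rest)
p_2x2 = math.erfc(math.sqrt(stat_2x2 / 2))   # upper tail, df = 1 only
phi = math.sqrt(stat_2x2 / n_total)          # effect size for a 2x2 table

print(f"chi-square = {stat_2x2:.3f}, p = {p_2x2:.4f}, phi = {phi:.3f}")
```

Under these assumptions the null hypothesis is still rejected at the 5% level, but the phi value of roughly 0.15 points to a fairly weak association, which is worth keeping in mind before building a rule on 25 trades.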





Last Updated on September 3, 2014

