Learning statistical analysis: Step by Step


Discussion in Traders Hideout


  #1 (permalink)
 
 jackbravo 
SF, CA/USA
 
Experience: Beginner
Platform: SC
Broker: Stage 5
Trading: NQ...uh..ES actually
Posts: 1,337 since Jun 2014
Thanks Given: 4,362
Thanks Received: 2,400

Have you wondered how people generate those numbers on market data, like 56% of Mondays closed up in 2010? Yea, me too.

One of my goals this year is to learn statistical analysis techniques to see if I can use statistics to support my discretionary trades. While I have a general concept of probability, I don't know much about specific statistics or how to go about generating that data. So in this thread, I will document my attempts to understand both statistical concepts and techniques in a step-wise fashion, which will hopefully be beneficial to others as well.

Any comments, feedback, help, or guidance is appreciated!

Happy 2018!

"It does not matter how slowly you go, as long as you do not stop." Confucius
  #3 (permalink)
 
 jackbravo 


I read Adam Grimes's book, "Quantitative Analysis of Market Data: A Primer." It's a very short book, but it has a great overview of market math for beginners like me. One concept he talks about is standardizing the price changes of a security relative to its own price, in order to really figure out how it moved. There are two similar ways to do that:

1. Percent returns
2. Log returns

Percent returns are calculated as:
(Price today / Price yesterday) - 1

Log returns are calculated as:
ln(Price today / Price yesterday), where ln is the natural log

One of the main differences between the two is how you can manipulate them. Percent returns cannot simply be added, because a 10% loss followed by a 10% gain does not net out to 0 (it leaves you down 1%). However, log returns can be added together to get an overall picture of true returns (and the effects of volatility), according to my understanding.
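As a quick check of the point about adding returns, here is a minimal Python sketch. The three closing prices are made up for illustration (a 10% loss followed by a 10% gain); they are not SPY data.

```python
import math

# Hypothetical closing prices: a 10% loss followed by a 10% gain.
closes = [100.0, 90.0, 99.0]

# Day-over-day percent (simple) returns and natural-log returns.
pct_returns = [closes[i] / closes[i - 1] - 1 for i in range(1, len(closes))]
log_returns = [math.log(closes[i] / closes[i - 1]) for i in range(1, len(closes))]

total_pct = closes[-1] / closes[0] - 1   # actual return over the whole period
sum_pct = sum(pct_returns)               # naive sum of percent returns
sum_log = sum(log_returns)               # sum of log returns

print(f"actual period return:     {total_pct:.4%}")
print(f"sum of percent returns:   {sum_pct:.4%}")
print(f"exp(sum of log returns)-1: {math.exp(sum_log) - 1:.4%}")
```

The summed percent returns come out at roughly zero even though the position lost money, while the summed log returns convert back to the actual period return through exp(sum) - 1.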

To learn how to do this, I:
1. Downloaded daily data for SPY for the year 2010 into Notepad.
2. Pasted the comma-delimited data into Excel. Excel split it into columns automatically when I selected the correct delimiter (a comma).
3. Created two columns, using closing prices for the calculations. The first column is percent returns, and the second column is log returns.




For the yearly percent return, I used the closing price on the last day of the year and the closing price on the first day of the year (maybe I should have used the opening price on the first day). For log returns, I summed all the log returns in the column; taking the log of the last close over the first close also works and gives the same result. What's interesting is that from close to close, SPY gained 10.96%, but the summed log returns came to only 4.52%. That gap is not a volatility effect: the sum of log returns should convert back to the same 10.96% gain. The 4.52% is what a base-10 log gives, since log10(1.1096) ≈ 0.0452, and Excel's LOG function defaults to base 10. With the natural log (LN), the sum is ln(1.1096) ≈ 0.1040, or 10.40%. If these calculations are incorrect, I'd appreciate any feedback.
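The two figures quoted above (10.96% and 4.52%) can be checked against each other with a few lines of Python. This is only a sanity check on the quoted numbers, not a reproduction of the spreadsheet; the idea that the spreadsheet used Excel's LOG without a base argument is an assumption.

```python
import math

gross = 1.1096  # 1 + the 10.96% close-to-close gain quoted in the post

natural = math.log(gross)    # natural log, the equivalent of Excel's LN
base10 = math.log10(gross)   # base-10 log, what Excel's LOG returns by default

print(f"ln(1.1096)    = {natural:.4f}")
print(f"log10(1.1096) = {base10:.4f}")
print(f"exp(ln) - 1   = {math.exp(natural) - 1:.4f}")
```

The base-10 value lines up with the 4.52% in the post, while the natural log gives about 10.40% and converts back to the 10.96% simple return.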

"It does not matter how slowly you go, as long as you do not stop." Confucius
  #4 (permalink)
Ozquant
Brisbane Queensland Australia
 
Posts: 220 since Aug 2017
Thanks Given: 167
Thanks Received: 380

It's so much easier to do that in charting software that has coding facilities. Write the code once and you can apply it to any data set easily. That said, I do use Excel for stock balance sheet analysis; it's a very handy tool, and if you've got VBA skills, they carry across to many charting packages as well. I think if you don't do at least some quantitative analysis, you are in danger of going the way of the dinosaurs in this game. There's lots of chatter about AI and machine learning, but I still think the human brain will lead the way for a few more years. I just recently got into machine learning, and I think a lot of it is advanced curve fitting with minimal, if any, real reflection of price action. We've still got a ways to go before thinking is redundant, IMO.

  #5 (permalink)
 iantg 
charlotte nc
 
Experience: Advanced
Platform: My Own System
Broker: Optimus
Trading: Emini (ES, YM, NQ, etc.)
Posts: 408 since Jan 2015
Thanks Given: 90
Thanks Received: 1,147


jackbravo View Post
Have you wondered how people generate those numbers on market data, like 56% of Mondays closed up in 2010? Yea, me too.

One of my goals this year is to learn statistical analysis techniques to see if I can use statistics to support my discretionary trades. While I have a general concept of probability, I don't know much about specific statistics or how to go about generating that data. So in this thread, I will document my attempts to understand both statistical concepts and techniques in a step-wise fashion, which will hopefully be beneficial to others as well.

Any comments, feedback, help, or guidance is appreciated!

Happy 2018!

Statistics are a great tool, and there are a ton of ways to use them to help improve your trading. I work in a quantitative field by trade and use advanced statistics all the time, but in terms of how I have applied them to trading, I found that what really helped me was a quite simple application: calculating the house edge vs. my edge.

Here is how it works: Start with the house edge calculation.

1. You give up the spread if you use market orders. So if nothing else is in play, your odds of losing are 2 to 1.
2. If you use a limit order for your profit target and a market order for your stop loss, your odds of getting filled on your stop loss at first touch are 100%, while your odds of getting filled on your profit target at first touch are very low, so it may take 2-3 touches on average. The further away your profit target is, the longer your order will have been waiting in the queue by the time price reaches it, so the probability of a fill on touch increases. Depending on your profit target, you can figure out a good baseline model. For a simple example, it may look something like this:

1 tick PT: full pass-through most likely
2 tick PT: 3 to 4 touches or full pass-through
3 tick PT: 2 to 3 touches or full pass-through
etc. Eventually, with a really high target, you will be more towards the front of the queue, but you have to factor this in.

Here is a simple statistical model to illustrate this type of built-in house edge against you. If you set a PT of 5 ticks and a SL of 5 ticks, use limit orders for entries, and assume that you somehow get 100% perfect fills on your entries, what are your odds of winning based only on your exits? Some would say 50% / 50%, but in reality this is more like a 30% probability of winning, because every stop loss gets hit and filled immediately, whereas your PT needs multiple touches or a pass-through.
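A rough way to see the direction of this asymmetry is a toy model. The assumptions here are mine, not the post's: price is a symmetric one-tick random walk, the stop always fills on touch, and the target fills only on a full pass-through (price trading one tick beyond it). The function name `hit_up_first` is hypothetical.

```python
from fractions import Fraction

def hit_up_first(up_ticks, down_ticks):
    """Probability that a symmetric one-tick random walk moves up_ticks up
    before it moves down_ticks down (the classic gambler's-ruin result)."""
    return Fraction(down_ticks, up_ticks + down_ticks)

pt, sl = 5, 5

# Case 1: the profit target fills as soon as price touches it.
touch_fill = hit_up_first(pt, sl)

# Case 2: the profit target fills only on a pass-through, so price
# effectively has to travel one extra tick in the winning direction.
pass_through = hit_up_first(pt + 1, sl)

print(f"win rate, fill on touch:        {float(touch_fill):.1%}")
print(f"win rate, fill on pass-through: {float(pass_through):.1%}")
```

In this toy model the win rate only drops from 50% to about 45%, so it does not reproduce the roughly 30% figure in the post; that number would need further assumptions about queue position and fills that are not spelled out there.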

With this simple example in mind, you can extend this to different PT vs. SL combinations and try to calculate your odds. You will find that there is a whole science to estimating the house edge, and every trader needs to know this.

So let's talk about quantifying your own edge. In order to determine if something has an edge to it, you need two samples to test it: one with the edge in place and one without it as a control. When you see an improvement in your edge test vs. your control test, you can calculate how much the edge helped. There are a number of ways to measure it, but most of these are very simple.

a. Did you increase your win / loss ratio?
b. Did you decrease your drawdowns?
c. Did you increase the number of trades that got filled?
d. Did you increase your average profit per trade?

You can easily quantify all of these, with every variable you add or subtract from your strategy.
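The edge-vs-control comparison described above can be sketched in Python as follows. The two P&L lists are made-up numbers for illustration, and `summarize` is a hypothetical helper, not part of any trading platform.

```python
def summarize(pnl):
    """Summarize a list of per-trade P&L values (ticks or dollars)."""
    wins = [x for x in pnl if x > 0]
    losses = [x for x in pnl if x < 0]

    # Max drawdown: largest drop of the running equity from its prior peak.
    equity, peak, max_dd = 0.0, 0.0, 0.0
    for x in pnl:
        equity += x
        peak = max(peak, equity)
        max_dd = max(max_dd, peak - equity)

    return {
        "trades": len(pnl),
        "win_rate": len(wins) / len(pnl) if pnl else 0.0,
        "win_loss_ratio": len(wins) / len(losses) if losses else float("inf"),
        "avg_pnl": sum(pnl) / len(pnl) if pnl else 0.0,
        "max_drawdown": max_dd,
    }

# Made-up per-trade results in ticks.
control = [2, -2, -2, 2, -2, 2, -2, -2]   # without the candidate edge
with_edge = [2, -2, 2, 2, -2, 2]          # with the candidate edge

base, test = summarize(control), summarize(with_edge)
for key in base:
    print(f"{key:>15}: control={base[key]:.3f}  with_edge={test[key]:.3f}")
```

With samples this small the differences mean very little; the point is only that each question in the list maps to a number you can compute for both samples and compare.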

You can test various entry systems to determine:

a. Did I get filled on touch or on pass-through, if this was a limit order?
b. Did I end the first bar of the trade flat, up, or down?
c. Am I filtering too many trades out, thus not taking enough trades in a day to hit my goal?


You can test various exit systems to figure out:

a. Did my PT vs. SL have a good expectancy?
b. How much slippage did my system have?
c. Did I cut my trade short instead of capturing additional profits?
d. What was my MAE / MFE for my trades?
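Two of the exit questions above, expectancy and MAE / MFE, have simple standard definitions. Here is a minimal Python sketch; the function names, the `cost_ticks` parameter, and the sample prices are all hypothetical.

```python
def expectancy(p_win, profit_ticks, loss_ticks, cost_ticks=0.0):
    """Average result per trade in ticks: wins minus losses minus fixed costs
    (for example commissions and slippage expressed in ticks)."""
    return p_win * profit_ticks - (1.0 - p_win) * loss_ticks - cost_ticks

def mae_mfe(entry, prices, side=1):
    """Maximum adverse / favorable excursion for one trade.

    entry  -- entry price
    prices -- prices observed while the trade was open
    side   -- +1 for a long trade, -1 for a short trade
    Both values are returned as non-negative price distances.
    """
    if not prices:
        return 0.0, 0.0
    excursions = [side * (p - entry) for p in prices]
    mae = max(0.0, -min(excursions))   # worst move against the position
    mfe = max(0.0, max(excursions))    # best move in favor of the position
    return mae, mfe

# A 5-tick target and 5-tick stop at a 30% win rate, before costs.
print(expectancy(0.30, 5, 5))

# A long trade entered at 100.00 that saw these prices before exit.
print(mae_mfe(100.0, [100.25, 99.5, 101.0], side=1))
```

A large MFE on trades that were closed for a small profit is one way to spot the "cut my trade short" problem in item c.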


So these are the key questions you should focus on, and everything is measurable and quantifiable. Eventually you can start to blend various parts of your system together to calculate your cumulative edge vs. the market's house edge, to see if it is positive or negative overall. Once you find something that is logically and mathematically viable, from there it is all about infrastructure and execution.

I could give further information if you have any questions, but I hope this gives you some general ideas to get started with.

Good Luck!

Ian

  #6 (permalink)
 
 Popsicle 
Pretoria Gauteng
 
Experience: Intermediate
Platform: Sierra Charts
Trading: NQ
Posts: 250 since May 2016
Thanks Given: 2,448
Thanks Received: 550

This is something that I decided to look into deeper in the last month or so too.

I have been following https://metricsmaestro.wordpress.com/ for quite a while; he tracks statistics mainly for the ES and SPY. There are some very interesting statistics there. What I also like about the way he tracks statistics is that he shows the practical application for the stats too: https://metricsmaestro.wordpress.com/category/playbook/.

The approach I decided to take is to start with one instrument and build a base from there. I will definitely contribute to the thread when I have something useful.

  #7 (permalink)
 
 jackbravo 

@Ozquant - thanks for your input. I trade using Sierrachart, so it's programmable. But I don't know how to ask my questions mathematically, or how to statistically analyze the answers. I'm really trying to build from Step 1 at this point, which is how to transform my question into an equation I can program into Sierrachart. I'm using Excel because it simulates SC and is less cumbersome.

@Popsicle - thanks for the link. I'll save that info to look at. It'd be great if he could show how he actually got those numbers; that's really what I'm trying to figure out at this point: not the results, but the mechanics.

@iantg - thanks for your extensive write-up. I'm going to go through it step by step. I have a lot of questions for you, and I will ask them as I work through your list.

So for the House Edge calculation:

iantg
1. You give up the spread if you use market orders. So if nothing else is in play your odds of losing are 2 to 1.

How do I figure this out? I drew a simulation of buying with a market order. It seems that there are 3 ways to lose money and one way to win money, which would be odds of 3:1 against you.



What do you think?

"It does not matter how slowly you go, as long as you do not stop." Confucius
  #8 (permalink)
 iantg 

Jackbravo,

Thanks for following up and looking into this. I think there are a number of fair ways to get to this, and your method is certainly not incorrect by any means. The way that I build my edge calculations is to run each variable independently. So my assessment that you have 2 to 1 odds of losing is just setting the bet line with the following assumptions. (I didn't mention these previously.)

Here are the assumptions and the betting line.

Profit = 2 ticks
Loss = 2 ticks

                           Odds of Winning   Odds of Losing
Flat Entry                 50%               50%
Entry with 1 tick profit   75%               25%
Entry with 1 tick loss     25%               75%

So because you give up the spread, you typically end up with 1 tick against you from the start. So your odds of hitting a 2 tick loss before you hit a 2 tick profit are 2 to 1. Now if your PT / SL targets were different, you would see different odds / betting lines.
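One way to arrive at the percentages in the betting line above is the gambler's-ruin formula, assuming price behaves like a symmetric one-tick random walk and that the 2-tick profit and loss levels are measured from the reference price rather than from the fill. That assumption is mine; the post does not say how the numbers were derived. The function name `win_probability` is hypothetical.

```python
from fractions import Fraction

def win_probability(start_offset, profit_ticks, loss_ticks):
    """Chance of reaching +profit_ticks before -loss_ticks (both measured from
    the reference price) when the trade starts start_offset ticks away from it,
    assuming a symmetric one-tick random walk."""
    up = profit_ticks - start_offset    # ticks still needed to reach the target
    down = loss_ticks + start_offset    # ticks of room left before the stop
    if up <= 0:
        return Fraction(1)              # already at or beyond the target
    if down <= 0:
        return Fraction(0)              # already at or beyond the stop
    return Fraction(down, up + down)

rows = [
    ("Flat Entry", 0),
    ("Entry with 1 tick profit", 1),
    ("Entry with 1 tick loss", -1),
]
for label, offset in rows:
    p = win_probability(offset, 2, 2)
    print(f"{label:<26} win {float(p):.0%}  lose {float(1 - p):.0%}")
```

Starting one tick down, price needs 3 ticks up to win but only 1 tick down to lose, which gives 1/4 vs. 3/4. Expressed as odds, that 25% / 75% row is 3 to 1 against.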

Every other aspect of the house edge, such as the commissions that hit you regardless of whether you win or lose, is applicable in calculating the overall house edge. But I keep each part separate, because if I change the PT / SL targets, for example, it only changes this aspect while the other parts stay constant.

There is no real right or wrong way to do this, but I typically try to break everything down as granularly as possible, so that if and when I change things I can see the impact of that one change independently of everything else.

Hope it helps.

  #9 (permalink)
 
 jackbravo 


iantg
Profit = 2 ticks
Loss = 2 ticks

                           Odds of Winning   Odds of Losing
Flat Entry                 50%               50%
Entry with 1 tick profit   75%               25%
Entry with 1 tick loss     25%               75%

I guess I don't know how the odds are calculated in the first place. I drew a table of my understanding so far. I get that it's a 50/50 shot from flat to +2/-2, but how do you calculate the odds at -1 tick starting? Thanks for any help!


"It does not matter how slowly you go, as long as you do not stop." Confucius
  #10 (permalink)
 
 wldman 
Chicago Illinois USA
Legendary Market Wizard
 
Experience: Advanced
Broker: IB, ToS
Trading: /ES, US Equities/Options
Frequency: Several times daily
Duration: Hours
Posts: 3,507 since Aug 2011
Thanks Given: 2,046
Thanks Received: 9,491


Are you trying to develop a statistical trading model? I have/had one that I liked but I am no longer pointed at black/grey box and my time frame has expanded. BUT I'd revive it if you wanted to work on it. I think I can find the documents.

Dan





Last Updated on January 28, 2018

