Mostly as a note to myself, and at the risk of putting the cart before the horse, this post will briefly describe how to install RDotNet for MS Windows. I (think I) want RDotNet because I've been using R to pre-whiten time series (in this case OHLC bar close prices) in the R/S analysis project mentioned in the last post, and it would seem more convenient to invoke the required R routines from the main app if possible.

The 4-step RDotNet installation procedure is as follows:

1. If not already present, install the binary for R on the development system. Choose the x64 version if the development system is 64 bit. The latest version (2.15.2 at time of writing) can be obtained from The R Project for Statistical Computing, and the downloaded .exe file is simply run to install R. [R source is also available on the project site, although I've never tried to build it from scratch]

2. If newly installed, run R from the desktop icon (or otherwise) to test the installation, then exit.

3. Download the RDotNet R.DLL from R.NET - Home and place it on the development system, noting the path to the .DLL. [Source code for R.DLL is also available at the RDotNet site]

4. Write a "Hello World" app to test the configuration (example of an MS VS 2010 console app below, MS VS 2010 project attached as a .ZIP file)

Notes:

1. The lines in the test app referring to the environment were necessary in my case because I don't have the variable "RHOME" in the path for the console environment.

2. Add R.dll as a reference in the test project file before compiling.

3. Usual disclaimer re the attachment to this post: while the system on which the project was created has no network connection and is scanned routinely for malware (implying the .ZIP file & contents were likely originally malware-free), all bets are off once it transits the Interwebz. Rather than bother with e.g. PGP signing (cumbersome and IMO overkill in this case), I recommend scanning the .ZIP file before opening.

4. I will probably post the R/S analysis code as is (before interfacing it to R) as an MS VS 2010 project in the next day or 2 (the delay is because the GUI needs to be more user friendly), along with a few notes about the concepts involved and how to use the program.


I've attached the latest version ("0.9.0") of a platform to study Hurst exponents as an MS VS 2010 project file ("HurstExponent.zip"). The file ("Form1.cs") containing the meat and potatoes (the algorithms) can be extracted by itself from the project .ZIP file if desired.

A data file (text format) containing 235 EUR.USD close prices in the required format (referred to below) is attached for testing, if desired. Note that the data has NOT been prewhitened (see McKenzie's paper attached for a discussion). This is not a show stopper for testing purposes.

I've also attached a copy of 2 articles (apparently in the public domain) that explain the concepts, the first of which IMO is one of the better written papers on the topic (Michael D. McKenzie, “Non-Periodic Australian Stock Market Cycles: Evidence from Rescaled Range Analysis”, The Economic Record, 2001, vol. 77, issue 239, pages 393-406) based on Mandelbrot's "classical" R/S statistic. The 2nd article is Chapter 6 excerpted from "A Non-Random Walk Down Wall Street". Andrew W. Lo & A. Craig MacKinlay, Princeton University Press, 2001 (entire book in PDF format available apparently for free at Contents for Lo & MacKinlay: A Non-Random Walk Down Wall Street)

The 2nd article modifies the classical definition of the R/S statistic, a modification I may incorporate into a later version of the program.
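For reference, my reading of Lo's modification is that the classical standard-deviation denominator is replaced by one that adds weighted autocovariances up to a lag q, so short-range dependence is discounted. Here is a minimal Python sketch of that idea (my own illustration, not code from the attached program; with q = 0 it reduces to the classical statistic):

```python
import math

def modified_rs(x, q):
    """Lo's modified R/S: same range R as the classical statistic, but
    the denominator adds weighted autocovariances up to lag q to
    correct for short-range dependence (q = 0 gives the classical
    denominator)."""
    n = len(x)
    mean = sum(x) / n
    # cumulative deviations from the mean
    z, s = [], 0.0
    for v in x:
        s += v - mean
        z.append(s)
    r = max(z) - min(z)  # the range R
    # lag-j sample autocovariance
    gamma = lambda j: sum((x[i] - mean) * (x[i + j] - mean)
                          for i in range(n - j)) / n
    # variance plus Newey-West weighted autocovariances
    var = gamma(0) + 2 * sum((1 - j / (q + 1)) * gamma(j)
                             for j in range(1, q + 1))
    return r / math.sqrt(var)
```

With q = 0 this recovers the classical R/S value; larger q deflates the statistic when consecutive observations are positively correlated.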

What the HurstExponent program does

The program uses R/S analysis to calculate the Hurst exponent, fractal dimension, V-statistic and R/S expected values (for a random walk with the same sample size and time scales) associated with either a user-supplied time series or a pseudo random sequence generated by the program itself.

Basic R/S ("Range/Standard Deviation") approach is summarized in this excerpt from the attached book chapter by Lo et al.:
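For concreteness, a rough Python sketch of the classical calculation for a single window (an illustration only, not the program's C# code): take the cumulative deviations from the window mean and divide their range R by the window's standard deviation S.

```python
import math

def rescaled_range(x):
    """Classical R/S statistic for a window x: range of the cumulative
    mean-adjusted sums divided by the sample standard deviation."""
    n = len(x)
    mean = sum(x) / n
    # cumulative deviations from the window mean
    z, s = [], 0.0
    for v in x:
        s += v - mean
        z.append(s)
    r = max(z) - min(z)  # the range R
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # the std dev S
    return r / sd
```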

Why Do It

Traders sometimes use the term "fractal" when talking about time frames, including Barry Burns (author of the 5 energies system). Since, like a lot of traders, I use 3 time frame charts (short, medium and long) defined by 200-, 600- and 1800-tick bars (multiples of 3), I wondered

1. whether they are in fact fractals as BB implies;
2. if so, are there any fractal properties that can be exploited in a strategy beyond the (non-fractal) conventional use (i.e., seeing the bigger/smaller picture bracketing the setup chart);
3. in any event, is there anything magic about multiples of 3 and in general is there any basis for a better choice of bar interval multiple among the 3 charts; and,
4. is there anything fractal in the related bot thought pattern (red/green/blue/yellow beaded experimental indicator at the bottom of the following screenshot; Multicharts indicator to be published in a later post)?

As mentioned in a previous post, fractal dimension is simply related to the Hurst exponent, and R/S analysis is a means to calculate the Hurst exponent as well as a number of other interesting and potentially useful statistics. So in the first instance the answer to "Why Do It" is to study the fractal properties of financial time series for fun and profit.

That said, first results suggest that 200-, 600- and 1800-tick chart samples of EUR.USD close prices over the limited time frames I use exhibit only very slight fractal behaviour and probably should not be called fractals. This probably applies to the bot thought process as well. We will therefore likely need some other way of modelling what different time frames tell us.

However, R/S analysis of the data itself (in particular the V-Statistic) does indicate price action deviates significantly from random behaviour, which may startle some academics (believers in market efficiency) but comes as no surprise to traders. The usual question therefore remains: not so much "why do it" as "can a quantitative study of this deviation via R/S analysis improve a trader's edge?"

Program Operation

The R/S analyzer program interface is shown below after it first opens. At this point the user has 2 options as suggested by the instructions on the top/left of the GUI:

1. Open a file containing a time series via "File > Open..." and analyze it (file format described in the last section, "Input File Format"); or,
2. Generate a pseudo random sequence and analyze that.

The random sequence is controlled by 2 parameters, namely

1. Length (default value 262144, or 2^18); and,
2. Repetition Factor (default value 1).

The Length parameter is just the maximum potential number of samples in the pseudo random sequence to be analyzed. The Repetition Factor tells the program to construct such a sequence comprising <Length> samples by repeating the same random sub-sequence <Repetition Factor> times. (The length of the unique sub-sequence is therefore <Length> / <Repetition Factor> samples).
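The construction can be sketched as follows (a hypothetical Python helper, not the program's actual code; Python's random module stands in for whatever generator the program uses):

```python
import random

def make_sequence(length, repetition_factor, seed=0):
    """Build a pseudo random sequence of `length` samples made of
    `repetition_factor` copies of one random sub-sequence of
    length // repetition_factor samples."""
    sub_len = length // repetition_factor
    rng = random.Random(seed)  # seeded for repeatability
    sub = [rng.random() for _ in range(sub_len)]
    return sub * repetition_factor  # concatenate identical copies
```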

The program actually uses a length somewhat less (but no more than 10% less) than the specified maximum potential length to maximize the number of divisors of the sequence length, for reasons mentioned in McKenzie's paper. Input data may also be truncated according to the same algorithm.
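A plausible sketch of that truncation step in Python (the exact search and tie-breaking the program uses are not specified here; this version searches the lengths within 10% below the maximum and prefers the longest candidate among ties):

```python
def divisor_count(n):
    """Number of divisors of n, by trial division up to sqrt(n)."""
    count, i = 0, 1
    while i * i <= n:
        if n % i == 0:
            # i and n//i are both divisors (one divisor if i*i == n)
            count += 1 if i * i == n else 2
        i += 1
    return count

def best_length(max_len):
    """Length within 10% below max_len with the most divisors."""
    lo = int(max_len * 0.9)
    return max(range(lo, max_len + 1),
               key=lambda n: (divisor_count(n), n))
```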

Plot Windows

In what follows "N" is the length of the sub-window dividing the longer sequence according to the method, and Log(N) its natural logarithm.

After analysis the top plot window shows 4 statistics:

1. Log(R/S) vs Log(N) (black)
2. Regression line (blue) fitted to Log(R/S) vs Log(N) values (slope of the line is overall H parameter for the sequence)
3. V-Statistic (red), slope of which indicates tendency to persistence (positive slope) or anti-persistence (negative slope) of the data [see papers attached for discussion of persistence]
4. Expectancy (green), the expected Log(R/S) value if the sequence were perfectly random.
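As a sketch of how the first three of these fit together (illustrative Python only, assuming mean R/S values have already been computed per window length N): the slope of the least-squares fit of Log(R/S) on Log(N) is H, and the V-Statistic, as I understand Peters' definition, is (R/S)/sqrt(N).

```python
import math

def hurst_and_v(rs_by_n):
    """Given a dict {N: mean R/S value}, fit log(R/S) = H*log(N) + c
    by least squares and return H plus the V-statistic per N."""
    ns = sorted(rs_by_n)
    xs = [math.log(n) for n in ns]
    ys = [math.log(rs_by_n[n]) for n in ns]
    m = len(xs)
    xbar, ybar = sum(xs) / m, sum(ys) / m
    # least-squares slope = Hurst exponent H
    h = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
    # V-statistic: (R/S) / sqrt(N) for each window length
    v = {n: rs_by_n[n] / math.sqrt(n) for n in ns}
    return h, v
```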

Depending on which button ("Plot Data" or "Plot H") to the left of the bottom plot window was clicked last, after each analysis the bottom plot window shows either the input sequence vs sample number or the instantaneous Hurst exponent vs Log(N).

The following screenshots show Analyzer output for

1. 235 daily close prices for EUR.USD (stats in the top plot window, data in the second window)
2. same as the first screenshot except instantaneous H values plotted in the second window (selected by clicking "Plot H")
3. a pseudo random sequence 262144 samples in length with no deliberate repetition (i.e., Repetition Factor = 1)
4. similar to the 3rd screenshot except Repetition Factor set to 1000 (i.e., the sequence comprises approximately 1000 identical sub-sequences of length 262 samples)

1. Eur.Usd daily data showing stats & price data

2. Eur.Usd daily data showing stats & Hurst exponent

3. Pseudo random sequence, no repetition

4. Repeating pseudo random sequence comprising 1000 copies of a random sub-sequence 262 samples long

Saving Results

While there are methods to save results in the program, none is enabled in this initial release.

Input File Format

The time series is input as 4 columns of floating point numbers in comma-separated-value format (i.e., as a CSV file). At the moment the program assumes the data to be analyzed is in the 4th column of the file, meaning the first 3 columns can contain dummy data, since they are not used. The columns were originally intended to contain the following info:

Column 1: Bar Interval (ticks in my case, written as a decimal number; e.g., 5400.00)
Column 2: Date (in Multicharts native format written as a decimal number; i.e., "1YYMMDD.00")
Column 3: Time (written as a decimal number; i.e., "HHMM.00")
Column 4: Price (e.g., the close)
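A minimal Python sketch of a reader for this format (illustrative only; the program itself is C# and may parse differently) just takes column 4 and ignores the rest:

```python
import csv

def load_prices(path):
    """Read the 4-column CSV described above and return the 4th
    column (price) as floats; the first 3 columns may be dummies."""
    prices = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if len(row) >= 4:
                prices.append(float(row[3]))
    return prices
```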

Michael McKenzie, gentleman and scholar, Professor and Chair of Discipline at Sydney Business School in Australia (author of the first paper mentioned in the last post), was good enough to send me the data I asked for to reproduce his results.

It turns out Dr. McKenzie's data illuminated a glaring error in the program, which has been corrected but has yet to be posted. Anyone who has downloaded the code (possibly @serac and @mokodo) may therefore want to hold off using it until the corrections have been published here.

An MS VS 2010 project containing the corrected program is attached, along with a version of the daily All Ordinaries data set kindly provided by Dr. McKenzie as mentioned in the last post and used to benchmark the program.

When the benchmark data set is loaded it is prewhitened according to the autoregression filter described in Dr. McKenzie's paper (equation 9 in the paper, parameter values given on page 12) and results so far appear to be reasonably close to what he obtained. I should add that while Michael McKenzie provided the data this is not to suggest he knows of or endorses what I've done with it.
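For illustration, prewhitening with a fitted autoregression filter amounts to keeping only the residuals after subtracting the AR component. A generic Python sketch (the actual coefficient values come from equation 9 / page 12 of Dr. McKenzie's paper and are not repeated here; `coeffs` below is a placeholder):

```python
def prewhiten(x, coeffs):
    """Remove an AR(k) component: e[t] = x[t] - sum(a_j * x[t-j]).
    `coeffs` is the list [a_1, ..., a_k] of fitted AR coefficients
    (placeholder values; see the paper for the real ones). The first
    k samples are dropped since they lack a full set of lags."""
    k = len(coeffs)
    return [x[t] - sum(a * x[t - j - 1] for j, a in enumerate(coeffs))
            for t in range(k, len(x))]
```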

As an aside I stumbled across a commercial product yesterday at Hurst Exponent And Value At Risk that claims to implement R/S analysis (among other statistical analyses) in Excel for purposes mentioned on the web page. I have no affiliation with this site and this is not a recommendation; just to say among the claims is the suggestion R/S analysis might provide a means to distinguish between trend and chop--at least in retrospect.

In other news, I've managed to reduce the balance of my IB paper account to less than half while experimenting with the new volume profile and cumulative delta features of MC 8.5 beta. While blowing up a paper account experimenting may be forgivable, there are signs fooling around has had an impact on my still fragile self discipline, so I will likely spend the rest of the week paper trading (what is mainly) BB's 5 energies system to rebuild the account and to regain my composure.

For completeness here is an experimental MC PowerLanguage/Easylanguage indicator that implements the same basic Hurst exponent algorithm as the main project (MC .PLA file attached, code embedded in the post).

The indicator calculates R/S values for 3 user-specified time scales, fits a linear regression line to the log(R/S) vs log(N) values (where N = samples/time scale) and plots the slope of the line (i.e., the Hurst exponent), as well as individual/intermediate Hurst exponents inferred for each time scale if desired.

Notes:
1. the number of time scales was intended to be variable, whence the main loop ("for Value1 = 1 to nTs begin")
2. this code has not yet been gone over with a fine tooth comb for bugs--corrections welcome.
3. the issue remains that time scale standard deviation can be zero, more frequently for small time scales, which will cause the indicator to blow up since error checking is not done in the experimental implementation published here.

As planned, I spent most of the evening trading 1/2 lot spot EUR/USD on paper to rebuild the account and to commune with my inner trader, which mostly means playing Spider Solitaire and writing indicators while watching out of the corner of my eye for price to make a move. Managed to capture half of the transition from 00 to 50 between 22:30 and 23:15 EST after missing the start of it (too intent on Spider Solitaire at the time, which underscores the fact that self discipline has suffered lately).

Lately I've been noting more often a couple of counter trend patterns on the 600 tick chart that I want to learn how to trade (1. a retrace back to 50 after an abrupt downtrend from 00, possibly a bear flag, and 2. a subsequent sawtooth possibly due to HFT program trading), shown between the vertical cyan lines in the following screenshot.

Still rebuilding my paper account trading spot EUR/USD and USD/JPY, last night for the most part scalping for 8 ticks or so using relatively large order sizes rather than sitting around waiting for the long bomb (20-40 ticks) to set up. This means more frequent trades and higher commission costs, but allows me to squeeze up to 200 ticks (say) out of a session like last night with only an 80 or 100 tick range, since we're trading events that occur within +/- 0.5 standard deviations of what I suspect approximates a normal distribution (of bar lengths) over time.

I have yet to write the scalping strategy into my trading plan or try to optimize it probably because I'm still a slave to indicators and what I call scalping is likely more along the lines of what Al Brooks advocates. In any event it's practically effortless, almost painless, consistently profitable and may be the direction in which my trading is evolving.

My present focus is quantifying the scalping technique, and I plan to publish as results become available, stats hopefully on the weekend. As you know, a formal strategy depends on grasping what the stats are telling us, so a method and an automated strategy are a while off yet. Suspect I need to finish reading Al Brooks' latest 3 volumes.