STF discretionary spot Forex system development journal

November 11th, 2011, 06:28 PM

Thanks for the paper @sam028. It raises a few issues that should probably also be discussed here when the time comes, like choice of membership function (the authors use a Gaussian mapping, rather than triangular, and I plan to use trapezoidal rather than triangular since that's what's available in the 3rd party library). In the best of all possible worlds the shape of the membership function should be determined empirically but at this point I suspect any effect would be at best 2nd order.
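To make the shape question concrete, here is a minimal sketch comparing a trapezoidal and a Gaussian membership function. It is plain C# written for illustration and deliberately does not use the AForge.Fuzzy classes; the "Overbought" set and all its breakpoints are hypothetical values, not ones taken from the paper or from this system.

```csharp
using System;

// Illustrative only: hand-rolled membership functions, not the AForge.Fuzzy API.
public static class Membership
{
    // Trapezoid rising from a to b, flat at 1 from b to c, falling from c to d.
    public static double Trapezoid(double x, double a, double b, double c, double d)
    {
        if (x <= a || x >= d) return 0.0;
        if (x < b) return (x - a) / (b - a);
        if (x <= c) return 1.0;
        return (d - x) / (d - c);
    }

    // Gaussian bump centred on mean with width sigma.
    public static double Gaussian(double x, double mean, double sigma)
    {
        double z = (x - mean) / sigma;
        return Math.Exp(-0.5 * z * z);
    }

    public static void Main()
    {
        // Hypothetical "Overbought" set on a 0-100 Stochastics scale.
        for (double x = 60; x <= 100; x += 10)
            Console.WriteLine("{0,5:F0}  trapezoid={1:F2}  gaussian={2:F2}",
                x, Trapezoid(x, 70, 80, 100, 101), Gaussian(x, 90, 8));
    }
}
```

The two shapes agree near the middle of the set and differ mainly in the shoulders, which is consistent with the suspicion that the choice is a 2nd order effect.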

The authors also mention their "convergence" model (the means by which fuzzy values input to the inference engine are derived from observations) without giving details, describe how the inference engine processes propositions ("rules") without elaborating, and state how they combine & defuzzify proposition outcomes to generate orders (Mamdani min/max implication), again with no discussion, even though the choice of fuzzification method & implication operator can in theory have significant impact on the accuracy & behaviour of the system.

It's interesting to note their choice of indicators for input to the system (MACD, CCI, RSI, Bollinger Bands) accounts for momentum and cycle ("spectral") characteristics but IMO there is no real measure of trend or attempt at inter-time frame confirmation. Like most autotrading strategies no doubt the system works when the market cooperates but it tends to bring to mind Samuel Johnson's remark, to paraphrase: "...[a machine trading] is like a dog's walking on his hind legs. It is not done well; but you are surprised to find it done at all."

I spent the time since my last post dumping indicator outputs for EUR/USD 200, 600 & 1800 tick charts (various parameterizations of S- & EMAs, MACD & Stochastics) and their first and 2nd differences (slopes & accelerations) and staring at plots of them in Open Office Calc first to make sure they're usable, second to see how they interact and finally to define membership functions. My next post will likely be to summarize the membership functions and show how they're coded in Aforge.Fuzzy. After that the issue will be implementing the trading rules as "fuzzy propositions"--the meat & potatoes of any strategy.

Edited to add: a decent discussion (i.e., adequate, plain language, math doesn't make one's head spin, lots of examples) of fuzzy concepts is located here: PowerZone: Fuzzy Logic Module. It's for control applications rather than for trading but the principles are the same & are straightforward to extrapolate.

November 13th, 2011, 09:17 AM

Just a quick post to reflect on using frequency binning (histograms) to determine extrema and ultimately fuzzy membership functions (mappings of input values onto fuzzy sets, or collections of labels).

Histograms also provide a basic means to estimate a probability density function (PDF) and in general histograms of values of Close price produce non-unimodal distributions that e.g. @Fat Tails refers to as "time-Price opportunities" from which TWAP (time weighted average price) & POC (point of control) can be estimated. Histograms of SMAs of Close price are also in general non-unimodal, muted replicas of the Close price histogram, which might be expected since SMA values are essentially low pass filtered versions of price.

Histograms of first and second differences (i.e., slopes and "accelerations") tend to produce unimodal, practically "normal" distributions.

It turns out the histogram of MACD values themselves is also approximately unimodal, whereas the histograms of Stochastics K & D values are significantly bimodal (Stochastics D a muted replica of K, as might be expected), in fact reminiscent of the PDF of a sine wave (see e.g. Atif's Blog: Probability Density Function (pdf) of Sine Wave). Even though MACD is sometimes referred to as a "momentum oscillator", the fact that its estimated PDF is characteristic of a random distribution suggests Stochastics is a true(r) oscillator and an adequate proxy for the notion of Cycle in our model.

Graph of MACD histogram for 600 tick Eur/USD (24106 sample population):

Graph of Stochastics K histogram for 200 tick Eur/USD (33005 sample population):
I suspect the spike at x=50 is real, due to a tendency of Stochs to flatten from time to time, rather than a sampling artifact.

Graph of Stochastics D histogram for 200 tick Eur/USD (33005 sample population):

Graph of histogram of one cycle of sinewave:
Artifact (drops to zero) in centre of plot is not real, due to histogram bin size
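For anyone who wants to reproduce the sine wave comparison, a minimal frequency-binning sketch follows. It is an illustration only, not the spreadsheet procedure used for the plots above; the sample count (10000) and bin count (20) are arbitrary assumptions.

```csharp
using System;

public static class HistogramSketch
{
    // Count samples into equal-width bins spanning [min, max].
    public static int[] Histogram(double[] samples, double min, double max, int bins)
    {
        int[] counts = new int[bins];
        double width = (max - min) / bins;
        foreach (double s in samples)
        {
            if (s < min || s > max) continue;      // ignore out-of-range values
            int i = (int)((s - min) / width);
            if (i >= bins) i = bins - 1;           // put s == max in the last bin
            counts[i]++;
        }
        return counts;
    }

    public static void Main()
    {
        // One cycle of a sine wave scaled to 0..100, like a Stochastics reading.
        int n = 10000;
        double[] wave = new double[n];
        for (int k = 0; k < n; k++)
            wave[k] = 50.0 + 50.0 * Math.Sin(2.0 * Math.PI * k / n);

        int[] h = Histogram(wave, 0.0, 100.0, 20);
        for (int i = 0; i < h.Length; i++)
            Console.WriteLine("{0,3}-{1,3}: {2}", i * 5, (i + 1) * 5, h[i]);
    }
}
```

The counts pile up in the outermost bins and thin out toward the centre, which is the bimodal shape seen in the Stochastics histograms.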

Further to the topic of PDF estimation, a somewhat more elaborate approach is "vector quantization" (VQ) used in lossy compression of sound & images. A variety of VQ employs competitive learning, so called, and is the basis of "learning vector quantization" (LVQ) which in turn is at the root of the "self organizing feature map" (SOM) approach to pattern recognition. SOM and other artificial intelligence (AI, essentially neural network based) methods have been used for financial market prediction for a long time (e.g., "Predicting Stock Prices Using A Hybrid Kohonen Self Organizing Map (SOM)", by Mark O. Afolabi and Olatoyosi Olude in "Proceedings of the 40th Hawaii International Conference on System Sciences - 2007", $30 from IEEEExplore, which IMO unfortunately reads like it was written by a couple of 3rd year engineering students who knew as little about financial markets as they did about AI :-/).

While this association with neural networks may be interesting & inviting (at least to me), it's also a warning not to let development of a potentially useful way to manipulate trading heuristics via fuzzy propositions devolve into yet another AI blind alley.

In any event the AForge.net library also contains a neural network modelling core, including SOM methods.

If nothing else, to the extent histograms used to parameterize fuzzy membership functions may also estimate PDFs well enough to calculate the probability price action will generate a given linguistic label, we should hope to calculate the probability price action will cause a given proposition to execute (proposition execution probabilities being some combination of probabilities associated with the fuzzy input parameters comprising the propositions), proposition execution eventually resulting in buy/sell orders, and thereby estimate the strategy's level of trading activity. This is approximately what the inference engine already does (compute proposition outcomes in terms of likelihoods from its components).

By the same token, ideally we might attempt to "build backtesting in" to the strategy by examining the same price data set used to parameterize the membership functions, looking for a way to estimate the probability a given proposition will "succeed" ( i.e., that any order derived from the proposition's predicate will meet its targets), and hence "build in" some idea about profitability.

At first glance this might not be as nebulous, far-fetched or time wasting as it might seem since propositions here simply embody a system's trading rules, which in turn presumably are determined from some number of observations of live or historical price data, from which setups are deduced because they are seen to work 2 or 3 times and we conclude optimistically "maybe they work most of the time". Back- and forward testing, construed as a separate step in system development, tempers our initial optimism by helpfully ferreting out all the occasions the proposed setup fails.

What might be missing from the plan so far therefore is an empirically constrained element of prediction--supervised training, in other words. The issue I'm grappling with may be whether it's possible to incorporate training into the fuzzy system directly from data used to design the membership functions--particularly output membership functions--without resorting either to explicit backtesting or some neural AI technique (a slippery slope IMO that should be approached cautiously).

Edited to add: I've attached the code--a strategy called "TDWriteSMASlope"--I'm using to dump data for histogram calculation in case anyone wants to experiment. It's configured as a strategy because I find it more convenient to work from strategy analyzer than attach an indicator to a chart. Should import into NT 7, no need to install indicators included (generic to NT), creates a .CSV file for import into your favourite spreadsheet program but no guarantees because it's quick, dirty and unoriginal (borrows features from other published code ..... thanks for the file write snippets @Ducman).

The only custom parameter for the strategy ("Collect By") can be "Daily", "Monthly" or "OneFile" and determines how much data is written to a single .CSV file. For "Daily", 3 files are written to a different folder in the NinjaTrader.Cbi.Core.UserDataDir + @"\Preprocessing\" folder (typically C:\Users\Owner\Documents\NinjaTrader 7\Preprocessing\) for every day, Instrument, PeriodType and Period (e.g., EURUSD_Minute_5_values.csv, EURUSD_Minute_5_slope.csv, EURUSD_Minute_5_accn.csv) between the selected start and end of the strategy Time Frame. Any folder that doesn't exist will be created. The 3 files contain bar-by-bar data for Close price and values, slopes and accelerations for 3 SMAs, 1 EMA, MACD, MACD.Avg and Stochastics.K and Stochastics.D. Similarly "Collect By" set to "Monthly" will write 3 files to a different folder for every month, Instrument, PeriodType & Period in the Time Frame, and "OneFile" writes 3 files to a different folder for every year in the Time Frame.

Directory structure for data dump "Collect By" parameter:

November 13th, 2011, 11:03 AM

bnichols

Thanks for the paper @sam928. <....snip....>

Not sure why I can't edit post #21 to correct your mention @sam028, so I'll quote it instead.

January 20th, 2012, 08:49 PM

Preamble

Since I last posted I've been manually paper trading spot EUR/USD using the system under development almost every waking hour, sleeping 4-6 hours at a time as required, studying the way selected indicators react to as many variations of price action as possible. While the market is capable of a very, very large number of variations at long last it seems patterns I've never seen before are becoming rare.

As the system slowly becomes "second nature" the number of successful trades increases and so does my confidence. At this stage the market has become familiar, friendly & fun rather than scary & inscrutable. That said, it seems we approach "consistent profitability" exponentially, by repeating mistakes until we finally learn from them and by reinforcing behaviours that work. As a general rule we increase the number of units traded from a bare minimum as confidence increases to the limit of what money management allows, which is where the joy of trading starts to kick in.

I still believe we can't create a bot that makes money if we don't know how to make money, any more than we can program a bot to solve a problem if we don't understand the problem. Computers are good for problems that have numerical (non analytic) solutions, as I believe trading to be. Which perhaps paradoxically is why non-analytic humans can be so good at it.

To reiterate, "the system" is essentially Barry Burns' ("BB") "5 energies" system (topdog.com). This is not a recommendation or an endorsement of this particular system, since IMO it is simply yet another repackaging of sound trading principles common to any system. I fully believe any legitimate system will make money once you've mastered it (not necessarily a tautology). The whole issue is mastering the system, which means learning how to trade, and in the process mastering oneself. For the purpose of this exercise BB's system provides the definition of a "vector" for implementation as a bot, as described below.

The cost of shifting my internal clock to trade both the London and New York sessions is substantial and more than ever like a lot of traders I'm looking forward to creating a bot from what I've learned, so that e.g. it can trade London while I trade New York.

Work in Progress--Fuzzy Decision Making

It's one thing to fuzzify inputs but quite another to design a fuzzy buy/sell/sit-on-your-hands decision process.

To recap, we've selected a set of indicators we hope are more or less necessary and sufficient for defining setups and generating order signals--in this case essentially MACD, Stochastics and 4 moving averages, plus some definition of S/R levels--in the usual 3 time frames ("fractals" in BB's terminology). What needs to be made explicit is what to look for: what are the significant combinations of values and rates-of-change of the indicators, and mainly, where are the S/R levels and how strong are they.

In terms of bot programming a "significant combination of values and rates-of-change" is simply a set of vectors that are likely to cluster between buy and sell signals.

A convenient way to define such a vector is by capturing indicator values at "significant moments". Assuming a "significant moment" is the point at which in hindsight we wished we'd bought or sold, then one way to identify some is by dumping values at peaks and troughs of e.g. the ZigZag indicator during a backtest, given the usual caveats of backtesting (past results etc., which is where exhaustive experience comes in).

Anyone who's used one variation of ZigZag or another will understand it matters what value we choose for the deviation (minimum pips for a zig or a zag). This matters to the algorithm under development as well and at this point needs more study to control profit expectation for a given setup.
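To show why the deviation setting matters, here is a minimal sketch of a threshold-based swing detector. It is a generic illustration and not NinjaTrader's ZigZag indicator; the name `Pivots`, the `minMove` parameter and the sample prices (expressed in pips) are all hypothetical. Note that a pivot is only confirmed after price has reversed by `minMove`, which is the "in hindsight" aspect mentioned above.

```csharp
using System;
using System.Collections.Generic;

public static class ZigZagSketch
{
    // Return bar indices of swing highs/lows separated by reversals of at least minMove.
    public static List<int> Pivots(double[] close, double minMove)
    {
        List<int> pivots = new List<int>();
        if (close.Length == 0) return pivots;
        int dir = 0;            // 0 = undecided, +1 = swinging up, -1 = swinging down
        int hi = 0, lo = 0;     // running extremes while undecided
        int ext = 0;            // candidate extreme of the current swing
        for (int i = 1; i < close.Length; i++)
        {
            if (dir == 0)
            {
                if (close[i] > close[hi]) hi = i;
                if (close[i] < close[lo]) lo = i;
                if (close[hi] - close[i] >= minMove) { pivots.Add(hi); dir = -1; ext = i; }
                else if (close[i] - close[lo] >= minMove) { pivots.Add(lo); dir = 1; ext = i; }
            }
            else if (dir > 0)
            {
                if (close[i] > close[ext]) ext = i;    // swing extends higher
                else if (close[ext] - close[i] >= minMove) { pivots.Add(ext); dir = -1; ext = i; }
            }
            else
            {
                if (close[i] < close[ext]) ext = i;    // swing extends lower
                else if (close[i] - close[ext] >= minMove) { pivots.Add(ext); dir = 1; ext = i; }
            }
        }
        return pivots;
    }

    public static void Main()
    {
        double[] pips = { 100, 110, 150, 140, 100, 105, 135 };
        Console.WriteLine(string.Join(", ", Pivots(pips, 30)));   // 0, 2, 4
    }
}
```

Raising `minMove` yields fewer, larger swings, so the choice directly sets the minimum profit a "significant moment" is assumed to offer.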

To this end I rewrote the strategy mentioned previously, whose sole purpose is to dump data during a backtest, to include values of the ZigZag indicator. During proof-of-concept indicator value sets ("vectors") at ZigZag peaks and troughs were isolated in a spreadsheet, but since then I've written a standalone C# class to extract these vectors from NinjaTrader strategy output directly.

At this point the idea behind identifying "significant vectors" is to determine their centroids in so-called "N-space" (which we assume at first is Euclidean). I naively suppose I can write an indicator to show where price action is relative to these buy/sell centroids once they've been established (a process that amounts to backtesting and hence is worth as much as backtesting) as an aid to manual trading. While such an indicator might be a trading aid (i.e., to help overcome any residual tendency to trade impulsively) the main purpose is to design a decision-making core for a bot.

Again, success of the approach depends on the extent to which the chosen indicators are necessary and sufficient; in other words, the extent to which (narrowness with which) vectors cluster.
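As a rough illustration of the centroid idea, the sketch below averages a set of feature vectors and measures the Euclidean distance from a new vector to that centroid. It is an assumption-laden toy, not the clustering program itself: the vectors are made up, and in practice features on very different scales (e.g., MACD versus a 0-100 Stochastics value) would probably need rescaling before a plain Euclidean distance means much.

```csharp
using System;
using System.Collections.Generic;

public static class CentroidSketch
{
    // Component-wise mean of a set of equal-length feature vectors.
    public static double[] Centroid(IList<double[]> vectors)
    {
        int n = vectors[0].Length;
        double[] c = new double[n];
        foreach (double[] v in vectors)
            for (int i = 0; i < n; i++) c[i] += v[i];
        for (int i = 0; i < n; i++) c[i] /= vectors.Count;
        return c;
    }

    // Straight-line (Euclidean) distance between two vectors in N-space.
    public static double Distance(double[] a, double[] b)
    {
        double sum = 0.0;
        for (int i = 0; i < a.Length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.Sqrt(sum);
    }

    public static void Main()
    {
        // Hypothetical 2-feature vectors captured at "buy" extrema.
        List<double[]> buys = new List<double[]> {
            new double[] { 0.0, 0.0 }, new double[] { 2.0, 4.0 } };
        double[] centre = Centroid(buys);                                 // (1, 2)
        Console.WriteLine(Distance(new double[] { 4.0, 6.0 }, centre));   // 5
    }
}
```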

Where Fuzzy comes in

The N-space cluster has an associated distribution function, distance and slope of which can be assigned linguistic descriptors and thereby made part of the usual input-inference-output process.

Outputs

I've written a standalone program (C# using Visual Studio 2010 Ultimate with .NET 4.0, using ZedGraph for visualization if you want to prepare) to perform K-means clustering, mainly to extract centroids, and am working on a program to reduce raw data collection strategy output to clustering program inputs. Will publish the programs when they're stable. Still trying to figure out interesting anomalies that seem to become apparent with visualization, hence the importance of visualization. At this stage the question is, are the anomalies real and therefore useful for trading algorithm design, or is there a mistake in the N-dimensional clustering program?

Conclusion & prognosis

Programming is more fun than trading but at the end of the day I like a program that makes money. I'm optimistic about results so far.

February 3rd, 2012, 02:16 AM

Preamble

Since the last post I've been both trading as much as humanly possible and working on a Visual C# program to perform a number of statistical analyses of price data to determine the feasibility of writing a bot to trade the "5 energies" method.

I can trade while drinking alcohol, after being up for 24 hours and essentially out of the corner of my eye (while watching TV or playing Spider Solitaire) and still retain the necessary focus (based on cumulative profits), yet I still have episodes in which I am wide awake & seemingly not distracted but lose my feeling for price and continue to trade anyway. This is what I consider my worst habit: trading without "feeling price". If bad trading habits die hard then they are also prone to coming back from the dead.

The screenshots below summarize progress so far: namely,

1. importing feature vectors generated by a NinjaTrader strategy, 1 vector per bar over a given time period (as previously described)
2. creating buy/sell clusters (hopefully) from the data by filtering on ZigZag indicator extrema
3. applying K-Means analysis to extract bounding polygons & centroids of clustered data
4. calculating density functions for the clusters to act as fuzzy "shape functions"
5. visualizing the above

I prefer this approach to regular backtesting because it's just as quick and probably more efficient; by virtue of filtering by ZigZag extrema the data can be made self organizing.

In theory basing buy/sell decisions on the convolution of an instantaneous feature vector with a density function derived from clustered buy/sell population data (vectors prefiltered by ZigZag extrema) should produce a more robust strategy than a ZigZag indicator-based strategy alone.
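One simple reading of this idea, offered only as an assumption about what the "convolution" could reduce to in practice, is a table lookup: bin the clustered (x, y) feature pairs into a grid of relative frequencies, then read off the cell that the current bar's feature pair falls in. The sketch below does exactly that with no smoothing; the parameter ranges, the 10x10 grid and the sample points are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class Density2D
{
    // Map a value in [min, max] to a cell index 0..cells-1 (values outside are clamped).
    public static int ToIndex(double v, double min, double max, int cells)
    {
        int i = (int)((v - min) / (max - min) * cells);
        if (i < 0) return 0;
        if (i >= cells) return cells - 1;
        return i;
    }

    // Build a cells x cells table of relative frequencies from (x, y) points.
    public static double[,] Build(IList<double[]> points, double xMin, double xMax,
                                  double yMin, double yMax, int cells)
    {
        double[,] grid = new double[cells, cells];
        if (points.Count == 0) return grid;
        foreach (double[] p in points)
            grid[ToIndex(p[0], xMin, xMax, cells), ToIndex(p[1], yMin, yMax, cells)] += 1.0;
        for (int i = 0; i < cells; i++)
            for (int j = 0; j < cells; j++)
                grid[i, j] /= points.Count;
        return grid;
    }

    public static void Main()
    {
        // Hypothetical (MACD, StochasticsD) pairs taken at buy extrema.
        List<double[]> buys = new List<double[]> {
            new double[] { 0.5, 50 }, new double[] { 0.5, 50 }, new double[] { -0.5, 10 } };
        double[,] grid = Build(buys, -1.0, 1.0, 0.0, 100.0, 10);

        // Look up the density for the current bar's feature pair.
        double macd = 0.45, stochD = 52.0;
        Console.WriteLine(grid[ToIndex(macd, -1.0, 1.0, 10), ToIndex(stochD, 0.0, 100.0, 10)]);
    }
}
```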

Continuing Work

At the moment I'm still contemplating the most efficient way to implement the shape functions in a bot but have pretty much ruled out neural nets & some derivation of Self Organizing Maps, the rule of thumb being that if the goal can be met by a "conventional" computer algorithm then it should be done that way (net design in particular having a way of becoming frustrating & time consuming). I've also ruled out messing with data in higher dimensions than 3.

Therefore the approach being actively pursued at the moment involves determining the probability that a given feature input vector signals a buy or sell

1. first, by convolving a real-time feature vector in some sense with each of 190 x 3 = 570 predetermined 2-dimensional density functions corresponding to the same feature population clustered 2 at a time ("20 choose 2") for 3 fractals (1800, 600 and 200 ticks).
2. second, by estimating a probability from the results by Mamdani Min Implication (a standard approach to reducing fuzzy propositions)
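The second step can be sketched with the usual Mamdani operators: min combines the memberships within one proposition (and caps its outcome), and max aggregates propositions that share a consequent. How the 570 density lookups would be grouped into propositions is not specified here, so the grouping and the membership values below are purely hypothetical.

```csharp
using System;
using System.Linq;

public static class MamdaniSketch
{
    // AND of a proposition's antecedents: the weakest membership limits the outcome.
    public static double RuleStrength(params double[] memberships)
    {
        return memberships.Min();
    }

    // OR across propositions with the same consequent: the strongest one wins.
    public static double Aggregate(params double[] strengths)
    {
        return strengths.Max();
    }

    public static void Main()
    {
        // Two hypothetical "buy" propositions built from density lookups.
        double rule1 = RuleStrength(0.8, 0.6, 0.9);   // 0.6
        double rule2 = RuleStrength(0.4, 0.7);        // 0.4
        Console.WriteLine(Aggregate(rule1, rule2));   // 0.6
    }
}
```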

To the extent the approach is feasible (and it's not clear yet that processed data is useful just because it clusters) some means has to be implemented to determine stops, profit targets and trailing stop behaviour. For now the following assumptions are made:

1. initial stops ought to be placed according to usual criteria (vicinity of a convenient S/R level above or below entry that also meets money management rules)

2. profit targets set at S/R levels consistent with the profit cluster(s) generating the highest probability buy/sell signals

3. trailing stop behaviour controlled by the trajectory of the instantaneous feature vector (vector generated at the close of the last bar) within the various density clusters.

Screenshots

Notes:

1. Data shown below (raw data, sample of unfiltered and 570-such filtered clusters) is derived from EUR/USD for the last 3 months of 2011, each analysis comprising 14,000-120,000 vectors depending on the fractal shown (1800- , 600- or 200-tick-based charts respectively)

2. Data shown has not been cleaned to remove outliers, as indicated occasionally by the extremely wide enclosing polygon boundaries.

3. It can be seen that in general the data clusters, which at first glance might seem to imply it should lend itself to buy/sell prediction (and AI methods in general), but in reality at this stage means further analysis is required to determine what part of the feature vector population not associated with a ZigZag extremum also lies within the cluster and hence whether the cluster predicts anything at all.

4. While the center plot (Parameter Chart) shows both clusters on the same X-Y axes and hence on the same scale and proper relative position, the Density Function plot scales and cluster position are relative to min & max of the respective enclosing polygon shown in the Parameter Chart.

5. Respective centroids are indicated by small black discs near the center of each buy/sell cluster

Figure 1. 118,000-vector STF Fractal (200 Tick bar) StochsD/MACD feature population in an 8-pip profit cluster (i.e., data filtered to show vectors that preceded at least a subsequent 8-pip price movement)
Non-close fitting enclosing polygon in the Parameter Chart indicates noise spikes that are naturally suppressed in the Density Function, which at this point is essentially a low-pass filter

Figure 2. LTF Fractal (1800 Tick bar) StochsDSlope/StochsD feature data in a 32-pip profit cluster (i.e., data restricted to vectors that preceded at least a subsequent 32-pip price movement)

Figure 3. LTF Fractal (1800 Tick bar) MACD/StochasticsD feature data in a 128-pip profit cluster (i.e., data restricted to vectors that preceded a subsequent minimum 128-pip price movement)

Figure 4. LTF Fractal (1800 Tick bar) unfiltered/unclustered StochsD/StochsDSlope feature data (approx 15,000 vectors)

Figure 5. LTF Fractal (1800 Tick bar) StochasticsD/StochasticsDSlope feature data (data from previous Figure 4) in a 128-pip profit cluster (i.e., data restricted to vectors that preceded a subsequent minimum 128-pip price movement)

Figure 6. MTF Fractal (600 Tick bar) unfiltered/unclustered StochsD/MACD feature data (approx 43,000 vectors)

Figure 7. MTF Fractal (600 Tick bar) StochasticsD/MACD feature data (previous data from Figure 6) in a 16-pip profit cluster (i.e., data restricted to vectors that preceded a subsequent minimum 16-pip price movement)

Figure 8. MTF Fractal (600 Tick bar) StochasticsD/MACD feature data (previous data from Figure 6) in a 32-pip profit cluster (i.e., data restricted to vectors that preceded a subsequent minimum 32-pip price movement)

Figure 9. MTF Fractal (600 Tick bar) 15EMASlope/200SMA feature data in a 32-pip profit cluster (i.e., data restricted to vectors that preceded a subsequent minimum 32-pip price movement)

Figure 10. MTF Fractal (600 Tick bar) StochsD/StochasticsDSlope feature data in a 32-pip profit cluster (i.e., data restricted to vectors that preceded a subsequent minimum 32-pip price movement)

Figure 11. MTF Fractal (600 Tick bar) StochsD/StochasticsDSlope feature data in a 128-pip profit cluster (i.e., data restricted to vectors that preceded a subsequent minimum 128-pip price movement)

February 3rd, 2012, 04:13 AM

Spectacular!

Mike

February 4th, 2012, 03:16 PM

Thanks Mike

A few comments while I decompress, glass of rum in hand--not the first, lit cigarette on the lips, bearing in mind retail currency spot market reopens in less than 24 hours. The density plots in particular are easy on the eyes but as you might agree we won't be too impressed until the thing makes money, and many miles to go before we sleep easy in that regard. Which I suppose should prompt a caveat to the newbie quant trader: namely, IMO one ought to view statistics--no matter how lovely to look at--the same way we scrutinize a prospective spouse: skeptically, proof always in the home-made pudding if not the fact s/he's self supporting, forgives our mistakes. Don't chase algorithms any more than others chase indicators. IMO the path to profitability starts and ends with manual trading by the seat of your pants, as often as possible--only way to rewire your neurons to make money while you're awake and your wits are about you, and to write code that makes money while you sleep (with one eye open). First and foremost quants need to hearken back to their first profitable algorithm always, the same way you rely on your original spouse. Assuming you're still married to your first spouse, that is; if you're not there may be trading issues to overcome that this thread may not address, since even a booty-shaking, money-making program can lose in the wrong hands.

To this end, on the question of where feature vectors not corresponding to price extrema lie relative to the buy/sell (extrema) cluster centroids, overnight I added an "auxiliary vector" to each feature vector that simply records the "distance" in the time dimension of any vector from the vector corresponding to the last extremum, hypothesizing (i.e., "hoping like hell") that between extrema (i.e., while price is trending from entry toward a profit target) feature vectors would be well behaved. In other words, to see if feature vectors between price extrema would move predictably (more or less monotonically) from one cluster toward the other as price moves from entry to profit target, rather than (say worst case) leaping between clusters in a binary fashion. Practically speaking this analysis aims to quantify how feasibly the algorithm can duplicate the trading rule of thumb that if we miss the initial entry it may not be too late to participate, provided we lower profit expectations and are extra careful about placing the initial stop (i.e., assuming in general such a stop is going to introduce relatively higher risk than stopping the ideal entry). By the same token the analysis should also inform trailing stop behaviour, both goals requiring the vectors prove well behaved.

One problem I still have is allowing too much weight to the rule of trading "Don't let a profitable trade become unprofitable" and hence still too often getting trailing-stopped-out early, to the detriment of cumulative profit, despite the rules of the method effectively relying on stops rather than targets to put money in the bank. Effective management of trailing stops is crucial (if still considered more an art than science) since weak management tends to nullify the impact of the high probability with which some number of us (most?) can pick S/R levels most of the time & (perhaps to a lesser extent) the likelihood they will breach, extreme vagaries of short time frame spot currency aside, so I'm looking forward to a little clarification.

It could turn out e.g Fibonacci discovered the rules of trailing-stop-management a long time ago (funny how exhaustive analysis often just proves what other folks take for granted), which would take a load off the CPU.

So far cursory visualization of the behaviour of these intermediate vectors looks good for selected features that we know to be highly correlated with price movement, like MACD and Stochastics. We expect increasing divergence from a straight line path between buy/sell clusters in first and second differences (slopes and accelerations) but it's hoped e.g. the statistical variance will lend some idea of how much price can be expected to wiggle between order entry and profit targets, hence provide an input into the trailing stop control. If nothing else variance may provide confidence weighting factors when combining probabilities of noisy density functions with more stable density functions. BTW, what separates born quants from normal, well-adjusted traders (assuming there is such a thing) is--quants stew over the fact every wiggle in a trend trade represents $$$ left on the table.

ETA: On a practical note, making the latest changes introduced what appears to be a harmless bug into the software, which nevertheless has to be tracked down (frustrating because development stops until it's repaired, for as long as it takes); one is never 100% sure if software (let alone statistics) show the "truth" or simply reflect our expectations, more doubt when they misbehave in seemingly inconsequential ways (being a true believer that the devil lurks in the details) :-/ In that regard we ought to note EUR/USD was trending down overall during the last 3 months of 2011, meaning the current trial population itself likely includes a large wavelength bias that still needs to be accounted for.

February 6th, 2012, 04:31 PM

Bug fixed with no change in the prognosis for the method. With an analysis of intermediate vectors (snapshots of parameter values between price extrema) indicating that they are well enough behaved, thoughts are turning toward implementation of a probability density function based mapping from parameter values to fuzzy variables. At this point therefore things get a little more technical.

It turns out after an actual parameter count the length of the feature vector being considered is closer to 30 than 20 and in danger of growing. The effect of this is to increase the number of density functions from 190 x 3 = 570 (20 choose 2 x the number of fractals) to 435 x 3 = 1305 (30 choose 2 x 3). Since a value must be retrieved from each 2D density function for each parameter pair bar by bar in real time while the strategy is running, the larger the number of functions the greater the load on computer resources. If in addition we want the system to learn (in this case, to update the density functions on the fly) then whatever means we choose to store & access system structures ought to be malleable.
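The counts quoted above follow from "n choose 2" = n(n-1)/2. A short sketch that reproduces them and enumerates the pairs is given below; the three feature names in the example are placeholders rather than the actual parameter list.

```csharp
using System;
using System.Collections.Generic;

public static class PairSketch
{
    // Number of distinct unordered pairs among n parameters ("n choose 2").
    public static int PairCount(int n)
    {
        return n * (n - 1) / 2;
    }

    // Enumerate every unordered pair once, e.g. "MACD/StochsD".
    public static List<string> Pairs(IList<string> names)
    {
        List<string> pairs = new List<string>();
        for (int i = 0; i < names.Count; i++)
            for (int j = i + 1; j < names.Count; j++)
                pairs.Add(names[i] + "/" + names[j]);
        return pairs;
    }

    public static void Main()
    {
        int fractals = 3;
        Console.WriteLine(PairCount(20) * fractals);   // 570
        Console.WriteLine(PairCount(30) * fractals);   // 1305
        Console.WriteLine(string.Join(", ",
            Pairs(new string[] { "MACD", "StochsD", "StochsDSlope" })));
    }
}
```

Because the count grows roughly as the square of the vector length, each parameter added to a 30-long vector adds 30 more density functions per fractal.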

Rather than make the effort right now to try to prune the list to a sensible number, or to reexamine the approach itself in light of the fact it might be unwieldy, I've instead decided to evaluate methods that might reduce the complexity (or at least deal with complexity effectively).

We're using 2D densities pretty much only because higher dimensions are more difficult to visualize, and IMO visualization is important. We've also been sticking to 2D partly because some math (e.g., cross product, if we needed it) apparently breaks down in higher dimensions, except e.g. for the 3rd and 7th dimensions. I have no clue what the significance of that is and at the moment no inclination to pursue it.

That said, it may be possible to exploit the multidimensional nature of the data without resorting to too much hyperspatial gymnastics if all we're doing essentially is data retrieval.

The purpose of the 2D density function (visualization aside) is ultimately to associate a probability that instantaneous values of 2 parameters (e.g., MACD and Stochastics) lie within a buy or sell cluster. Boiled down, right now the 2 instantaneous parameter values index a table, whereas from a broader perspective we'd prefer to index an N-dimensional table with a vector comprising N parameter values. While as mentioned previously we don't want to wind up lost in hyperspace or impaled on C# pointers, methods for collapsing multiple dimensions to a single dimension for the purpose of data storage and retrieval are well known (e.g., helical hyper-spatial codes, related to Peano codes, Hilbert curves and space filling curves in general). My initial exposure to these was in the early '90s, rubbing shoulders with a colleague who was helping Oracle develop their multidimensional search core, subsequently embodied in Version 7 I think.

In other words, if we can come up with a suitable key/value structure to encode all 1305 (or more) density functions in a single table then it may be possible to use C#'s Dictionary class to perform the correlation in "one" relatively fast operation. (I'm considering Dictionary only because I've read it's faster than the C# Hashtable class).

We expect the size of the dictionary to increase as the square of table resolution (i.e., the number of cells per parameter axis). For example, right now we're using a 100x100 array to store each 2D density function, which it's hoped adequately resolves parameter values, and which means each index requires 7 bits--essentially one byte of memory before compression. If an index requires one byte then in theory we could increase density function resolution from 100x100 to 256x256 density values over the same parameter range without affecting how we handle indices, but by so doing we increase dictionary size from 100x100x1305 = about 13 million to 256x256x1305 = about 85 million key/value pairs.
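The arithmetic in this paragraph can be checked with a few lines. The sketch assumes every cell of every table is stored (i.e., no advantage is taken of sparsity), so the entry counts are upper bounds; the helper names are made up for illustration.

```csharp
using System;

public static class KeySizing
{
    // Smallest number of bits able to address 'levels' distinct index values.
    public static int BitsFor(int levels)
    {
        int bits = 0;
        while ((1 << bits) < levels) bits++;
        return bits;
    }

    // Key/value pairs needed if every cell of every 2D table is stored.
    public static long DenseEntries(int resolution, int tables)
    {
        return (long)resolution * resolution * tables;
    }

    public static void Main()
    {
        Console.WriteLine(BitsFor(100));              // 7
        Console.WriteLine(BitsFor(256));              // 8
        Console.WriteLine(DenseEntries(100, 1305));   // 13050000
        Console.WriteLine(DenseEntries(256, 1305));   // 85524480
    }
}
```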

I think the trick to exploiting the N-dimensional nature of the data at this stage is simply developing an appropriate dictionary key, which amounts to choosing the right space filling curve to act as index. Specifically, we want the index to move through the table in such a way that adjacent points in "N-space" are also adjacent in terms of the index, implying among other things that the resolution with which we probe the N-dimensional probability function is determined by the resolution of the index. Related to this is the fact the density functions are sparse (mostly empty space, zero values). In this case it may be possible to view the structure as an N-dimensional bitmap and therefore accessible to available image and sound compression techniques.

In any event practical implementation of such a key structure is straightforward--far easier than this cursory description might imply, so looking forward to concrete results soon.

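ETA: the first attempt to implement the key structure will rely on the availability of BigInteger in .NET 4.0 with VS 2010 (System.Numerics namespace) to create a space filling curve key structure (strictly a Z-order curve rather than a Hilbert curve, since the key is built by bit interleaving), with the Dictionary definition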

Code

public Dictionary<BigInteger, double> dictionary = new Dictionary<BigInteger, double>();

and simply interleaving bits from all density function table indices to create N-dimensional Morton numbers (so called) as the key.

While there are other approaches, BigInteger is convenient because e.g. 30 parameters will result in manipulation of 30*8 = 240 bit keys.

ETA: while it's possible to stoop to one of a host of algorithms on the internet to perform a brute force calculation of the Morton number given an indeterminate number of dimensions, at the moment I'd prefer to generalize the "shift and mask" algorithm, if only because it's possibly more elegant. Therefore the task at hand appears to be generalizing calculation of the masks to multiple dimensions. Figuring out bit shifting is not my strong suit so if any reader has done this before please feel free to share.

The good news is the .NET 4.0 implementation of BigInteger in VS 2010 appears to be robust.

February 9th, 2012, 05:53 AM

Since the last post I decided (regarding the bit interleave algorithm) that elegance requires more work than brute force and is unwarranted this early in the game, while we're still trying to prove the concept, so brute force is good enough--save elegance for the finished product. At least this is the principle IMO behind the "canonical form" in which e.g. math treatises are published, by which an author sweeps all clues as to an idea's humble beginnings under the rug in order to give the impression it sprang as is, full blown from his or her brow, and leaves readers completely mystified about how it stems from or applies to anything practical.

In that respect, this is the working Morton number (bit interleave) code in C#. It assumes byte array "a" contains nb parameter coordinates corresponding to a point in N-dimensional parameter space and relies on C# operator precedence; i.e., arithmetic operators ("+", "*") bind tighter than left-shift ("<<"), which in turn binds tighter than bit-wise AND ("&"):

Code

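        // Interleaves the bits of nb byte coordinates into the Morton key bKey
        // (a BigInteger field, assumed to be zero on entry).
        private void interleaveN(byte[] a, int nb)
        {
            for (int i = 0; i < 8; i++) // unroll for more speed...
            {
                for (int j = 0; j < nb; j++)
                {
                    // Bit i of a[j] ends up at key position i * nb + j. The cast widens
                    // the masked bit before shifting; a plain uint shift drops bits
                    // above position 31 and reduces the shift count modulo 32.
                    bKey |= (BigInteger)(a[j] & 1U << i) << j + i * (nb - 1);
                }
            }
        }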
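A quick way to check that an interleave of this kind is one-to-one is to decode the key and compare it with the coordinates that went in. The following is a minimal sketch under stated assumptions, not the journal's code: MortonRoundTrip, Encode and Decode are hypothetical names, coordinates are assumed to be 8-bit as in the text, and the sample point in Main is arbitrary.

```csharp
using System;
using System.Numerics;

public static class MortonRoundTrip
{
    // Interleave the 8 bits of each coordinate: bit i of coords[j]
    // goes to bit position i * n + j of the key, where n = coords.Length.
    public static BigInteger Encode(byte[] coords)
    {
        int n = coords.Length;
        BigInteger key = BigInteger.Zero;
        for (int i = 0; i < 8; i++)
        {
            for (int j = 0; j < n; j++)
            {
                if (((coords[j] >> i) & 1) != 0)
                {
                    key |= BigInteger.One << (i * n + j);
                }
            }
        }
        return key;
    }

    // Reverse the interleave to recover the n original coordinates.
    public static byte[] Decode(BigInteger key, int n)
    {
        byte[] coords = new byte[n];
        for (int i = 0; i < 8; i++)
        {
            for (int j = 0; j < n; j++)
            {
                if (!((key >> (i * n + j)) & BigInteger.One).IsZero)
                {
                    coords[j] |= (byte)(1 << i);
                }
            }
        }
        return coords;
    }

    public static void Main()
    {
        // Arbitrary 30-coordinate point, giving a key of up to 240 bits.
        byte[] point = new byte[30];
        for (int j = 0; j < point.Length; j++) point[j] = (byte)(j * 7 + 3);

        byte[] back = Decode(Encode(point), point.Length);
        bool same = true;
        for (int j = 0; j < point.Length; j++) same &= point[j] == back[j];
        Console.WriteLine("round trip ok: " + same);
    }
}
```

If every point decodes back to itself, no two distinct points can share a key, which is the property the dictionary needs.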

In other news, while looking for a bug in the code that calculates price variance between entry and exit to guide trailing stop behaviour I was forced to reexamine how I was using ZigZag data to extract buy/sell vectors from the parameter data.

To make a long story short, the ZigZagHigh & ZigZagLow dataseries being used are "proto-series" in the sense they don't directly define local tops and bottoms of price movement, but instead record candidate tops and bottoms based on price action as interpreted by the ZigZag indicator. The ZigZag indicator then applies an algorithm to the 2 dataseries to synthesize the familiar zig-zag pattern.

Cluster and density plots posted previously show parameter vectors extracted at discontinuities in the raw ZigZag high & low dataseries--no further processing applied--with the effect that the buy/sell clusters mix long/short entries respectively with what I choose to interpret as candidate intermediate profit targets between entry and exit (the intermediate profit targets being points in the ZigZag dataseries where price consolidates, or undergoes a mini-retrace).

In other words, a lot of the points in the "buy clusters" for various parameter pairs are not strictly characteristic of long entry conditions, but instead characteristic of conditions in which at the very least a short trade should take profits. By the same token, a lot of the vectors in the "sell clusters" are not strictly characteristic of short entry conditions, but instead characteristic of conditions in which at the very least a long trade should take profits. In a sense this is good news, since if the method will work at all these restrictions further reduce ambiguity by limiting the range of conditions characterizing, perhaps even defining, permissible entries and advisable exits. This development embodies the trading rule, "Trade less--lose less."

We may surmise that remaining vectors, the majority of vectors in each cluster that don't correspond to entries or profit targets, might be deconstructed into "do nothing" or "consider adding to the position" conditions.

I expect to post plots of augmented density functions to show the effect of these changes once they're available.

Thus additional processing (mimicking the algorithm in the NT indicator that draws the actual ZigZag pattern) has been required to separate vectors signalling actual long/short entries from those signalling sell/buy profit-taking in the opposite trade (consolidation following short/long entries). The latter in general characterize price behaviour at S/R levels that may or may not be explicitly present in the parameter vector (and hence probability density) in the form of Previous Period OHLC, Murrey Math & pivots. Overall it's still true, for example, that the closer a "buy" parameter vector approaches the centroid of the Sell cluster, the more the conditions represented by the vector reflect a Sell entry condition within the context of the selected ZigZag deviation. To the extent these non-extremum discontinuities in the zigzag function can be construed as S/R levels not contributing directly to the probability function, they inherently dissect the density function beyond simple clustering, and therefore may still contribute to decision making.

Finally, in still other news, I've decided that rather than get bogged down unnecessarily trying to deal with data compression it's probably sufficient simply to populate our dictionary structure only with points in the density functions that have non-zero values, taking advantage of inherent benefits of the structure in C# (e.g., treating keys that are not found as zeros if the need arises) instead of trying to enforce too much pseudo mathematical rigour.
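A minimal sketch of the sparse storage idea follows, assuming the Dictionary<BigInteger, double> declared earlier. The wrapper class SparseDensity and its Set/Get methods are hypothetical names used only for illustration. One detail worth noting: the Dictionary indexer throws KeyNotFoundException for a missing key rather than returning null, so the sketch uses TryGetValue to read unstored cells as zero.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public class SparseDensity
{
    private readonly Dictionary<BigInteger, double> cells =
        new Dictionary<BigInteger, double>();

    // Store a density value; zeros are not kept, so they take no space.
    public void Set(BigInteger key, double value)
    {
        if (value != 0.0) cells[key] = value;
        else cells.Remove(key);
    }

    // Look up a density; keys that were never stored read back as zero.
    public double Get(BigInteger key)
    {
        double value;
        return cells.TryGetValue(key, out value) ? value : 0.0;
    }

    // Number of non-zero cells actually held in memory.
    public int Count
    {
        get { return cells.Count; }
    }

    public static void Main()
    {
        var density = new SparseDensity();
        density.Set(new BigInteger(42), 0.25);
        density.Set(new BigInteger(43), 0.0); // zero, so not stored

        Console.WriteLine(density.Count);                   // 1
        Console.WriteLine(density.Get(new BigInteger(42))); // 0.25
        Console.WriteLine(density.Get(new BigInteger(43))); // 0
    }
}
```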

The situation is depicted in the figure below, which shows the 1800 tick ("LTF fractal") chart for EUR/USD between approximately 10AM AST December 23 and 6PM AST December 30 in the top half of the illustration and corresponding 32 point (pip) deviation ZigZagHigh and ZigZagLow dataseries superimposed on a line chart of closes in the lower part. (Note: all indicators shown as well as their 1st and 2nd differences are used to parameterize the method. The ZigZag indicator--thin blue line on the top chart--is used as a filter as described during parameter vector pre-processing and does not appear explicitly in the eventual N-dimensional probability density). As a point of interest the 2 dataseries synthesize a channel resembling other familiar "persistent extrema" indicators (e.g., Donchian Channel).

The annotation in the bottom part of the illustration highlights one hypothetical short trade (straight, diagonal blue line marked "Trade") as defined by raw ZigZag dataseries.

In summary, the system clusters parameter data for tens of thousands of such potential trades from historical tick data in the hope of assigning a probability to patterns--essentially the same as training a neural net without the headache of developing a suitable neural net architecture.

February 10th, 2012, 08:08 PM

Doing sanity checks on the data, software, outputs for the last 24 hours so not much to report.

A first crack at the dictionary (BigInteger) based probability density encoding method produced mixed results. The algorithm used to produce keys is creating occasional duplicates--a definite no-no for what should be a one-to-one mapping. Checking indicates the code is fine so the problem is very likely a logic error, and if so I think I know where it is but need a clearer head to decide what the impact is and how to fix it. Biggest fear is I've overlooked something fundamental that will send me back to the drawing board. The amount of dead code in the project increases steadily but it's too soon to prune.
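One candidate worth ruling out, offered purely as an assumption since the post doesn't say where the error turned out to be: if the shift in the interleave is carried out on a 32-bit uint, C# discards any bits pushed past position 31 and reduces shift counts of 32 or more modulo 32, so distinct points can collapse onto the same key once the key needs more than 32 bits. The sketch below contrasts a narrow (uint) shift with a widened (BigInteger) shift and counts collisions with a HashSet. All names (ShiftWrapDemo, EncodeNarrow, EncodeWide, CountDuplicates) and the four sample points are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class ShiftWrapDemo
{
    // Interleave with the shift done on a 32-bit uint. Bits pushed past
    // position 31 are dropped, and shift counts of 32 or more are reduced
    // modulo 32, so distinct points can produce the same key.
    public static BigInteger EncodeNarrow(byte[] a)
    {
        int n = a.Length;
        BigInteger key = BigInteger.Zero;
        for (int i = 0; i < 8; i++)
            for (int j = 0; j < n; j++)
                key |= (a[j] & (1U << i)) << (j + i * (n - 1));
        return key;
    }

    // Same interleave, but the masked bit is widened to BigInteger
    // before shifting, so no bits are lost.
    public static BigInteger EncodeWide(byte[] a)
    {
        int n = a.Length;
        BigInteger key = BigInteger.Zero;
        for (int i = 0; i < 8; i++)
            for (int j = 0; j < n; j++)
                key |= (BigInteger)(a[j] & (1U << i)) << (j + i * (n - 1));
        return key;
    }

    // Count how many points map to a key that has already been seen.
    public static int CountDuplicates(IEnumerable<byte[]> points,
                                      Func<byte[], BigInteger> encode)
    {
        var seen = new HashSet<BigInteger>();
        int duplicates = 0;
        foreach (byte[] p in points)
            if (!seen.Add(encode(p))) duplicates++;
        return duplicates;
    }

    public static void Main()
    {
        // Four distinct points in a 5-parameter space (40-bit keys).
        var points = new List<byte[]>
        {
            new byte[] { 0, 0, 0, 0, 0 },
            new byte[] { 0, 0, 64, 0, 0 },
            new byte[] { 0, 0, 2, 0, 0 },
            new byte[] { 0, 0, 0, 0, 128 }
        };
        Console.WriteLine(CountDuplicates(points, EncodeNarrow)); // 2
        Console.WriteLine(CountDuplicates(points, EncodeWide));   // 0
    }
}
```

Running the same duplicate count over the real parameter vectors would show whether collisions disappear once the shift is widened, or whether the cause lies elsewhere.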

Once issues with the encoding are worked out the next significant push will likely be to interface the dictionary with NT via a DLL, at this stage assuming the strategy will pass the current parameter vector to it as an array. At that point we'll have the first indication whether the method is going to work.

I traded a little on paper today to stay current, lately too distracted by system development and impaired by lack of sleep to trade real money. Although still evolving, even in this condition am confident and have little trouble breaking even no matter how much I trade. At the top of my game I make money so remain optimistic. It still appears the only difference between losing and consistent profitability is experience, and sticking to the letter of whatever method one has adopted.

My attitude toward losses, and hence stops, is in a state of flux. These days I set small stops (3-4 pips) and I have no problem taking a (small) loss because it seems to me there is no practical difference between a loss that is not yet realized and one that has been realized. Overall, commissions pale beside the cost of remaining in a losing trade too long. We not only waste time waiting for the market to "turn around" if it comes back at us but worse, while we remain in the market we tend not to be entirely objective about conditions, nor does it seem as easy to see pending setups that either confirm we were probably right in the first place or waken us to the fact we were wrong. Doubt or confusion is probably the best & earliest indicator the market has become choppy relative to our trading method, and one losing trade is the signal to sit back and take stock. The bottom line is while we are in a trade our account balance reflects what price does, for better or worse, and while we are not we are immune to what price does--one ought not take profits & losses personally, but we do.

That said, one change I made some time ago that may be obvious to experienced traders and that tripled profits happened after I noticed 45 of 50 ATM trades were exiting at the trailing stop (I've come to hate the words NT speaks, "Stop filled"). Rather than simply advancing the stop after some price advance, or after price cleared an S/R level--essentially hoping for one long-bomb trade, conserving the entire position until the bitter end--instead I now both take some profits and advance the stop when price advance pauses for one reason or another. In particular this avoids the hateful situation I've come to call "boiling the frog"--when after we've been up (in my case) 35 or 50 pips price slowly and inexorably retraces to take out the original entry point. A "small loss" in this circumstance is inexcusable.
