Combining discretionary trading, risk management and ML as an art

October 9th, 2017, 05:51 PM

I am learning the Weka program via the MOOC (free online courses).
The program comes from the University of Waikato in New Zealand.

There are 3 courses:
Data Mining with Weka
More Data Mining with Weka
Advanced Data Mining with Weka.

The purpose of this thread is to gather thoughts on doing data mining for trading which seems to be a sparsely covered topic.

Department of Computer Science
University of Waikato, New Zealand
https://www.cs.waikato.ac.nz/

Class1.2

Slides (PDF):
https://goo.gl/IGzlrn

https://twitter.com/WekaMOOC
Data Mining with Weka

Table of Contents
1. summary of findings in using datamining for predictive trading

October 9th, 2017, 06:13 PM

In the second course
More Data Mining with Weka (4.5: Counting the cost)

There is an example of a credit file with 1000 instances (this would represent 1000 people applying for loans).

The cost of rejecting a good credit risk (a misclassification) is considered to be 1/5 of the cost of accepting a bad credit risk (so a missed opportunity costs 1/5 as much as a bad decision).

Using one ML algorithm (J48) without consideration of cost, you could correctly classify about 70%.
If missing a good application costs you $1,000
and accepting a bad application costs you $5,000,
you would have total costs of $1,027,000 on 1000 applications.

(I multiplied the $1 and $5 by $1000)
--------------
If you apply a cost matrix to the problem, the ML algorithm minimizes total cost instead, sacrificing some overall accuracy in exchange for fewer bad applications accepted.
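To make the arithmetic above concrete, here is a minimal sketch of how a total cost falls out of a confusion matrix and a cost matrix. The costs (1 and 5, scaled by $1,000) come from the lesson. The four confusion counts are an assumption: the post does not state them, and they were picked only because they reproduce the $1,027,000 total and roughly 70% accuracy.

```python
# Cost matrix from the lesson, keyed by (actual class, predicted class).
COST = {
    ("good", "good"): 0,
    ("good", "bad"): 1,   # good applicant rejected: a missed opportunity
    ("bad", "good"): 5,   # bad applicant accepted: a bad decision
    ("bad", "bad"): 0,
}

# ASSUMED counts (not given in the post), chosen to match the quoted total.
CONFUSION = {
    ("good", "good"): 588,
    ("good", "bad"): 112,
    ("bad", "good"): 183,
    ("bad", "bad"): 117,
}


def total_cost(confusion, cost, unit=1000):
    """Sum count * cost over every (actual, predicted) cell, scaled to dollars."""
    return unit * sum(count * cost[cell] for cell, count in confusion.items())


def accuracy(confusion):
    """Fraction of instances where the predicted class equals the actual class."""
    correct = sum(n for (actual, predicted), n in confusion.items() if actual == predicted)
    return correct / sum(confusion.values())


if __name__ == "__main__":
    print("total cost: $", total_cost(CONFUSION, COST))
    print("accuracy:", accuracy(CONFUSION))
```

The point of the sketch is that accuracy treats both kinds of error the same, while the cost total weights the 183 accepted bad applications five times as heavily as the 112 rejected good ones.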

October 9th, 2017, 06:37 PM

How the above post might connect to trading:

For traders with limited capital, the cost of taking a setup that turns into a costly loss can be higher than the cost of not taking a setup that would have given a small profit.

More experienced traders realize you cannot just set a stop wherever you like, e.g. a 1:5 W:L size ratio. Where a stop can be placed is a function of the instrument, the place in the price movement, and many other factors.

If stops are too tight for these factors, you keep getting stopped out quickly, dying a death of a thousand cuts.

However, different setups in your arsenal can have different:

expectations of working out
amount that they produce

So I see a potential analogy to the above post: the cost of losing is much higher than the cost of not winning (missing out on an opportunity).

Many losses of $500 in a row (or a high ratio of losses to wins) can quickly knock out a small-capital trader.
It would be like a small new startup bank with limited capital: they just can't take too many losses at the start, until they build up their capital. They are better off passing on doubtful setups.
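A rough sketch of why a string of $500 losses hurts a small account so much more than a large one. The $500 loss size is from the post. The account sizes, the 50% win rate, and the assumption that trades are independent are all hypothetical and only there to illustrate the calculation.

```python
import math


def losses_to_ruin(capital, loss_per_trade):
    """Number of consecutive losing trades that take capital to zero or below."""
    return math.ceil(capital / loss_per_trade)


def prob_losing_streak(win_rate, streak_length):
    """Chance that the next `streak_length` trades all lose.

    Assumes each trade is independent with the same win rate,
    which real trading results may well violate.
    """
    return (1 - win_rate) ** streak_length


if __name__ == "__main__":
    # Hypothetical account sizes; only the $500 loss comes from the post.
    for capital in (5_000, 50_000):
        n = losses_to_ruin(capital, 500)
        print(capital, "survives", n, "straight losses; streak probability",
              prob_losing_streak(0.5, n))
```

Under these assumptions the small account is wiped out by 10 straight losses and the large one by 100, which is the startup-bank situation in numbers.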

It is also psychologically expensive, eroding your confidence.

October 9th, 2017, 06:40 PM

Why not add to one of the existing big discussion threads in the Elite Automated Trading section?

Mike

October 9th, 2017, 06:50 PM

So this is why I made this thread. I don't see using ML for trading as a simple matter of throwing numbers in the top and turning the crank. One needs to understand the complexity of trading to intelligently apply ML to it.

That is the reason I am working slowly through the course. I am trying to pause and think of examples of its application and mis-application. I see a danger in just rushing into applying ML without knowing the dangers and limitations.

I have already learned many valuable points:

Using ZeroR to set a baseline
Selecting a subset of attributes rather than the entire db
not using the training set for testing
how NaiveBayes can give very good results even when the assumption of Independence is clearly violated
etc
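As an illustration of the first and third points in the list above, here is a minimal sketch of a ZeroR-style baseline: always predict the majority class seen in training, and measure it on held-out data. The 700/300 class mix mirrors the credit example discussed earlier in the thread; the 800/200 train/test split and the function names are assumptions made for the example, and this is plain Python, not Weka's own ZeroR.

```python
from collections import Counter


def zero_r(train_labels):
    """Return the most frequent class in the training labels."""
    return Counter(train_labels).most_common(1)[0][0]


def baseline_accuracy(predicted_label, test_labels):
    """Accuracy of predicting the same label for every test instance."""
    hits = sum(1 for label in test_labels if label == predicted_label)
    return hits / len(test_labels)


# Hypothetical split that keeps the 70/30 class mix in both parts.
train = ["good"] * 560 + ["bad"] * 240
test = ["good"] * 140 + ["bad"] * 60

if __name__ == "__main__":
    majority = zero_r(train)
    print("ZeroR predicts:", majority)
    print("held-out accuracy:", baseline_accuracy(majority, test))
```

Any real classifier then has to beat this number on the held-out data before its accuracy means much.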

I am starting to see the complexity of the many ML techniques.
When that complexity is applied to the complexity of the many, many factors that go into trading,
we get complexity to the second power.

Though that brings challenges, it also brings potential.

Finding the right ML techniques to complement your trading strategies will probably prove to be an art.

October 9th, 2017, 07:01 PM

@Big Mike

Thanks Mike.

The reason is in the post after yours. (you are too quick for me!). :-)

I want to stay apart from the existing threads on automation, which I see as being much more linear. I would like more depth to the discussion, and not only about automated trading.

e.g,

data mining, indicators vs historical data

I have been considering launching a project to back test 20-40 or so indicators/price action vs historical forex data vs time frames to try and get an idea of which combinations are best.
I have had some mt4 EAs running on a vps for a while now so have …

"Trading System Lab is the most widely respected trading system data mining tool and is said to cost $60k/year. "

I think there is room for a thread on the intricacies of intelligently applying the factors of ML, discretionary trading, trading psychology and risk management.

However, I am perfectly OK if you move this to a non-elite trading journal if you feel this is the wrong place.

October 10th, 2017, 06:06 PM

When I looked at the credit model example I could see a parallel to trading.

In the credit model, 1000 loan applications are the training set (the training set would be like your database (db) of past trades for your strategy).
So this is 1000 historical setups, 1000 instances.
In the credit db 700 are good loans and 300 are bad, so that is like your db having 700 setups that were "good" and 300 that were "bad".

For each loan you had characteristics (credit rating, own or rent, reason for the loan, etc.); these are called attributes.
For your setup you had attributes (above/below 20 MA, momentum positive or negative, etc.).

Now each loan had a result (called the class attribute): repaid = good, not repaid = bad,
and each of your trades has a result: win/loss.
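The mapping above (instances, attributes, class attribute) can be sketched as a small data structure. The two attributes are the ones named in the post; the field names, the four sample rows, and the helper function are made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Setup:
    """One historical trade setup, i.e. one instance in the db."""
    above_ma20: bool         # attribute: price above the 20 MA?
    momentum_positive: bool  # attribute: momentum positive?
    outcome: str             # class attribute: "win" or "loss"


# Hypothetical rows; a real db would hold your own past trades.
setups = [
    Setup(True, True, "win"),
    Setup(True, False, "loss"),
    Setup(False, True, "win"),
    Setup(False, False, "loss"),
]


def class_distribution(instances):
    """Count how many instances fall in each class."""
    return Counter(s.outcome for s in instances)


if __name__ == "__main__":
    print(class_distribution(setups))
```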

Each trade has a potential cost (as well as a potential payoff). So if you lose 5 pts on "Bigswings" that don't work out and lose 1 pt on "scalpers" that don't work out, then these are the costs.

So in the lesson they are teaching that if you know your costs for each outcome and they are different (5 vs 1), then rather than going for the highest accuracy in your ML algorithm you would seek to minimize the total cost. On best accuracy alone you might get 70% predictive accuracy, but with lowest total cost you might have 60.9% accuracy.
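One way to picture "minimize total cost rather than maximize accuracy" is as a per-trade decision rule: pick whichever action has the lower expected cost. This is a sketch under assumptions, not the course's algorithm. It borrows the 5 vs 1 cost ratio from the credit example (5 for taking a setup that loses, 1 for skipping a setup that would have won), and it assumes some model supplies a win probability `p_win` for each setup; the function names are hypothetical.

```python
def expected_costs(p_win, cost_bad_trade=5.0, cost_missed_trade=1.0):
    """Return (expected cost of taking, expected cost of skipping)."""
    take = (1 - p_win) * cost_bad_trade  # pay only if the trade loses
    skip = p_win * cost_missed_trade     # pay only if it would have won
    return take, skip


def decide(p_win, cost_bad_trade=5.0, cost_missed_trade=1.0):
    """Choose the action with the lower expected cost."""
    take, skip = expected_costs(p_win, cost_bad_trade, cost_missed_trade)
    return "take" if take < skip else "skip"


def break_even_p_win(cost_bad_trade=5.0, cost_missed_trade=1.0):
    """Win probability at which taking and skipping cost the same."""
    return cost_bad_trade / (cost_bad_trade + cost_missed_trade)


if __name__ == "__main__":
    for p in (0.7, 0.9):
        print(p, decide(p))
    print("break-even p_win:", break_even_p_win())
```

With equal costs the break-even is 50%, so a 70% setup is worth taking. With the 5:1 costs the break-even rises to 5/6, so the same 70% setup is skipped. That is why the cost-sensitive model ends up with lower accuracy but lower total cost.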

October 11th, 2017, 12:01 PM

Last night I did the lesson on neural networks. The instructor is not very impressed with them. He did his Master's thesis on them (at U of Calgary, which seems to have a strong ML comp sci dept) and made an improved algo. What he is impressed with is the name: he thinks it's brilliant! Neural nets themselves? Not so much.

In any case, they were called perceptrons in 1957 when they first started, then fell out of favour after a paper showed their limitations, and then came back into vogue when the limitation (a linear decision boundary) was worked around with the "kernel" trick. {They are akin to support vector machines, which also use boundaries for classification.} They can be multilayer (hidden layers in addition to the input and output layers). Additional layers greatly increase the number of permutations and therefore computations, though they may not add much to predictive accuracy in data mining. They are akin to linear regression analysis and go through repetitive learning cycles (epochs) to adjust the weight vectors based on the error (using gradient descent, with a later increase in error used to terminate the search).
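The idea of cycling through the data and nudging weights after each error can be illustrated with the classic single-layer perceptron rule. This is a minimal sketch only: it is not Weka's VotedPerceptron and not a multilayer network trained by gradient descent. The AND-gate data, the function names, and the learning rate are assumptions chosen because the problem is linearly separable, so the loop is guaranteed to stop.

```python
def train_perceptron(samples, labels, max_epochs=100, learning_rate=1.0):
    """Train a single-layer perceptron; labels must be +1 or -1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # misclassified (or on the boundary)
                # Move the boundary toward classifying this sample correctly.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
                errors += 1
        if errors == 0:  # a full epoch with no mistakes: stop early
            break
    return weights, bias


def predict(weights, bias, x):
    """Classify one sample as +1 or -1 using the learned linear boundary."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


# Logical AND, a linearly separable toy problem (assumed for illustration).
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [-1, -1, -1, 1]

if __name__ == "__main__":
    w, b = train_perceptron(X, Y)
    print([predict(w, b, x) for x in X])
```

A single perceptron like this can only draw a straight-line boundary, which is the limitation the paragraph above refers to.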

So what's all this mean?

October 11th, 2017, 12:20 PM

Well not much - at least for datamining.

I take notes as I am doing the lessons (these lessons are YouTube videos).

2 neural net options:
1. The Voted Perceptron (under the functions classifiers), which got 86% on the ionosphere dataset
2. SMO is another choice, and it got 88.6%

BY COMPARISON two other algos that are not NN:
1. Logistic = 88.9%
2. SimpleLogistic = 88.3%

So on that db the 2 NN algos didn't perform better than the Logistic algo.

------------------- my takeaway ------

I have noticed in trading that many people are attracted to complicated-sounding things. So if their indicator was called the "heuristic adaptive quantum stochastic" indicator, boy, it must be good!

But a brilliant name that conjures up how sophisticated one is seems to pander more to one's ego than to results. Certainly KISS shouldn't be over-applied to ST day trading, for there is complexity in the markets. However, from watching others via journals etc., one of the biggest dangers to a trader is too much ego. (BTW, I'm not calling confidence ego.)

Especially for new traders, jumping to complexities before getting a solid handle on the basics (S/R, double tops, etc.) can be a recipe for account blow-up.
