I would like to start a discussion about machine learning and feature selection, to help those of us who are ML beginners get a better understanding of which features to use and how to generalize our data so that it works well with ML classifiers and in environments like TensorFlow.
As part of my learning process I've taken an older automated system that I developed, which was somewhat successful before the markets changed, and modified it to extract the data I deemed important. Generally speaking, that is 15 bars of information, the same things I would look at if I were trading it manually: nearby swing points, offset to the MAs and so on.
When extracting the data I decided to use an offset from the entry point to normalize/generalize my data, since that would keep all my values within +/-1. The idea seemed wonderful, but the various models trained very poorly, so I changed my reference point and redid all the data extracts and tests with different classifiers. I repeated this many times with the same result: nothing better than random chance.
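To make the idea concrete, here is a minimal sketch of what "offset from entry point" could look like. The function name and the prices are made up for illustration (only three bars are shown instead of 15), and this is not the actual extraction code from my system.

```python
def offsets_from_entry(window, entry_price):
    """Express each price in the window as its distance from the entry price."""
    return [price - entry_price for price in window]


# Made-up closes for illustration; the real extract uses 15 bars.
window = [1.0905, 1.0912, 1.0920]
entry = 1.0920

features = offsets_from_entry(window, entry)
for price, feature in zip(window, features):
    print(f"{price:.4f} -> {feature:+.4f}")
```

Note that on an FX pair these offsets are well inside +/-1 but also very small in absolute terms, so some models may still want them rescaled. That is an assumption on my part, not something I have verified on this data.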
In an effort to understand the nature of my data I passed it through some feature extraction methods and discovered that, other than a few columns, most of what I was using was just noise.
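As a rough illustration of that kind of check, the sketch below ranks columns by the absolute Pearson correlation between each column and the label. This is just one simple filter method that I am using as a stand-in, not necessarily the method used above, and the column names and numbers are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(vx * vy)


def rank_features(columns, labels):
    """Return (name, |r|) pairs sorted from most to least correlated with the label."""
    scores = {name: abs(pearson(col, labels)) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented toy data: 1 = winning trade, 0 = losing trade.
labels = [1, 0, 1, 0, 1, 0]
columns = {
    "tracks_label": [0.9, 0.1, 0.8, 0.2, 0.7, 0.3],
    "constant": [0.5] * 6,
    "unrelated": [0.1, 0.1, 0.3, 0.3, 0.2, 0.2],
}

ranking = rank_features(columns, labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Columns that score near zero here are candidates for "noise", with the caveat that Pearson correlation only picks up linear relationships, so a low score does not prove a column is useless.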
Now here is the question for some of our more experienced ML users: what methods have you successfully used to generalize data so that it is not based on the raw price (something like EUR/USD closing at 1.09200) but instead on, for example, a 0.00125 offset from something like a 6 period SMA? Just an example LOL.
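To show what I mean by an offset from an SMA, here is a small sketch. The helper names (`sma`, `sma_offset`) and the closing prices are hypothetical, and the 6 period length is just the example from above, not a recommendation.

```python
def sma(values, period):
    """Simple moving average; None until `period` values are available."""
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= period:
            running -= values[i - period]  # drop the bar that left the window
        out.append(running / period if i >= period - 1 else None)
    return out


def sma_offset(closes, period=6):
    """Close minus its SMA, in price units; None during the warm-up bars."""
    averages = sma(closes, period)
    return [None if avg is None else close - avg
            for close, avg in zip(closes, averages)]


# Made-up EUR/USD style closes for illustration.
closes = [1.0910, 1.0915, 1.0920, 1.0918, 1.0925, 1.0930, 1.0920]
offsets = sma_offset(closes, period=6)

for close, offset in zip(closes, offsets):
    shown = "warm-up" if offset is None else f"{offset:+.5f}"
    print(f"{close:.4f} -> {shown}")
```

The feature fed to the model would then be the offset rather than the raw close, so the same value means the same thing whether price is at 1.09 or 1.20.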