This is my first post. I have been experimenting with using synthetic data to train and optimize my trading strategies. My hope is this will significantly reduce over-fitting since the optimization is no longer done on historical data alone. Here is an outline of my process.
1. Load in-sample data.
2. Generate synthetic data from the in-sample data.
3. Optimize the trading strategy on the synthetic data.
4. Run the strategy with the optimal parameters on real out-of-sample data.
For each symbol, I generate 100 different sample paths (data sets) from the in-sample data. The results so far have been very promising. My walk-forward efficiency has improved dramatically, in many cases more than doubling compared to traditional walk-forward optimization (WFO). My in-sample performance is much more realistic, and my out-of-sample returns are better because the parameters are more stable.
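To make the generation step concrete, here is a minimal sketch in Python of one common way to build sample paths from in-sample closes: a circular block bootstrap of log returns. This is an assumption for illustration only and not necessarily the method behind the results above. The function name `bootstrap_paths`, the `block_len` and `seed` parameters, and the random placeholder price series are all hypothetical.

```python
import numpy as np

def bootstrap_paths(closes, n_paths=100, block_len=10, seed=0):
    """Build synthetic close-price paths by resampling blocks of log returns.

    Illustrative circular block bootstrap; each path starts at the first
    in-sample close and has the same length as the in-sample series.
    """
    closes = np.asarray(closes, dtype=float)
    log_ret = np.diff(np.log(closes))        # in-sample log returns
    n = log_ret.size
    rng = np.random.default_rng(seed)
    n_blocks = -(-n // block_len)            # ceiling division
    paths = np.empty((n_paths, closes.size))
    for i in range(n_paths):
        # Random block start points; indices wrap around the end of the sample.
        starts = rng.integers(0, n, size=n_blocks)
        idx = (starts[:, None] + np.arange(block_len)[None, :]) % n
        sampled = log_ret[idx.ravel()[:n]]   # trim to the original length
        paths[i, 0] = closes[0]
        paths[i, 1:] = closes[0] * np.exp(np.cumsum(sampled))
    return paths

# Placeholder in-sample series (random numbers, not market data).
rng = np.random.default_rng(1)
in_sample = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500)))
paths = bootstrap_paths(in_sample, n_paths=100)
print(paths.shape)  # (100, 500)
```

Resampling in blocks rather than single bars keeps some of the short-term dependence in the returns. The sketch only handles closes; full OHLC bars and volume would need extra handling.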
The biggest drawback is that the entire process is really tedious. I have to generate synthetic data separately for each in-sample data block to avoid any look-ahead bias. There is also constant switching between Matlab and Multicharts.
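The per-block requirement can be sketched as a loop over rolling walk-forward windows, where the synthetic generator only ever sees the in-sample slice of each window. This is a rough Python sketch under assumed window sizes; `walk_forward_blocks`, `in_len`, and `out_len` are hypothetical names, and the rolling (non-anchored) scheme is an assumption.

```python
def walk_forward_blocks(n_bars, in_len, out_len):
    """Yield (in_sample, out_of_sample) index slices for rolling walk-forward."""
    start = 0
    while start + in_len + out_len <= n_bars:
        in_slice = slice(start, start + in_len)
        out_slice = slice(start + in_len, start + in_len + out_len)
        yield in_slice, out_slice
        start += out_len  # roll the window forward by one out-of-sample block

# Example: 1000 bars, 500-bar in-sample window, 100-bar out-of-sample window.
blocks = list(walk_forward_blocks(1000, 500, 100))
for in_slice, out_slice in blocks:
    # Synthetic data for this step would be generated from bars[in_slice] only,
    # so nothing from bars[out_slice] can leak into the optimization.
    print(in_slice, out_slice)
```

Because each out-of-sample slice starts exactly where its in-sample slice ends, regenerating the synthetic data inside this loop is what keeps future bars out of the optimization.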
I load my in-sample data into Matlab and use it to create my synthetic data. Matlab then exports the newly created data to text files, which I map back into Multicharts through ASCII mapping. I use the portfolio backtester to find the optimal trading parameters, and then I run those parameters on my out-of-sample test.
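The export step is the easiest part to script. Below is a minimal Python sketch that writes one comma-separated text file per synthetic path. The file naming, the `Date,Close` column layout, and the date format are assumptions; they would need to match whatever the ASCII mapping is configured to read, and real bar data would likely need open, high, low, and volume columns as well. `export_paths` is a hypothetical helper, and the prices in the example are placeholders.

```python
import csv
import tempfile
from pathlib import Path

def export_paths(symbol, dates, paths, out_dir):
    """Write one text file per synthetic path, named <symbol>_<number>.txt."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for k, path in enumerate(paths, start=1):
        file_path = out_dir / f"{symbol}_{k:03d}.txt"
        with file_path.open("w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["Date", "Close"])        # assumed header layout
            for d, c in zip(dates, path):
                writer.writerow([d, f"{c:.4f}"])
        written.append(file_path)
    return written

# Example with two tiny placeholder paths written to a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    files = export_paths("ABC", ["2020-01-02", "2020-01-03"],
                         [[100.0, 101.5], [100.0, 99.25]], tmp)
    names = [f.name for f in files]
    first_lines = files[0].read_text().splitlines()
print(names)
```

With the files written to a fixed folder under predictable names, the mapping only has to be set up once per symbol instead of once per run.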
First, I was wondering if there is anything I can do to make this process less painful. Would I be able to automate it if I switched to NinjaTrader? Second, I want to improve the way my synthetic data is created, since I believe realistic synthetic data is essential. If anyone wants, I can post the Matlab code that I'm currently using to generate the data.