I originally posted this in response to some questions asked in Kevin's thread but felt it wasn't really appropriate there. So, I've moved it here, where we can discuss some ideas I have regarding WFA and my perspective on answers to some questions. The terms I use here are mathematical; however, my sense of them is more intuitive, as my mathematical background is limited (many stats courses but not much higher math).
If you see changes in WFA results, that is probably due to what I will call higher-dimensional hyperparameters. If the WFA has any sort of cyclical structure, then I suspect it can be viewed as a static form in a higher-dimensional mathematical space; the WFA itself becomes a type of hidden third parameter. Whether this helps or hurts your profits depends on whether those hyperparameters are stable. To see how this could work, imagine a simple system with only a single parameter that cycles between 0 and 1 with a perfectly regular period. Viewed in the higher-dimensional hyperspace, it is a static system requiring no optimization, but you can also see it in the lower-dimensional space using WFA. Detecting these cyclical components could allow you to capture returns in a way that is not possible in the lower-dimensional space. However, you can also see the WFA itself as yet another parameter that chops up the equity curve. Whether WFA or static analysis (longer backtest histories) works better is a function of the stability and variability of the parameters and, critically, of the selection criteria.
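To make the single-parameter example concrete, here is a toy sketch in Python (all names, numbers, and the payoff function are invented for illustration, not a real trading model): the "true" optimal parameter oscillates between 0 and 1 on a fixed cycle, a static optimizer picks one compromise value over the whole history, while a walk-forward pass re-optimizes each window and can track the cycle.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all numbers invented): the "true" optimal parameter
# oscillates between 0 and 1 with a fixed period.
n_periods = 240
period_len = 48
true_opt = (np.sin(2 * np.pi * np.arange(n_periods) / period_len) + 1) / 2

def realized_return(param, t):
    """Payoff for holding `param` in period t: falls off with distance
    from the cycling optimum, plus noise."""
    return 1.0 - abs(param - true_opt[t]) + rng.normal(0, 0.1)

grid = np.linspace(0, 1, 11)

# Static optimization: one parameter for the entire history.
static_param = max(grid, key=lambda p: np.mean(
    [realized_return(p, t) for t in range(n_periods)]))

# Walk-forward: re-optimize on each in-sample window; the chosen
# parameter drifts as it chases the cycling optimum.
window = 12
wf_params = [
    max(grid, key=lambda p: np.mean(
        [realized_return(p, t) for t in range(start, start + window)]))
    for start in range(0, n_periods - window, window)
]

print("static choice:", static_param)
print("walk-forward choices:", [round(p, 1) for p in wf_params])
```

In this toy world the walk-forward choices cycle along with the hidden optimum, which is the sense in which the WFA is "seeing" a static structure that lives in a higher-dimensional space.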
Going back to selection criteria: the typical WFA selects strategies based on profit, not statistical relevance. This is not how most strategy developers who backtest over long histories perform such analysis. WFA has a critical flaw in not first establishing relevance. A data scientist would look for statistical relevance before making any determination; WFA, as typically implemented, merely selects the highest-performing strategy and trades it in the next period. In general, the goal of strategy optimization is to find consistent, stable parameters that generate profit, and these stable regimes could easily be washed out by variance in parameter returns caused by unpredictable, non-persistent fluctuations.
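As a sketch of the difference, here is a hypothetical Python example (the candidate names, return distributions, and the t > 2 cutoff are all my own assumptions): one candidate's total profit is driven by a single outlier trade, the other has a small but consistent edge. Profit-only selection picks the outlier; a relevance-first filter using a simple one-sample t-statistic discards it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical in-sample per-period returns for two candidate
# parameter sets (all numbers invented):
#   "lucky"  -- no real edge, but one large outlier trade
#   "steady" -- a small, consistent edge
candidates = {
    "lucky":  rng.normal(0.000, 0.020, 60),
    "steady": rng.normal(0.004, 0.005, 60),
}
candidates["lucky"][0] += 1.0   # one outlier inflates total profit

def t_stat(r):
    """One-sample t-statistic of the mean return against zero."""
    return r.mean() / (r.std(ddof=1) / np.sqrt(len(r)))

# Naive WFA-style selection: highest in-sample profit wins.
by_profit = max(candidates, key=lambda k: candidates[k].sum())

# Relevance-first selection: keep only candidates whose mean return is
# statistically distinguishable from zero (t > 2 as a rough cutoff),
# then pick the most profitable survivor.
relevant = {k: r for k, r in candidates.items() if t_stat(r) > 2}
by_relevance = max(relevant, key=lambda k: relevant[k].sum())

print("by profit:   ", by_profit)
print("by relevance:", by_relevance)
```

The outlier-driven candidate wins on raw profit but fails the relevance filter, which is exactly the kind of selection error the paragraph above is describing.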
Of note: if you find that strategies that perform better in incubation then underperform, this could be due to the common tendency for strategies/methods/traders that outperform in one period to underperform in the next. You could attempt to trade only strategies that passed verification, but only buy into them when the equity curve was in a downswing that was still within its normal range. You'd probably see a skew where most strategies that quit working simply fail, but a few go on to outperform.
Yes, if you can make your WFA window shorter and still get strong performance, then under efficient-markets reasoning, where past price loses its informational value quickly, you should have a stronger system. This informational loss must be counterbalanced, however, against the fact that most statistical methods require a certain number of cases to be considered relevant. As you move to the "right hand side" of the data, you become more speculative. If you combine that with cognitive and complexity components, you move toward discretionary-like trading; failing to add those cognition and complexity components is more likely to produce fragile systems with high variance.
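The counterbalance between shortening the window and needing enough cases can be quantified roughly: for a fixed per-period volatility, the standard error of the mean return shrinks as 1/sqrt(n), so the minimum edge a t-test can detect grows as the window shrinks. A minimal sketch (the volatility and t threshold are arbitrary assumptions):

```python
import math

sigma = 0.01   # assumed per-period return volatility
t_req = 2.0    # rough significance threshold

# Minimum detectable per-period edge: mean / (sigma / sqrt(n)) >= t_req
# implies mean >= t_req * sigma / sqrt(n).
min_edge = {n: t_req * sigma / math.sqrt(n) for n in (250, 60, 20, 5)}

for n, edge in min_edge.items():
    print(f"window={n:4d} periods -> minimum detectable edge = {edge:.4f}")
```

With these assumed numbers, a 5-period window needs an edge roughly seven times larger than a 250-period window before the same test would call it relevant, which is the cost of "moving to the right hand side".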
Notice that any tendency to under- or outperform can be framed as a function of informational entropy. As market participants discover what works, that discovery inevitably causes those things to quit working. If you could determine the informational entropy of a strategy, i.e. the inflection points where it becomes more likely to quit working, then you could theoretically trade it and then fade it. This informational component might line up on either a periodic or anti-periodic basis with the WFA. However, some strategies probably quit working for reasons …