I use random out-of-sample periods, e.g. when testing from 1997 to 2018, a random 30% of the quarters are held out as out of sample.
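A minimal sketch of such a split, for illustration only. The function name `split_quarters`, the fixed seed, and the rounding of the 30% figure are assumptions, not details of the actual setup described above:

```python
import random

def split_quarters(start_year, end_year, oos_fraction=0.30, seed=42):
    """Randomly hold out a fraction of calendar quarters as out-of-sample.

    Hypothetical helper: the seed and rounding rule are assumptions.
    """
    # Every (year, quarter) pair in the inclusive year range.
    quarters = [(y, q) for y in range(start_year, end_year + 1) for q in range(1, 5)]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    n_oos = round(len(quarters) * oos_fraction)
    oos = set(rng.sample(quarters, n_oos))
    in_sample = [q for q in quarters if q not in oos]
    return in_sample, sorted(oos)

in_sample, out_of_sample = split_quarters(1997, 2018)
print(len(in_sample), len(out_of_sample))
```

Fixing the seed matters here: if the held-out quarters change between runs, they gradually leak into the development process.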
Also, is it not wise to have as much test sample data as possible? This gives the system more data to train on and should improve predictive power.
Do you ever have a market inefficiency you want to exploit, and then test the basic entry principle behind this potential inefficiency? If it passes in sample across multiple correlated markets, do you then immediately test it out of sample? This way you don't waste weeks or months trying to build a system around a principle that had no edge to begin with. Is this okay to do, or should one finish all the entry criteria (which can take weeks) and then test the finished entry model out of sample?
Well, there's not necessarily a clear line between "basic entry principle" and the final entry criteria. You start with the fact that the market can be a buy or sell at any instant, and then apply filters layer by layer to isolate to +EV periods/events. Every such filter should raise expectancy and reduce the number of trades.
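As a rough sketch of that layering, the snippet below applies filters one at a time and reports the trade count and expectancy after each. The trade records, the feature names (`trend_up`, `high_vol`), and the helper `layer_report` are all made up for illustration; real numbers would come from your own backtest.

```python
from statistics import mean

# Made-up trades: P&L in R-multiples plus two hypothetical features.
trades = [
    {"pnl": 1.0, "trend_up": True, "high_vol": False},
    {"pnl": -1.0, "trend_up": False, "high_vol": False},
    {"pnl": 2.0, "trend_up": True, "high_vol": False},
    {"pnl": -1.0, "trend_up": True, "high_vol": True},
    {"pnl": -1.0, "trend_up": False, "high_vol": True},
    {"pnl": 1.5, "trend_up": True, "high_vol": False},
    {"pnl": -0.5, "trend_up": True, "high_vol": True},
    {"pnl": -1.0, "trend_up": False, "high_vol": False},
]

def expectancy(ts):
    """Average P&L per trade; 0.0 if no trades remain."""
    return mean(t["pnl"] for t in ts) if ts else 0.0

# Filters are applied cumulatively, in this order.
filters = [
    ("trend_up", lambda t: t["trend_up"]),
    ("not high_vol", lambda t: not t["high_vol"]),
]

def layer_report(trades, filters):
    """Return (layer name, trades remaining, expectancy) after each filter."""
    rows = [("baseline", len(trades), expectancy(trades))]
    remaining = trades
    for name, keep in filters:
        remaining = [t for t in remaining if keep(t)]
        rows.append((name, len(remaining), expectancy(remaining)))
    return rows

report = layer_report(trades, filters)
for name, n, exp in report:
    print(f"{name:>12}: {n} trades, expectancy {exp:+.2f}R")
```

A filter that fails to raise expectancy, or that does so only by cutting the trade count to a handful, is a candidate for removal.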
The point of out-of-sample testing is to avoid mining coincidental patterns in the data that happen to hold over the training and test periods. Each filter you apply increases the risk of this somewhat, and a variety of additional factors can increase it further:
- Use of "magic numbers", especially where small changes to the number result in large changes to system performance, number of signals, etc.
- Use of filters with no clear connection to the inefficiency/tendency being exploited, or a clear reason why they "should" work
- Excessive layers of filters
- Filters which reduce the number of trades excessively
- Identifying filters or filter parameters by testing large sets of them and discarding those which don't work
- Failing to test in a variety of market conditions, volatility levels, etc.
The more of these factors that apply to your system, the more important it is to validate on out-of-sample periods.
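One way to check the "magic numbers" point is to nudge a parameter and see how much the performance metric moves. The sketch below is only an illustration: `sensitivity` is a hypothetical helper, and the two performance curves (`smooth` and `spiky`) are invented stand-ins for a real backtest metric.

```python
def sensitivity(metric, value, step, n_steps=2):
    """Worst relative change in `metric` when `value` is nudged by up to
    n_steps * step in either direction. Hypothetical helper."""
    base = metric(value)
    if base == 0:
        raise ValueError("baseline metric is zero; relative change undefined")
    neighbours = [value + k * step for k in range(-n_steps, n_steps + 1) if k != 0]
    return max(abs(metric(v) - base) / abs(base) for v in neighbours)

# Invented performance curves as a function of an integer lookback parameter.
def smooth(p):
    # Degrades gently as p moves away from 20.
    return 1.0 - 0.001 * (p - 20) ** 2

def spiky(p):
    # Only works at exactly p == 20: a classic magic number.
    return 1.0 if p == 20 else 0.1

smooth_score = sensitivity(smooth, 20, 1)
spiky_score = sensitivity(spiky, 20, 1)
print(f"smooth: {smooth_score:.3f}  spiky: {spiky_score:.3f}")
```

A small score suggests the parameter sits on a plateau; a large one suggests the result depends on a single value and deserves extra out-of-sample scrutiny.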