You can buy historical data from a number of cheap sources. I think historical option data has a Black Friday sale.
It comes down to not going back and changing parameters once you have seen the results. An alternative for small data sets is block-based resampling, which generates more data from the underlying distribution of what you already have. You can also split the set sequentially, with the first n/2 points for training and the last n/2 for testing, and see how it runs. Or you can walk-forward test it to remove any look-ahead bias.
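Here is a rough sketch of those three ideas in plain Python. The function names (block_bootstrap, sequential_split, walk_forward_windows) and the window sizes are made up for illustration, and the block resampler is the simple "stitch random contiguous blocks together" variant, not any particular library's implementation.

```python
import random

def block_bootstrap(series, block_size, seed=None):
    """Build a synthetic series of the same length by stitching together
    randomly chosen contiguous blocks of the original (keeps short-range
    structure inside each block). Assumes block_size <= len(series)."""
    rng = random.Random(seed)
    n = len(series)
    out = []
    while len(out) < n:
        start = rng.randrange(0, n - block_size + 1)
        out.extend(series[start:start + block_size])
    return out[:n]

def sequential_split(series):
    """First half for training, second half for testing (no shuffling)."""
    half = len(series) // 2
    return series[:half], series[half:]

def walk_forward_windows(n, train_size, test_size):
    """Yield (train_indices, test_indices) pairs that roll forward in time.
    Each test window starts right after its training window ends."""
    start = 0
    while start + train_size + test_size <= n:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size

if __name__ == "__main__":
    returns = [0.01, -0.02, 0.005, 0.003, -0.01, 0.02, -0.004, 0.007, 0.0, -0.006]
    print(block_bootstrap(returns, block_size=3, seed=42))
    print(sequential_split(returns))
    for train, test in walk_forward_windows(len(returns), 4, 2):
        print(list(train), list(test))
```

The point of the walk-forward version is that parameters are only ever fitted on the training indices and then scored on the test indices that come after them, so nothing from the future leaks into the fit.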
Actually, is this overfitting thesis true? This came up while I was on a walk with the dog, so it is possible the dog was channeling her thoughts through me.
Say you are selling options 30 days out on SPY on either side of the current price. So you'll sell puts and calls that are OTM by more than (say) 1/2 ATR.
There are only a few possibilities:
1. Price doesn't move at all (can't happen on something like SPY, but OK at least you still get the premium)
2. Price moves up and keeps going
3. Price moves down and keeps going
4. Price moves up/down, reverts
In any of these cases except the first, you will have a drawdown on one side and a profit on the other side. If you are using 1/2 ATR as your base, then it is entirely possible that you will get a 100% movement (i.e., another 1/2 ATR) on one side.
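To put rough numbers on the cases above, here is a small sketch of the per-leg profit and loss at expiration for a short put and a short call. Everything numeric is an assumption for illustration: a spot price of 500, a 1/2 ATR of 5, strikes placed 1/2 ATR below and above spot, and a premium of 2 per leg. It only looks at expiration, so it does not show the mark-to-market drawdown you would see along the way.

```python
def short_strangle_pnl(final_price, put_strike, call_strike,
                       put_premium, call_premium):
    """Per-share P&L at expiration for a short put plus a short call.
    Each leg keeps its premium and pays out its intrinsic value."""
    put_leg = put_premium - max(put_strike - final_price, 0.0)
    call_leg = call_premium - max(final_price - call_strike, 0.0)
    return put_leg, call_leg, put_leg + call_leg

# Hypothetical numbers, not market data.
spot = 500.0
half_atr = 5.0
put_strike = spot - half_atr    # 495
call_strike = spot + half_atr   # 505
premium = 2.0                   # assumed premium collected per leg

scenarios = {
    "no move": spot,
    "up and keeps going": spot + 3 * half_atr,
    "down and keeps going": spot - 3 * half_atr,
    "moves, then reverts": spot + 1.0,
}

if __name__ == "__main__":
    for name, final_price in scenarios.items():
        put_leg, call_leg, total = short_strangle_pnl(
            final_price, put_strike, call_strike, premium, premium)
        print(f"{name}: put leg {put_leg:+.2f}, "
              f"call leg {call_leg:+.2f}, total {total:+.2f}")
```

With these made-up numbers, the trending cases lose on one leg and keep the premium on the other, while the flat and reverting cases keep both premiums.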
So logically, it makes sense.
I reserve the right to 1) blame the dog, 2) blame my lack of sleep.