Sorry, I didn't realize you were talking about backtesting. I kinda skimmed over your post.
I think that to judge whether a system's results are statistically significant, the designer should backtest with a fixed number of contracts, so the test measures the entry and exit rules rather than the sizing. After that, position sizing methods can be modelled...
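
For what it's worth, here is a rough sketch of what I mean by testing at a fixed size. It is only an illustration, not a full significance test: it assumes you already have a per-trade result in points from the backtest, and the names `fixed_size_pnls`, `t_statistic`, the `point_value` of 50.0 and the sample numbers are all made up for the example. It computes a plain one-sample t-statistic of mean trade P&L against zero.

```python
import math
import statistics


def fixed_size_pnls(points_per_trade, contracts=1, point_value=50.0):
    """Convert per-trade results in points to currency P&L at a fixed size.

    contracts and point_value are hypothetical illustration values.
    """
    return [p * contracts * point_value for p in points_per_trade]


def t_statistic(trade_pnls):
    """One-sample t-statistic for the mean per-trade P&L against zero."""
    n = len(trade_pnls)
    if n < 2:
        raise ValueError("need at least two trades")
    mean = statistics.mean(trade_pnls)
    sd = statistics.stdev(trade_pnls)  # sample standard deviation
    if sd == 0:
        raise ValueError("trade results have zero variance")
    return mean / (sd / math.sqrt(n))


# Made-up per-trade results in points, purely for illustration.
points = [2.0, -1.0, 3.0, -0.5, 1.5, -2.0, 4.0, 0.5]

t_one = t_statistic(fixed_size_pnls(points, contracts=1))
t_ten = t_statistic(fixed_size_pnls(points, contracts=10))
print(t_one, t_ten)
```

With a fixed size the t-statistic comes out the same for 1 contract or 10, because the contract count scales the mean and the standard deviation by the same factor. Once the size varies from trade to trade, that no longer holds, which is why I would leave the sizing model for a later step.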