You're right that B is a problem, but it isn't necessarily always a problem. Reading through this thread, it's apparent that there are two groups talking past each other a bit. Group 1 is rightly cautious of data mining, which is what you're describing. As a result, they feel that using testing to find an edge is always bad. It generally is bad under the standard model most of us use, which you describe well: the tweak/test/fail, tweak/test/fail... model. I think you are confusing "hardly requires any testing" with "don't test at all".
IMO the process should be:
1. you have a model, and everything about it says it should work.
2. you test it on data
a. it works and gives reasonable results in line with expectations before the test
b. it fails, you scrap the model and start over
B is the problem. You can't do tweak/test/fail, tweak/test/fail, tweak/test/fail, tweak/test/success!
The hysterical thing to me is that this is exactly what I thought the point of backtesting was when I started.
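To make the tweak/test/fail problem concrete, here's a rough sketch in Python. It is a simulation under loud assumptions, not anyone's actual workflow: every "tweak" is modeled as a fresh, independent strategy with zero true edge (real tweaks on the same data are correlated, so the numbers will differ), and "success" just means a one-sided t-stat above 1.645, i.e. roughly a 5% false-positive rate per test. The function names and parameters are made up for illustration.

```python
import random
import statistics


def t_stat(returns):
    """t-statistic of the mean daily return against zero."""
    n = len(returns)
    return statistics.fmean(returns) / (statistics.stdev(returns) / n ** 0.5)


def tweaks_until_pass(rng, max_tweaks=50, n_days=250, threshold=1.645):
    """Number of tweaks before a no-edge strategy 'passes', or None if it never does.

    Assumption: each tweak is an independent draw of zero-mean daily returns.
    """
    for attempt in range(1, max_tweaks + 1):
        returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        if t_stat(returns) > threshold:
            return attempt  # tweak/test/success, with no real edge behind it
    return None


def share_of_researchers_fooled(n_researchers=200, seed=0, **kwargs):
    """Fraction of simulated researchers who hit a 'success' within their tweak budget."""
    rng = random.Random(seed)
    results = [tweaks_until_pass(rng, **kwargs) for _ in range(n_researchers)]
    return sum(r is not None for r in results) / n_researchers


if __name__ == "__main__":
    print("one test, no tweaking:", share_of_researchers_fooled(max_tweaks=1))
    print("up to 50 tweaks:      ", share_of_researchers_fooled(max_tweaks=50))
```

Under the independence assumption, the chance of at least one false "success" in 50 tries is 1 - 0.95^50, or about 92%, versus about 5% for a single test. That gap is the whole problem with B.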
There is, however, another group out there, many of them professionals, who use an entirely different model, somewhat like a Monte Carlo simulation: they simultaneously test millions of random permutations of a strategy on a data set. They then use techniques such as out-of-sample tests and testing across different market conditions to winnow that down to a few hundred strategies. From there they either examine the survivors to determine whether there's a thesis to support them or whether they're just data-mining results, or they just throw money at all of them, if they've done some statistical analysis to show that it's OK to have some spurious data-mined results in there. Nothing wrong with this technique; a few of the successful big hedge funds use it. It's just probably not something most of us would do. So when group 2 on this thread talks about backtesting as a viable way to find a strategy, they're probably talking about something like this. And it's perfectly consistent to say group 2's approach is viable while also saying the tweak/test/fail model generally isn't.
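For what it's worth, here's a toy-scale sketch of that generate-then-winnow idea. It is only an illustration: a few hundred candidates instead of millions, a made-up moving-sum rule as the "strategy", synthetic noise as the data, and an arbitrary Sharpe cutoff. None of the names or numbers come from what any fund actually runs.

```python
import random


def make_noise_returns(n, seed=0):
    """Synthetic zero-mean daily returns (assumption: no real edge exists in this data)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(n)]


def strategy_pnl(returns, lookback, direction):
    """Daily P&L of a toy rule: follow (direction=1) or fade (direction=-1)
    the sign of the trailing `lookback`-day return sum."""
    pnl = []
    trailing = sum(returns[:lookback])
    for t in range(lookback, len(returns)):
        position = direction if trailing > 0 else -direction
        pnl.append(position * returns[t])
        trailing += returns[t] - returns[t - lookback]  # roll the window forward
    return pnl


def sharpe(pnl):
    """Per-day Sharpe ratio (mean over sample standard deviation), not annualized."""
    if len(pnl) < 2:
        return 0.0
    mean = sum(pnl) / len(pnl)
    var = sum((x - mean) ** 2 for x in pnl) / (len(pnl) - 1)
    return mean / var ** 0.5 if var > 0 else 0.0


def winnow(returns, n_candidates=300, min_sharpe=0.05, seed=0):
    """Generate random candidates, filter in-sample, then re-filter out-of-sample."""
    rng = random.Random(seed)
    split = len(returns) // 2
    in_sample, out_sample = returns[:split], returns[split:]
    candidates = [(rng.randint(2, 120), rng.choice((1, -1))) for _ in range(n_candidates)]
    passed_in = [c for c in candidates if sharpe(strategy_pnl(in_sample, *c)) > min_sharpe]
    survivors = [c for c in passed_in if sharpe(strategy_pnl(out_sample, *c)) > min_sharpe]
    return candidates, passed_in, survivors


if __name__ == "__main__":
    cands, passed, kept = winnow(make_noise_returns(2000, seed=1))
    print(len(cands), "candidates,", len(passed), "pass in-sample,", len(kept), "survive out-of-sample")
```

Because the data here is pure noise, anything that survives both filters is spurious by construction. That's why the last step matters: either look for a thesis behind each survivor, or size the bets knowing some fraction of them are data-mined flukes.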