Do you mean the logic as several "rules"?
Not necessarily.
I backtested my system while it was in development, without doing any optimization. I searched for situations where my system got into trouble and tried to find a general improvement in the basics of the system, rather than testing variable parameters and optimizing the system for that specific set of data.
A system should be built on logic. If something does not work, it means the logic is not good, and you don't fix that by optimizing parameters. You improve your logic.
To me, a good system should perform well in any market without changing any parameter.
Within certain limits parameters in a good system can be changed without influencing the performance. That shows that the system is well balanced, and does not get affected by "noise".
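To make the "change the parameters within limits" idea concrete, here is a minimal sketch of a sensitivity check. It is not my system: the moving-average rule, the synthetic prices, and the names `make_prices`, `sma_backtest`, `parameter_sweep` and `relative_spread` are all hypothetical, made up for illustration only.

```python
import random


def make_prices(n=2000, seed=1):
    """Synthetic price series (assumption: random walk with a small drift)."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1 + rng.gauss(0.0003, 0.01)))
    return prices


def sma_backtest(prices, window):
    """Final equity of a toy rule: long when the last close is above its SMA."""
    equity = 1.0
    running = sum(prices[:window])  # sum of prices[i-window:i] for i = window
    for i in range(window, len(prices)):
        sma = running / window
        # Decide with data up to bar i-1, then earn the move from i-1 to i.
        if prices[i - 1] > sma:
            equity *= prices[i] / prices[i - 1]
        running += prices[i] - prices[i - window]
    return equity


def parameter_sweep(prices, base_window, rel_range=0.2, steps=5):
    """Run the same rule for windows within +/- rel_range of base_window."""
    windows = sorted({
        max(2, round(base_window * (1 - rel_range + 2 * rel_range * k / (steps - 1))))
        for k in range(steps)
    })
    return {w: sma_backtest(prices, w) for w in windows}


def relative_spread(results):
    """(best - worst) / average result; small values suggest a stable rule."""
    values = list(results.values())
    avg = sum(values) / len(values)
    return (max(values) - min(values)) / abs(avg)


if __name__ == "__main__":
    sweep = parameter_sweep(make_prices(), base_window=50)
    for window, equity in sweep.items():
        print(window, round(equity, 3))
    print("relative spread:", round(relative_spread(sweep), 3))
```

If neighbouring windows give very different results, the chosen value is probably fitted to noise rather than to any logic. What counts as "very different" is a judgment call and is not something this sketch decides.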
The basic logic of my system has not changed in the last 25 years and it still works. There is some kind of logic in market behavior; markets are not random. The question is how to find what that logic is.
I tried to find a logical solution, not a mathematical one. And when the logic is OK, the math in general is OK too, as math is about logic.
Not necessarily true, data mining is an industry.
I agree completely. You start with an idea that makes sense and then look at the market's past to see whether it would have made money. But 99% of the people who do backtesting just look at the data and then try to find a system that would work on that data. That is already overfitting...
That is only 10 percent of the problem. The biggest problem is that people think the data they are using represents the market, as if a daily chart, or any time frame with an open, high, low, close and bid/ask volume, represents the market. That is mistake number 1 in the whole backtesting industry. The second mistake is using that data as a basis for all sorts of calculations. All that data leaves out the biggest reason why prices move. And if they do find a system that works with the data they have, it works not because of the data or the analysis but because of the position sizing they are using.
biggest problems arise when this is mixed with hope and cognitive bias
two method:
I understand how you think, but actually behavioral and social science rely on math. That’s why you see so many people with this background in data science. Math is not necessarily ‘linear’.
Datamining misses the most important factor: creativity, thinking outside the box.
In fact datamining does not think at all. It can make huge amounts of calculations, but that's all it can do. The information that humans give it defines the rate of success of datamining. The computer will never have new ideas of its own or think outside the box.
If you ask a computer why it gave a certain result, it will "tell" you that all it can do is math. So it is a mathematical result. But markets are not mathematical. That's why price can be overbought or oversold.
Behavioral finance is very important in trading. And that's something you don't use in datamining. Datamining is an exact science; markets are not.

Let's say I have a set basket of 50 ETFs that I like to backtest and trade with. I run a backtest, get good results and decide to trade a strategy, but I notice that about 10 of the 50 ETFs perform poorly with the chosen parameters.
1. Would it be considered overfitting to exclude those 10 ETFs from live trading?
2. Doubling down here: if I did exclude the 10 poor performers, how bad would it be to re-optimize on the remaining 40 ETFs?
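One way to get a feel for question 1 is a small simulation. This is only a sketch under a loud assumption: all 50 "ETFs" here are pure noise with no edge at all, so any in-sample difference between them is luck. The function `simulate` and its parameters are hypothetical, not taken from any backtesting package.

```python
import random
from statistics import mean


def simulate(n_assets=50, n_keep=40, n_days=250, seed=0):
    """Drop the worst in-sample performers, then compare with a fresh period."""
    rng = random.Random(seed)

    def period_return():
        # Sum of daily returns with zero mean (assumption: no real edge).
        return sum(rng.gauss(0.0, 0.01) for _ in range(n_days))

    in_sample = [period_return() for _ in range(n_assets)]
    out_sample = [period_return() for _ in range(n_assets)]

    # Rank on the in-sample period only and keep the best n_keep assets.
    ranked = sorted(range(n_assets), key=lambda i: in_sample[i], reverse=True)
    kept = ranked[:n_keep]

    return {
        "in_all": mean(in_sample),
        "in_kept": mean(in_sample[i] for i in kept),
        "out_all": mean(out_sample),
        "out_kept": mean(out_sample[i] for i in kept),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(name, round(value, 4))
```

By construction the in-sample average of the kept 40 can never be worse than the average of all 50, so that improvement on its own proves nothing. The number to look at is `out_kept` against `out_all` on data that was not used for the selection. With real ETFs there may be a genuine reason why some do not suit the strategy, which this noise-only sketch cannot show.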