You’re overthinking this. All such theories and conclusions about how to identify, filter and select strategies will change over time as you work on them. You’ll change your assumptions hundreds of times along the way. But even wrong assumptions will let you discover new things and edges, so don’t worry about going in the wrong direction: it may become your main edge later (the things that others dismiss outright).
Selecting strategies that work better than random is a given, but as an explicit filter it is mostly unnecessary, because you will find almost no random strategies with a high chance of winning, like 75%+. And when you do find such a seemingly random strategy, focus on finding out why it works, because it could point to a bug or a bias that you must identify before basing your system on it.
So then assuming that random strategies won’t make you rich, you can select any strategies that win more often than 75% of the time as having decent potential.
But later you’ll discover that strategies that win only 50% of the time may work just as well if their winning trades gain 5% on average while their losing trades lose 2%.
That’s when you’ll also realize that your stop loss could be at a much different level than your profit target.
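The win rate versus payoff trade-off above comes down to expectancy per trade. Here is a minimal sketch, reading the figures above as a 5% average gain on winners and a 2% average loss on losers. The function names are mine, and the 75% example uses made-up payoffs purely for contrast.

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Average result per trade, in the same units as avg_win and avg_loss.

    avg_loss is passed as a positive number (2.0 means "loses 2%").
    """
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


def breakeven_win_rate(avg_win, avg_loss):
    """Win rate at which expectancy is exactly zero."""
    return avg_loss / (avg_win + avg_loss)


# 50% winners, +5% average win, -2% average loss
print(expectancy(0.50, 5.0, 2.0))      # 1.5 (% per trade)

# Hypothetical contrast: 75% winners, but small wins and large losses
print(expectancy(0.75, 1.0, 4.0))      # -0.25 (% per trade)

# With a 5:2 payoff the strategy only needs to win about 29% of the time
print(round(breakeven_win_rate(5.0, 2.0), 4))
```

This ignores fees, slippage and position sizing, so treat it as a first filter only.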
Later you’ll also find tons of other criteria that will be unique to you, as otherwise you cannot be different from everyone else that is trying to sell you their strategies.
Though using the last year of data for walk-forward testing can be important.
You’ll also discover that drawdowns are unavoidable, so you’ll have to deal with risk and can’t just use stop losses for everything, otherwise you may kill good strategies.
Finally, depending on what you plan to trade, your biggest issue may be getting clean data. There are plenty of trades reported minutes or hours late by dark pools that cause false spikes in the data that your system will learn and base strategies on.
I’ve been working at this for 3 years and most of the time when I thought I discovered something great, it was because of bad data. So at the same time I’ve spent 3 years cleaning the data, looking for new data sources, and even then still cleaning their data. There is no data provider on this planet that I haven’t contacted dozens of times reporting bad data to them, until I finally decided to purchase tick data and implement my own custom filtering when summarizing to minutes.
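To give an idea of what filtering while summarizing ticks to minutes can look like, here is a rough sketch. It is not my actual filter. It assumes each tick carries both a trade time and a report time (not every feed gives you both), and the tuple layout, the `max_delay_s` and `max_dev` thresholds and the function name are all made up for illustration.

```python
from collections import defaultdict
from statistics import median


def minute_bars(ticks, max_delay_s=5.0, max_dev=0.02):
    """Summarize ticks into 1-minute OHLCV bars, skipping suspicious prints.

    ticks: iterable of (trade_time_s, report_time_s, price, size),
           with times in seconds since some epoch.
    max_delay_s: drop trades reported later than this after execution.
    max_dev: drop trades priced further than this fraction from the
             median price of their minute.
    """
    buckets = defaultdict(list)
    for trade_t, report_t, price, size in ticks:
        if report_t - trade_t > max_delay_s:
            continue  # late print, e.g. an off-exchange trade reported afterwards
        buckets[int(trade_t // 60)].append((trade_t, price, size))

    bars = {}
    for minute, rows in sorted(buckets.items()):
        rows.sort(key=lambda r: r[0])  # order by trade time within the minute
        mid = median(p for _, p, _ in rows)
        kept = [(p, s) for _, p, s in rows if abs(p - mid) <= max_dev * mid]
        if not kept:
            continue  # nothing trustworthy left in this minute
        prices = [p for p, _ in kept]
        bars[minute * 60] = {
            "open": prices[0],
            "high": max(prices),
            "low": min(prices),
            "close": prices[-1],
            "volume": sum(s for _, s in kept),
        }
    return bars


ticks = [
    (0.0, 0.1, 100.0, 10),
    (10.0, 10.2, 100.5, 5),
    (20.0, 3620.0, 120.0, 100),  # reported an hour late: dropped
    (30.0, 30.1, 130.0, 1),      # far from the minute's median: dropped
    (61.0, 61.1, 101.0, 7),
]
bars = minute_bars(ticks)
print(bars[0])
print(bars[60])
```

A per-minute median is a crude reference price and will misbehave in genuinely fast markets, so the thresholds need tuning per instrument.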
Though now I’m adding back some fake data spikes to make my system resilient to random events. With clean(er) data, creating strategies is much easier. A bigger problem now is selecting among millions of strategies that all have downsides as well as benefits.
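For the fake spikes mentioned above, the idea can be sketched roughly like this. Again this is an illustration and not my production code: the function name, the `prob` and `max_jump` parameters and the uniform jump distribution are all assumptions.

```python
import random


def inject_spikes(closes, prob=0.01, max_jump=0.05, seed=None):
    """Return a copy of closes with random one-bar spikes added.

    prob: chance that any given bar gets a spike.
    max_jump: largest relative move of a spike (0.05 means up to +/-5%).
    seed: fix it to make a test run reproducible.
    """
    rng = random.Random(seed)
    out = list(closes)
    for i in range(len(out)):
        if rng.random() < prob:
            jump = rng.uniform(-max_jump, max_jump)
            out[i] = out[i] * (1.0 + jump)
    return out


clean = [100.0, 100.2, 100.1, 100.4, 100.3]
noisy = inject_spikes(clean, prob=0.5, max_jump=0.05, seed=42)
print(noisy)
```

Only the training copy is perturbed here; the clean series stays untouched so that results can be compared with and without the noise.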