It sounds like you have no experience at all with automated trading systems. I am one of NinjaTrader's partners, specializing in automated trading systems, which puts me in the vendor category, but I can still give you a few pointers.
First, some systems are not tradable - by that, I mean one of the following:
- Their realtime behavior is different from their backtest behavior - very often the case for strategies on NinjaTrader which require CalculateOnBarClose=false, but it can also happen with "specialty" bar-types whose H/L in backtest doesn't match the realtime H/L.
- They hold positions for extended periods of time (days or weeks), but offer no support to stop & restart the strategy without losing the management of the open-trade.
By running that system for a few weeks on Sim101, and backtesting at the end of each week for the week just ended, you'll get an idea of whether any of this applies.
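The weekly Sim101-vs-backtest comparison can be done by eye, but if you export both trade lists, a rough sketch like the one below shows the idea. It is plain Python, not NinjaScript, and the `Trade` record and its field names are made up for illustration; it also assumes that a trade can be identified by its entry time and direction.

```python
from collections import namedtuple

# Hypothetical trade record; field names are illustrative only.
Trade = namedtuple("Trade", "entry_time direction net_ticks")

def compare_trades(sim_trades, backtest_trades):
    """Compare one week of Sim101 trades against the backtest of the same week.

    Returns (only_in_sim, only_in_backtest, net_diffs), where net_diffs maps
    each trade present in both lists to (sim net ticks - backtest net ticks)
    whenever the two results differ.
    """
    sim = {(t.entry_time, t.direction): t for t in sim_trades}
    bt = {(t.entry_time, t.direction): t for t in backtest_trades}
    only_in_sim = sorted(set(sim) - set(bt))
    only_in_backtest = sorted(set(bt) - set(sim))
    net_diffs = {
        key: sim[key].net_ticks - bt[key].net_ticks
        for key in sim.keys() & bt.keys()
        if sim[key].net_ticks != bt[key].net_ticks
    }
    return only_in_sim, only_in_backtest, net_diffs

sim_week = [
    Trade("2024-01-02 09:35", "long", 8),
    Trade("2024-01-02 11:10", "short", -6),
]
backtest_week = [
    Trade("2024-01-02 09:35", "long", 10),
    Trade("2024-01-02 13:00", "long", 4),
]
only_sim, only_bt, diffs = compare_trades(sim_week, backtest_week)
```

Trades that show up on only one side, or matched trades whose results differ by more than the slippage you configured, are the sign that realtime and backtest behavior diverge.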
As for the backtest performance, you should ensure that the backtest is done using at least 1 tick of slippage for MKT & STP orders, and that the fill-type used is at least Default (never Liberal). I would recommend using a fill-type which actually guarantees the requested slippage (that's not the case for Default). Feel free to use the attached "BetterThanDefaultV3FillType": manually unzip it, copy it to NinjaTrader/bin/Custom/Type, then recompile any indicator or strategy and it will become available.
Once you have a backtest with a guaranteed 1-tick slippage, look at the average net per trade, in ticks. The minimum viable figure is a matter of taste: I trade some systems with a 5-tick avg. net/trade, but it is much better to see 10+ ticks, and anything at or under 3 ticks is probably a waste of money.
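If the backtest report only gives you the net profit in currency, the conversion to ticks is simple arithmetic. Here is a small Python sketch; the function names, the $12.50 tick value and the sample figures are made up, and the three labels are just my paraphrase of the thresholds above.

```python
def avg_net_ticks_per_trade(net_profit, trade_count, tick_value):
    """Average net result per trade, in ticks.

    net_profit  -- total net profit of the backtest, in account currency
    trade_count -- number of trades in the backtest
    tick_value  -- currency value of one tick for one contract
    """
    if trade_count <= 0:
        raise ValueError("trade_count must be positive")
    return net_profit / (trade_count * tick_value)

def rate_avg_ticks(avg_ticks):
    """Rough label based on the rule of thumb in the post (not a hard rule)."""
    if avg_ticks <= 3:
        return "probably a waste of money"
    if avg_ticks < 10:
        return "viable, a matter of taste"
    return "comfortable"

# Hypothetical backtest: $12,500 net over 200 trades, $12.50 per tick.
avg = avg_net_ticks_per_trade(net_profit=12500, trade_count=200, tick_value=12.50)
label = rate_avg_ticks(avg)
```

This assumes one contract per trade; with variable position sizes you would divide by the number of contracts traded instead of the number of trades.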
Another aspect to look at is the entry-stop size in relation to the entry-target size, as well as the % of entry-stops hit vs. the % of entry-targets hit. No hard & fast rules there (for me), but keep in mind that the % of entry-stops hit might increase over time.
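To see why a rising stop-hit percentage matters, here is a rough Python sketch of the arithmetic. It is a simplification: it ignores commissions and slippage, the break-even formula assumes every trade exits at either the stop or the target, and the 20-tick stop and 30-tick target are made-up numbers.

```python
def expectancy_ticks(stop_ticks, target_ticks, pct_stop_hit, pct_target_hit):
    """Expected result per trade, in ticks, from stop and target exits only.

    pct_stop_hit and pct_target_hit are fractions between 0 and 1.
    Trades closed any other way are not counted here.
    """
    return pct_target_hit * target_ticks - pct_stop_hit * stop_ticks

def breakeven_target_rate(stop_ticks, target_ticks):
    """Fraction of trades that must reach the target to break even,
    assuming every trade ends at either the stop or the target."""
    return stop_ticks / (stop_ticks + target_ticks)

# Hypothetical system: 20-tick stop, 30-tick target.
as_backtested = expectancy_ticks(20, 30, pct_stop_hit=0.50, pct_target_hit=0.50)
after_drift = expectancy_ticks(20, 30, pct_stop_hit=0.60, pct_target_hit=0.40)
needed = breakeven_target_rate(20, 30)
```

In this example, moving from 50% to 60% of stops hit takes the system from about 5 ticks per trade to roughly zero, so a system that sits close to its break-even rate leaves little room for that kind of drift.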
A key aspect, which is very difficult to judge, is the likelihood that the system will continue to perform decently in the future. I use a pretty elaborate method for my own systems, but without access to the system design, you are pretty much left with judging based on the number of trades in backtest (for a single setup, which you never know for sure is the case when you purchase a blackbox). Anything under 1000 trades in backtest will experience significant variations in the future, 1000-2000 would be the place to start, and 3000+ gives you better chances of long-term stability.
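One way to get a feel for why the trade count matters is the standard error of the average net per trade. The Python sketch below is textbook statistics, not the method I use for my own systems: it assumes independent trades with a stable distribution, which real systems do not strictly satisfy, and the 40-tick per-trade standard deviation is a made-up figure.

```python
import math

def std_error_of_mean(per_trade_std_ticks, trade_count):
    """Standard error of the average net per trade, in ticks,
    assuming independent, identically distributed trades."""
    if trade_count <= 0:
        raise ValueError("trade_count must be positive")
    return per_trade_std_ticks / math.sqrt(trade_count)

def approx_95pct_band(avg_ticks, per_trade_std_ticks, trade_count):
    """Approximate 95% range for the true average net per trade."""
    half_width = 1.96 * std_error_of_mean(per_trade_std_ticks, trade_count)
    return avg_ticks - half_width, avg_ticks + half_width

# Hypothetical system: 5-tick average, 40-tick standard deviation per trade.
for n in (500, 1000, 3000):
    low, high = approx_95pct_band(5.0, 40.0, n)
    print(f"{n} trades: average likely between {low:.1f} and {high:.1f} ticks")
```

The band narrows with the square root of the trade count, so tripling the number of trades shrinks the uncertainty by a bit less than half.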