Quote from TheStudent:
It is clear that, assuming p > 0, your score is going to increase as n increases, and that you can theoretically reach any score you want by increasing n.
Your score will only keep increasing with n if your average and your standard deviation hold steady. That said, you are correct: if the average and standard deviation remain unchanged as you add more and more data points, those data points rightfully further solidify your expectation that this IS a profitable system. In other words, yes, the confidence level will rise with increasing supporting data.
Often, though, we don't have thousands of valid data points available to analyze, and we must make our best determination of a system's profitability from a smaller data set. In this situation, this analysis method can be helpful, IMO.
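To make the effect of n concrete, here is a rough sketch. The score formula is not restated in this post, so the code ASSUMES the score is the average P&L divided by its standard error (a t-style statistic) and converts it to a confidence level with a normal approximation. The function names and the example numbers (average of 10, standard deviation of 100) are hypothetical.

```python
import math
from statistics import NormalDist


def score(mean_pnl, std_pnl, n):
    """Assumed score: average P&L divided by its standard error."""
    return mean_pnl / (std_pnl / math.sqrt(n))


def confidence(mean_pnl, std_pnl, n):
    """Normal-approximation confidence that the true average P&L is above zero."""
    return NormalDist().cdf(score(mean_pnl, std_pnl, n))


# Hypothetical system: average trade of 10 with a standard deviation of 100.
# The average and standard deviation are held fixed; only n changes.
for n in (25, 50, 100, 400):
    s = score(10.0, 100.0, n)
    c = confidence(10.0, 100.0, n)
    print(f"n={n:4d}  score={s:.2f}  confidence={c:.3f}")
```

Under this assumption the score grows with the square root of n, so quadrupling the number of trades doubles the score as long as the average and standard deviation do not move.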
Quote from TheStudent:
I guess to bring this a step further, would you say that for any given n, the corresponding score will tell you the confidence of being profitable for a period of length n?
Thus: if n = 100 and the score corresponds to 98%, then we can say that the system has a 98% confidence of being profitable every 100 trades. If the same system has a score of 80% at n = 50, then we can say that at 50 trades, the confidence of being profitable drops to 80%.
That is close, but I think I should clarify. Think of a system as having 10,000 P&L data points available, but you can only look at 50 of them (and perhaps the other 9,950 data points are only known in the future). By looking at ONLY the 50 data points, our goal is to estimate the likelihood that the AVERAGE of all 10,000 data points is above zero (i.e., a profitable system). So, if the results of our calculations say the system is profitable at the 98% confidence level, then that is saying that it is 98% likely that the entire population of 10,000+ trades will have an average P&L above zero, based upon our analysis of the 50 trades we saw. There is still a 2% chance that we just got a lucky 'sample' of data, and that the entire population of trades will have a negative average P&L. (Note, as always, that this analysis is subject to some error because the P&L distribution may not be Gaussian, etc.)
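The 50-out-of-10,000 picture can be sketched in code. This is only an illustration under stated assumptions: the confidence is computed as a normal approximation applied to the sample average divided by its standard error (with only 50 trades, a t-distribution would be slightly more conservative), and the simulated population, its parameters, and the function name are all hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def confidence_from_sample(pnl):
    """Estimate the confidence that the population's average P&L is above zero,
    using only the trades in `pnl` (normal approximation, assumed formula)."""
    n = len(pnl)
    standard_error = stdev(pnl) / math.sqrt(n)
    return NormalDist().cdf(mean(pnl) / standard_error)


# Hypothetical population of 10,000 trade results; in practice most of
# these would only become known in the future.
random.seed(1)
population = [random.gauss(5.0, 100.0) for _ in range(10_000)]

# We only get to look at 50 of them.
sample = population[:50]

print("confidence from the 50-trade sample:", confidence_from_sample(sample))
print("actual average of all 10,000 trades:", mean(population))
```

Changing the seed or the slice shows how much the estimate can move from one 50-trade sample to the next, which is the 'lucky sample' risk described above.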
Quote from TheStudent:
Thus, the question becomes this: say we are tracking the last 100 trades of a system, working out the corresponding score based on n = 100, and then dropping the system if the score falls below 80% or whatever benchmark we set.
How do we know 100 is the appropriate length to use? How do we know 80% is the appropriate cut-off to use?
Should we just be arbitrary based on our "judgement" or is there a rigorous way of approaching this problem?
I think once you have used this technique for a while, you will start to get a sense of what numbers are needed to give you sufficient confidence in your system in order to trade it. I mentioned the values that I use, but those will not be appropriate for everyone. Recall from Acrary's thread that he didn't want to trade a system unless he was assured that 99%+ of all trading months were profitable. For me, I don't need that level of confidence in order to trade a system, but each trader must answer that question for himself/herself.
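The cut-off itself stays a personal choice, but once one is picked, the number of trades needed to reach it can be roughed out. The sketch below ASSUMES the score is the average P&L divided by its standard error and uses a normal approximation; `trades_needed` and the example numbers are hypothetical, and the estimate is only as good as the assumed average and standard deviation.

```python
import math
from statistics import NormalDist


def trades_needed(mean_pnl, std_pnl, target_confidence):
    """Smallest n at which the assumed score reaches the target confidence,
    if the average and standard deviation stay at the given values."""
    if mean_pnl <= 0:
        raise ValueError("a non-positive average never reaches the target")
    if not 0.5 < target_confidence < 1.0:
        raise ValueError("target_confidence must be between 0.5 and 1.0")
    z = NormalDist().inv_cdf(target_confidence)  # score required for the target
    return math.ceil((z * std_pnl / mean_pnl) ** 2)


# Hypothetical system: average trade of 10, standard deviation of 100.
for target in (0.80, 0.95, 0.98):
    print(f"target={target:.0%}  trades needed={trades_needed(10.0, 100.0, target)}")
```

A tighter cut-off or a noisier system pushes the required n up quickly, since n scales with the square of the required score.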
-Eric