Quote from jspauld:
Okay, you are talking about overfitting. I was very aware of this as I set up my system. It really wasn't much of an issue (i.e., my system performed almost as well on new data as it did on training data). It's pretty easy to check for this by setting aside a portion of your training data as validation data. Also, keep in mind that all of my indicators were for very short-term market movements, so each day gave me thousands of data points for each indicator.
The money from selling my first business got used up traveling and starting the second. With regard to giving up easily, I do not feel I did this. As mentioned, I spent four months trying everything I could to improve profitability. I even paid $20,000 to get a new data stream to see if it helped; it didn't. On the other hand, my passion is startups and not really finance, so maybe I did give up too easily.
I can't really respond to the 6 months thing other than to say that seems like a decent chunk of time to me, especially considering I never knew for sure it was going to work.
Your PL curve proves that you did overfit. If you think that a year of profits is "proof" that your system did what you intended it to do, then you probably haven't studied random walks enough. By the way, you said earlier that you used hundreds of variables to fit the short-term price action. Then you said you had thousands of data points, so you are pretty sure that's enough. However, did you know that the number of data points is irrelevant here? What you need to know is how many degrees of freedom of signal those 1000s of data points contain. And it sure is not 1000s of DOFs. More like 2 or 3 or 5. The only way to know is to use an information-based criterion, like AIC, or a partial F-test, to compare models of different complexities.
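To make the AIC point concrete, here is a minimal sketch, not anything taken from the article. Everything in it is made up for illustration: the synthetic data (one real predictor plus nine pure-noise ones), the helper names `ols_rss` and `aic`, and the sample size. It fits nested least-squares models and compares raw fit against AIC, using the Gaussian least-squares form AIC = n ln(RSS/n) + 2k (up to a constant).

```python
import math
import random

def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit of y on the columns of X."""
    n, p = len(X), len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))

def aic(rss, n, k):
    """AIC for a Gaussian least-squares model with k fitted coefficients."""
    return n * math.log(rss / n) + 2 * k

# Hypothetical data: only the first of ten predictors carries any signal.
rng = random.Random(0)
n, n_pred = 2000, 10
preds = [[rng.gauss(0, 1) for _ in range(n_pred)] for _ in range(n)]
y = [0.5 * row[0] + rng.gauss(0, 1) for row in preds]

results = {}
for k in (0, 1, 2, 5, 10):
    X = [[1.0] + row[:k] for row in preds]  # intercept + first k predictors
    rss = ols_rss(X, y)
    results[k] = (rss, aic(rss, n, k + 1))
    print(f"predictors={k:2d}  RSS={rss:9.2f}  AIC={results[k][1]:9.2f}")
```

The in-sample RSS can only go down as predictors are added, whether or not they carry signal, which is why a good fit on 1000s of points says little by itself. AIC charges 2 per extra coefficient, so the noise predictors have to earn their keep; with data like this it will often (not always) prefer the one-predictor model.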
Your comment about not knowing that it was going to work is precisely why exploratory research takes much longer than most anticipate in this business. Usually, people who do research will try 100s or 1000s of things and perhaps find a few that may have a real edge. On the other hand, if you overfit without knowing it, then you will find something very quickly, guaranteed. And you seem to think that machine learning will somehow magically solve things. I have used ML professionally in other contexts (not trading), such as SVMs, SOMs, ANNs, Bayesian nets, and logistic regression, and they are not magic bullets. They are just part of a class of statistical modeling algorithms with their own strengths and weaknesses.
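A quick way to see why trying 1000s of things "finds something" is to simulate strategies that have no edge at all and keep the best one. This is only a sketch under made-up assumptions: 1000 strategies, 252 trading days each, daily returns drawn from a zero-mean normal distribution, and a hypothetical helper `annualized_sharpe`.

```python
import random
import statistics

def annualized_sharpe(daily_returns, periods_per_year=252):
    """Mean over standard deviation of daily returns, scaled to a year."""
    mu = statistics.fmean(daily_returns)
    sd = statistics.stdev(daily_returns)
    return mu / sd * periods_per_year ** 0.5

rng = random.Random(1)
n_strategies, n_days = 1000, 252

# Every "strategy" is pure noise: zero expected return, 1% daily volatility.
sharpes = [
    annualized_sharpe([rng.gauss(0.0, 0.01) for _ in range(n_days)])
    for _ in range(n_strategies)
]

best_sharpe = max(sharpes)
mean_sharpe = statistics.fmean(sharpes)
print(f"mean Sharpe over all strategies: {mean_sharpe:.2f}")
print(f"best Sharpe after picking the winner: {best_sharpe:.2f}")
```

The average Sharpe sits near zero, as it should, but the best of the batch looks like a very respectable year of profits. That is the selection effect: the more variants you try, the better your luckiest one looks, edge or no edge.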
To reduce the fishiness of your article, I would like to see more statistics on the trades, such as overall Sharpe ratio, maximum drawdown, total number of shares traded, cents per share earned, average daily PL, maximum daily profit, maximum daily loss, number of winning days, number of losing days, average profit on winning days, average loss on losing days, a plot of cumulative PL vs. time, etc.
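For reference, most of those statistics fall straight out of a series of daily PL numbers. Here is a rough sketch; the function name `trade_summary`, the five-day PL series, and the share count are all invented for illustration, and the Sharpe figure is the simple mean-over-standard-deviation of daily PL scaled by sqrt(252), ignoring any risk-free rate.

```python
import statistics

def trade_summary(daily_pl, total_shares, periods_per_year=252):
    """Summary statistics from daily PL (in dollars) and total shares traded."""
    cumulative, running, peak, max_drawdown = [], 0.0, 0.0, 0.0
    for pl in daily_pl:
        running += pl
        cumulative.append(running)
        peak = max(peak, running)            # highest cumulative PL so far
        max_drawdown = max(max_drawdown, peak - running)
    wins = [pl for pl in daily_pl if pl > 0]
    losses = [pl for pl in daily_pl if pl < 0]
    mean_pl = statistics.fmean(daily_pl)
    return {
        "sharpe": mean_pl / statistics.stdev(daily_pl) * periods_per_year ** 0.5,
        "max_drawdown": max_drawdown,
        "cents_per_share": 100.0 * running / total_shares,
        "avg_daily_pl": mean_pl,
        "max_daily_profit": max(daily_pl),
        "max_daily_loss": min(daily_pl),
        "winning_days": len(wins),
        "losing_days": len(losses),
        "avg_win": statistics.fmean(wins) if wins else 0.0,
        "avg_loss": statistics.fmean(losses) if losses else 0.0,
        "cumulative_pl": cumulative,         # plot this against time
    }

# Hypothetical five-day example, not data from the article.
daily_pl = [100.0, -50.0, 200.0, -300.0, 150.0]
summary = trade_summary(daily_pl, total_shares=20_000)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Drawdown here is measured from the running peak of cumulative PL, starting from a flat account at zero. With real data you would feed in the full daily PL history and plot `cumulative_pl` against the dates.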
Anyway, like CT10Gov says, I suppose if you really did everything you claim in the article, then (a) you did put a lot of work into it, and (b) you did get extremely lucky, and there is nothing wrong with either of those things.