I have been dipping my toes into the trading water as a bit of a hobby and, like most new people, doing badly. I have so far been 100% correct on the ultimate direction of my trades, but totally off on the timing - I guess this is a fancy way of saying I have been 100% wrong.
Rather than continue down this path I have stopped, read and thought - probably something I should have done before starting.
Here is what I have come up with so far. I would really like some feedback to see if I am on the right track.
1. You need to find an edge to trade successfully long term. I suppose this is obvious, but the problem is how do you know that you have found a real edge and have not just data mined a set of rules that seem to work? After all, even if a movement series is truly random (e.g. radioactive decay), if you try enough rule combinations then you will find a set that appears to predict movement direction.
2. Any correlation between a rule and future prices that can be mined out at a statistically valid level is obvious and has been arbed away. If I go data mining through millions of rules looking for a correlation with future price movements, then the p-value needed to reject the null hypothesis would have to be so tiny that the rule would stick out like a sore thumb (with a million rules, a 5% significance threshold alone would let about 50,000 through by pure chance). This is the same problem pharmaceutical companies face when they go data mining through the results of a failed drug trial. It is very easy if you do this to find a "result" that shows the drug had an effect on some sub-population that appears to be statistically valid, even when the drug is a sugar pill! It is for this reason that the FDA insists that any correlation mined out of a drug trial is tested in a new independent trial before approval.
3. Rather than looking for correlations between market data and future prices, I need to find the causes of price movements. Once a cause is found, a rule can be constructed and tested to see if it is statistically valid.
4. All publicly available causal data is already being used by at least one trader and so any edge that can be constructed from the data is already taken.
5. Finding a new edge requires finding one or more unique data sets that either cause price movements or are at least moved by the same underlying cause. This might be best explained with an example. Imagine that I count the number of cars each day in the car park of a store and use this to predict the future earnings of the store. While earnings would not be perfectly correlated with the number of cars present, it is easy to see that if the car park is empty then there will be no customers and earnings will be poor. Conversely, if the car park is full of customers then sales will be good and earnings likewise, all things being equal. Assuming that car counting is statistically predictive of the store's future earnings, and earnings drive price movements, then I will have found a real edge.
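To make points 1 and 2 concrete, here is a rough sketch of the data mining trap. It is only an illustration under my own assumptions: the "market" is a coin flip, each "rule" is a made-up lookup table from the last four moves to a predicted next move, and the names (`mine_rules`, `random_rule`, etc.) are hypothetical, not from any trading library. The idea is to pick the best of many random rules on one sample and then see how it does on fresh data.

```python
import random

def random_walk_moves(n, rng):
    """n coin-flip moves: +1 (up) or -1 (down). Unpredictable by construction."""
    return [rng.choice((1, -1)) for _ in range(n)]

def random_rule(lookback, rng):
    """Hypothetical 'rule': maps each pattern of the last `lookback` moves
    (encoded as an integer) to a predicted next move."""
    return {pattern: rng.choice((1, -1)) for pattern in range(2 ** lookback)}

def pattern_at(moves, t, lookback):
    """Encode the `lookback` moves before index t as an integer key."""
    key = 0
    for m in moves[t - lookback:t]:
        key = key * 2 + (1 if m == 1 else 0)
    return key

def hit_rate(rule, moves, lookback):
    """Fraction of moves whose direction the rule predicted correctly."""
    hits = 0
    for t in range(lookback, len(moves)):
        if rule[pattern_at(moves, t, lookback)] == moves[t]:
            hits += 1
    return hits / (len(moves) - lookback)

def mine_rules(n_rules=2000, n_days=250, lookback=4, seed=1):
    """Pick the best of n_rules random rules in-sample, then re-test it
    on an independent random series. Returns (in_sample, out_of_sample)."""
    rng = random.Random(seed)
    in_sample = random_walk_moves(n_days, rng)
    out_sample = random_walk_moves(n_days, rng)
    rules = [random_rule(lookback, rng) for _ in range(n_rules)]
    best = max(rules, key=lambda r: hit_rate(r, in_sample, lookback))
    return hit_rate(best, in_sample, lookback), hit_rate(best, out_sample, lookback)

best_in, best_out = mine_rules()
print(f"best rule, in-sample hit rate:     {best_in:.3f}")
print(f"same rule, out-of-sample hit rate: {best_out:.3f}")
```

The in-sample winner should look better than a coin flip simply because it was selected from thousands of candidates, while on the fresh series there is no reason for it to stay above roughly 50%. That is the independent-trial idea from point 2 applied to trading rules.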
The major problem I can see with this approach is that either the predictive value of any unique data set will be so low that transaction costs will eat up any trading advantage, or the cost of gathering the data will be so high that it exceeds the possible extractable trading profits. The major advantage is that I at least have a chance of finding a true edge not being exploited by anyone else. What are people's thoughts?
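To put rough numbers on the cost worry, here is a back-of-envelope sketch. Everything in it is an assumption for illustration: the function names are my own, and the win, loss and cost figures are invented, not measured. It treats each trade as winning `avg_win` with probability `hit_rate` and losing `avg_loss` otherwise, and spreads a yearly data gathering cost across the trades made that year.

```python
def cost_per_trade(transaction_cost, data_cost_per_year, trades_per_year):
    """All-in cost per trade: commission/spread plus a share of the data bill."""
    return transaction_cost + data_cost_per_year / trades_per_year

def expected_profit_per_trade(hit_rate, avg_win, avg_loss, cost):
    """Expected profit of one trade after costs (same units as avg_win)."""
    return hit_rate * avg_win - (1 - hit_rate) * avg_loss - cost

def break_even_hit_rate(avg_win, avg_loss, cost):
    """Hit rate at which the expected profit per trade is exactly zero."""
    return (avg_loss + cost) / (avg_win + avg_loss)

# Invented example: win or lose 100 per trade, 5 in transaction costs,
# 1000 a year to gather the data, 200 trades a year.
cost = cost_per_trade(transaction_cost=5, data_cost_per_year=1000, trades_per_year=200)
needed = break_even_hit_rate(avg_win=100, avg_loss=100, cost=cost)
print(f"all-in cost per trade: {cost:.2f}")
print(f"break-even hit rate:   {needed:.3f}")
print(f"expected profit at 52% hit rate: "
      f"{expected_profit_per_trade(0.52, 100, 100, cost):.2f}")
```

With these made-up numbers, a data set that only lifts the hit rate from 50% to 52% would still lose money after costs, which is exactly the failure mode I am worried about.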
