Why do I see "Trends" in Randomly Generated Data?

MAESTRO · Mar 26, 2008

Quote from Jerry030:

Criteria:

Random Number = Excel RAND() assigned to each bar and recalculated for each run

Random selection for trade: Rand() > .30 AND < .3205
which gives about 200 trades, of which about 50 will have a fill based on the Stop Limit order cited in the Rules

For Predictive: value of NN model predicts higher prices tomorrow, using same Stop Limit order

Time Window 11,000 bars, all available data

Oh, I see now. The confusion here is that you are mistaking the notion of "Random" for the notion of a "Normally Distributed" ("Gaussian") process. Yes, it can be shown that a symmetrical game (a game with equal "buy" and "sell" rules, for example) has different outcomes if played on random processes with different types of distribution. However, both of those distributions are observed in Random processes.

Jerry030 · Mar 26, 2008

Quote from MAESTRO:

I am not trying to mock you or anything. As a matter of fact, I am glad that you asked the question. It shows interest, and that is good enough for me. I welcome any interest. But sometimes it is difficult to seriously answer a question that has faulty logic in it. I would probably be much more helpful if I understood what it is that you would like to know. I could also be more helpful if I understood your hypothesis. If I sounded facetious, please forgive me. It wasn't intentional. So, please explain the meaning of your question.

OK, thanks.

Question: What level of occurrence of an event indicates non-random behavior?

For example, in a state Lottery, if on average 1,000,000 people participate each week and a single individual won the jackpot in 37 of 52 weeks during the year, there would be a question as to the random distribution of the winners. In fact there would be an investigation by the authorities, since the probability of such a "winning" streak in a random lottery, while not 0, is so small that its occurrence would set off alarm bells for the lottery management agency.

Or if a trader makes 1000 trades and 900 are profitable what is the probability that such an event is random chance?
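
Both examples can be put into numbers with a binomial tail. The sketch below is only an illustration under assumptions that are not in the post: each trade is treated as an independent coin flip with a 50% chance of profit, and the lottery player is assumed to hold one ticket a week with a 1-in-1,000,000 chance of winning. The function name binomial_tail is made up for this example.

```python
from fractions import Fraction
from math import comb

def binomial_tail(n, k, p):
    """Exact P(X >= k) for X ~ Binomial(n, p), with p given as a Fraction."""
    q = 1 - p
    return sum(comb(n, i) * p**i * q**(n - i) for i in range(k, n + 1))

# Assumption: every trade is an independent 50/50 bet.
trader = binomial_tail(1000, 900, Fraction(1, 2))

# Assumption: one ticket per week, 1-in-1,000,000 chance per draw.
lottery = binomial_tail(52, 37, Fraction(1, 10**6))

print(f"900+ winners in 1000 coin-flip trades: {float(trader):.3e}")
print(f"37+ jackpots in 52 weeks:              {float(lottery):.3e}")
```

A tiny tail probability only says the assumed chance model (fair coin, fair lottery) is a poor explanation of the record. It does not by itself say what kind of process produced it.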

MAESTRO · Mar 26, 2008

Quote from Jerry030:

OK, thanks.

Question: What level of occurrence of an event indicates non-random behavior?

For example, in a state Lottery, if on average 1,000,000 people participate each week and a single individual won the jackpot in 37 of 52 weeks during the year, there would be a question as to the random distribution of the winners. In fact there would be an investigation by the authorities, since the probability of such a "winning" streak in a random lottery, while not 0, is so small that its occurrence would set off alarm bells for the lottery management agency.

Or if a trader makes 1000 trades and 900 are profitable what is the probability that such an event is random chance?

I see better now where you are coming from. As I said in my previous post, you are probably trying to state that the markets are not Normally distributed. This hypothesis gives you a method of shifting the win/loss ratio in your favor. That's totally understandable. I can even show you a very simple example of how to construct a winning algorithm if you know the parameters of your "Stable Distribution". So, no arguments there. If you agree that the markets consist of multiple random processes with different types of distributions, and that those distributions are more or less stable, then we are in agreement.

Jerry030 · Mar 26, 2008

Quote from MAESTRO:

There is only one way by which one can prove "non-randomness" of the markets: by showing that the outcome of any given trade can be known ahead of time with 100% accuracy. Any other outcome will prove that the markets are random.

This statement makes no sense to me in practical terms in the real world.

What category of events occur 100% of the time? None that I can think of off hand.

If such criteria were applied to medicine then no drugs would ever be prescribed for any illness since the cure rate is in the 90% area at best and the kill rate due to side effects is > 0.
However, drug companies can still make a decision on a medicine based on the probabilities of cure vs. kill, when neither is ever 100% or 0%. If they were random then they could save a ton of money on R&D by just offering sugar pills for all illnesses.

So why does a rate of < 100% in the market = Random, while an occurrence rate of < 100% can still indicate an effective medicine in health care?

Why hold the market to a special standard?

On the other hand do you believe that all phenomena in the universe are random?
If so, then there could be no assumption of cause and effect in any event.

Heck, there is some probability that a meteor will fall on my head before I finish typing this. If the real world required 100% certainty for everything, nothing would happen. We would all cower in the basement forever.

Another Example:

What is the minimum heart rate over 10 hours for a human being to remain alive? I don't have the number but there are adepts at Yoga and such things that can reduce their heart rate and respiration to a level that would be fatal to the rest of us. Are they random or do they have a special skill?

dtrader98 · Mar 26, 2008

Quote from Jerry030:

Criteria:

Random Number = Excel RAND() assigned to each bar and recalculated for each run

Random selection for trade: Rand() > .30 AND < .3205
which gives about 200 trades, of which about 50 will have a fill based on the Stop Limit order cited in the Rules

For Predictive: value of NN model predicts higher prices tomorrow, using same Stop Limit order

Time Window 11,000 bars, all available data

I think I follow the algorithm a bit. Without trying to ascertain the specifics, you are looking at a predefined decision rule and comparing it to a random set. The two rules are slightly different, though, so why not find one rule and apply it identically to both cases? That is, rather than choose between .3 and .32 for the entry criteria, why not use the same close +/- high/low relative closing criteria on the real data (NN) and on the random data set: apples to apples.
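
For reference, the random-entry criterion quoted above (a fresh RAND() per bar, trade when it falls between .30 and .3205) can be sketched as follows. This is an illustration only: Python's random.random() stands in for Excel's RAND(), and the function name random_entries is made up here. The expected count is 11,000 x 0.0205, or roughly 225 candidate bars per run, before the Stop Limit fill filter is applied.

```python
import random

def random_entries(n_bars, low=0.30, high=0.3205, seed=None):
    """Return the indices of bars whose uniform draw falls strictly between low and high."""
    rng = random.Random(seed)  # seed is optional; fix it to make a run repeatable
    return [i for i in range(n_bars) if low < rng.random() < high]

entries = random_entries(11_000, seed=1)
print(len(entries), "candidate entry bars out of 11,000")
```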

Firstly, as you are using 11,000 samples, it seems fair to use rand() function as the uniform dist permutation characteristics are a close approximation of gaussian (binomial approaches gaussian over long runs).
However, again to compare apples to apples (as each gain of the random run may not be comparable to each gain of the real data run, although maybe ratios are ok), it seems better to model with GBM as I mentioned earlier, and RUN A VERY LARGE monte carlo of the PL runs (not just 6 as in your example). Look at the range of PL over the simulations. By virtue of randomness and the model boundaries (unless you have large jumps in the market data you used), by its very definition, the real results you ran should be contained within the data set.

If it is not, the GBM data is not sufficient to model the random data behavior of the market you are testing. As MAESTRO mentioned, you could get into more detailed models such as stable distributions or jump diffusions to get more accuracy. If you find that your real market results are on the higher end of the PL distribution from the monte carlo analysis, then that shows your system is better than the average random results, but by its very nature, its performance must be a subset of the monte carlo results or the random data is not being modeled properly.

If you tested this across all markets and all data, although it would be inconclusive (since it is not a closed system), you could argue that your system performed better than average over random data.
If you found that the market data was closely modeled by gaussian dist, you could then apply some sort of confidence interval to test your hypothesis (i.e. if your results were greater than 95% of monte carlo, you could say your results were statistically better than a system applied to random data).
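
A minimal sketch of that procedure, using only the Python standard library: generate many GBM price paths, apply one fixed rule to each, and see where an observed PL falls in the simulated distribution. Everything specific here is an assumption for illustration: the placeholder rule (hold long for one bar after an up bar), the drift and volatility values, the run count, and the observed_pl number are all hypothetical and are not Jerry030's system or data.

```python
import bisect
import math
import random

def gbm_path(n_bars, mu, sigma, start=100.0, rng=random):
    """One geometric Brownian motion path with per-bar drift mu and volatility sigma."""
    prices = [start]
    for _ in range(n_bars):
        step = (mu - 0.5 * sigma ** 2) + sigma * rng.gauss(0.0, 1.0)
        prices.append(prices[-1] * math.exp(step))
    return prices

def rule_pl(prices):
    """Hypothetical placeholder rule: after an up bar, hold long for the next bar."""
    pl = 0.0
    for i in range(1, len(prices) - 1):
        if prices[i] > prices[i - 1]:
            pl += prices[i + 1] - prices[i]
    return pl

def monte_carlo_pl(n_runs, n_bars, mu, sigma, seed=0):
    """Sorted PL results of the same rule applied to n_runs simulated paths."""
    rng = random.Random(seed)
    return sorted(rule_pl(gbm_path(n_bars, mu, sigma, rng=rng)) for _ in range(n_runs))

def percentile_rank(sorted_pls, observed_pl):
    """Fraction of simulated PL values that are <= the observed PL."""
    return bisect.bisect_right(sorted_pls, observed_pl) / len(sorted_pls)

pls = monte_carlo_pl(n_runs=500, n_bars=250, mu=0.0, sigma=0.01)
observed_pl = 5.0  # hypothetical PL of the same rule on real data
print(f"simulated PL range: {pls[0]:.2f} to {pls[-1]:.2f}")
print(f"observed PL beats {percentile_rank(pls, observed_pl):.1%} of random runs")
```

If the observed PL lands outside the simulated range entirely, that points to the random model being too narrow for the market being tested rather than to proof of an edge, which is the point made above about needing richer models.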

However, as (I think) MAESTRO argues, it does not guarantee equal likelihood of future performance. Your next run of 11,000 data pts. may perform at the lower end of the monte carlo curve.

That being said, I would argue that your approach (if it is in the upper percentile of PL results), is much better than impulsively visually trading off TA signals.

MAESTRO · Mar 26, 2008

Quote from Jerry030:

This statement makes no sense to me in practical terms in the real world.

What category of events occur 100% of the time? None that I can think of off hand.

If such criteria were applied to medicine then no drugs would ever be prescribed for any illness since the cure rate is in the 90% area at best and the kill rate due to side effects is > 0.
However, drug companies can still make a decision on a medicine based on the probabilities of cure vs. kill, when neither is ever 100% or 0%. If they were random then they could save a ton of money on R&D by just offering sugar pills for all illnesses.

So why does a rate of < 100% in the market = Random, while an occurrence rate of < 100% can still indicate an effective medicine in health care?

Why hold the market to a special standard?

On the other hand do you believe that all phenomena in the universe are random?
If so, then there could be no assumption of cause and effect in any event.

Heck, there is some probability that a meteor will fall on my head before I finish typing this. If the real world required 100% certainty for everything, nothing would happen. We would all cower in the basement forever.

Another Example:

What is the minimum heart rate over 10 hours for a human being to remain alive? I don't have the number but there are adepts at Yoga and such things that can reduce their heart rate and respiration to a level that would be fatal to the rest of us. Are they random or do they have a special skill?

I think I have already answered this question. Just to reiterate: what you are saying is that there are 10 balls in a hat, 2 red and 8 black. In this case the probability of pulling out a black ball is higher than that of pulling out a red ball. However, the process of pulling balls out of this hat is still random.
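
The hat example is easy to simulate. In the sketch below the draws are made with replacement (an assumption, since the post does not say) and the function name draw_balls is made up for the illustration. The long-run share of black balls settles near 80%, yet no individual draw can be known ahead of time.

```python
import random

def draw_balls(n_draws, n_red=2, n_black=8, seed=None):
    """Draw n_draws balls, with replacement, from a hat of red and black balls."""
    rng = random.Random(seed)
    hat = ["red"] * n_red + ["black"] * n_black
    return [rng.choice(hat) for _ in range(n_draws)]

draws = draw_balls(10_000, seed=42)
black_share = draws.count("black") / len(draws)
print(f"share of black draws: {black_share:.3f}")
print("first ten draws:", draws[:10])
```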

Jerry030 · Mar 26, 2008

Quote from MAESTRO:

I see better now where you are coming from. As I said in my previous post, you are probably trying to state that the markets are not Normally distributed. This hypothesis gives you a method of shifting the win/loss ratio in your favor. That's totally understandable. I can even show you a very simple example of how to construct a winning algorithm if you know the parameters of your "Stable Distribution". So, no arguments there. If you agree that the markets consist of multiple random processes with different types of distributions, and that those distributions are more or less stable, then we are in agreement.

I make no assumption about the nature of the markets per se, only that it is possible to correctly predict something about their future behavior, based on historical behavior, with a level of accuracy that gives significant profit.

Perhaps we are saying the same thing, I don't know.

neke · Mar 26, 2008

Quote from Jerry030:

I make no assumption about the nature of the markets per se, only that it is possible to correctly predict something about their future behavior, based on historical behavior, with a level of accuracy that gives significant profit.

Perhaps we are saying the same thing, I don't know.

It would be sad if you happen to be saying the same thing after all this long debate. Have words lost their meaning?

Kevin Schmit · Mar 26, 2008

Quote from MAESTRO:

I can even show you a very simple example of how to construct a winning algorithm if you know the parameters of your "Stable Distribution"

Ok, I have a Levy Stable Distribution of scale c, shift u, exponent a, and skew b. I can take an infinite number of random draws from that distribution to create simulated trading runs. Please show me a simple winning algorithm for trading those runs.

Jerry030 · Mar 26, 2008

Quote from neke:

It would be sad if you happen to be saying the same thing after all this long debate. Have words lost their meaning?

The debate has become tiring and unproductive, so you look at the cost-benefit ratio of words expended to your profit and make a decision.

My position is that the past can predict the future under certain conditions... if you want to say that is because you have an edge on the slant of the Gaussian distribution multiplied by the alpha of the fizbit factor, fine with me.