Quote from rvince99:
I KNOW that a matrix of probabilities of cross scenarios loses less information than the simple, single metric of correlation -- particularly in the tails. I have written about this at length, performed ample studies on it, and have experienced the benefits and consequences of both firsthand.
Suppose I have 2 components I am looking to allocate among. Say that in one period one component loses 1 unit and the other gains 2 units; in the subsequent period the reverse occurs: the former now gains 2 units and the latter loses 1 unit. They keep flipping like this, with a net gain of 1 unit per period, for 98 periods. If this were the only data, our correlation coefficient would be -1.0. Then there is one period where they both lose 10 simultaneously, and the 100th period, where they both gain 10. My correlation coefficient in this case, over the 100 periods, is -.04753. That single parameter would be used to describe the relationship of these two streams -- yet there is a lot of information going on in there, including some really BAD stuff on that solitary period of -10,-10.
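For anyone who wants to check the two figures quoted above, here is a minimal sketch in plain Python. The helper name `pearson` and the ordering of the periods are my own choices for illustration, not anything from the post; the ordering does not affect the coefficient.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# 98 flipping periods: A loses 1 while B gains 2, then the reverse.
a_flips = [-1, 2] * 49
b_flips = [2, -1] * 49
r_flips_only = pearson(a_flips, b_flips)

# Add the two joint outliers: both lose 10, then both gain 10.
a_full = a_flips + [-10, 10]
b_full = b_flips + [-10, 10]
r_full = pearson(a_full, b_full)

print(round(r_flips_only, 5))  # -1.0
print(round(r_full, 5))        # -0.04753
```

Two periods out of 100 are enough to drag the coefficient from -1.0 to nearly zero, and the resulting number says nothing about which way those two periods went.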
Contrast using this single metric with the notion of using a matrix of joint probabilities:
p A B
.01 -10 -10
0 -10 -1
0 -10 2
0 -10 10
0 -1 -10
0 -1 -1
.49 -1 2
0 -1 10
0 2 -10
.49 2 -1
0 2 2
0 2 10
0 10 -10
0 10 -1
0 10 2
.01 10 10
Which has more information? Which is more valuable on the disaster days?
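As a rough sketch of how such a matrix could be gathered from the two return streams, one can simply count how often each (A, B) pair occurs and divide by the number of periods. The variable names here (`joint`, `p_joint_loss`) are hypothetical, and the data is the 100-period series described in the post (98 flipping periods plus the two outliers).

```python
from collections import Counter
from itertools import product

# The 100-period example: 98 flipping periods plus the two joint outliers.
a = [-1, 2] * 49 + [-10, 10]
b = [2, -1] * 49 + [-10, 10]

n = len(a)
counts = Counter(zip(a, b))          # how often each (A, B) pair occurred
outcomes = sorted(set(a) | set(b))   # [-10, -1, 2, 10]

# Joint probability for every combination of outcomes, including the empty cells.
joint = {pair: counts.get(pair, 0) / n for pair in product(outcomes, repeat=2)}

print("p\tA\tB")
for (x, y), p in joint.items():
    print(f"{p:.2f}\t{x}\t{y}")

# Probability that both components lose in the same period.
p_joint_loss = sum(p for (x, y), p in joint.items() if x < 0 and y < 0)
print("P(both lose) =", round(p_joint_loss, 2))
```

Counted this way, the probabilities sum to 1, and the disaster scenario stays visible as its own row instead of being averaged away.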
This is only 100 days. In real life the outliers tend to occur with a probability far less than .01, so your correlation coefficient, r, would typically be far more negative than shown here. (Incidentally, this matrix is the only thing one needs to gather to employ a leverage-space type model.)
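The point about rarer outliers can be checked with a small experiment: keep the same two outlier periods and lengthen the flipping run, so that each outlier's probability drops from .01 to .001 and below. This is only a sketch under that assumption; `pearson` and `r_for` are illustrative helpers, not part of any leverage-space model.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def r_for(flip_pairs):
    """Correlation when `flip_pairs` flip-flop pairs are followed by the two outliers."""
    a = [-1, 2] * flip_pairs + [-10, 10]
    b = [2, -1] * flip_pairs + [-10, 10]
    return pearson(a, b)

for flip_pairs in (49, 499, 4999):
    n = 2 * flip_pairs + 2   # total periods; each outlier has probability 1/n
    print(f"periods={n:6d}  p(outlier)={1 / n:.4f}  r={r_for(flip_pairs):.4f}")
```

As the outliers get rarer, r moves toward -1.0, so the single number looks more and more reassuring while the -10,-10 period is still in the data.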
What about using the Spearman rank correlation? It looks like it provides a bit less information, but it is worth considering...
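To see what Spearman would report on the 100-period example from the quoted post, here is a minimal sketch that ranks each stream (ties get their average rank) and then takes the Pearson correlation of the ranks. The helpers `average_ranks` and `spearman` are written out by hand for illustration; a statistics library would normally do this for you.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1   # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

a = [-1, 2] * 49 + [-10, 10]
b = [2, -1] * 49 + [-10, 10]

rho = spearman(a, b)
print(round(rho, 3))  # -0.846
```

On this particular data the rank version comes out strongly negative, because ranking shrinks the size of the +/-10 moves. It is still a single number, so the -10,-10 period is no easier to spot than with r.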