Help with Correlation Analysis ->

Hey fellow ET'ers, need some help:

1.) I want to run a correlation matrix/analysis on 7 markets going back at least 10 years. Any insight to a website or software that can pull that info together and run the analysis (based on daily or even weekly bars).

2.) I would also like to run a correlation analysis on equity curves on seven markets. Meaning, I have individual equity curves and would like to run the correlation they have all together. I assume I need to build something in Excel, but looking for advise.

Thanks!
 
You might be able to get the data from yahoo or google finance, depending on the markets.

I've never tried correlation analysis, other than seeing how the general stock index looked when I made long or short trades on individual stocks. I found huge correlation just by plotting my trades on the index chart and noting the success rate when I was aligned with the prevailing trend at the time.

You may try building a correlation matrix in excel based on a simple trend indicator like a moving average.

With a little intellectual exertion, this could be built in excel in a matter of a few days.

If you're not into that sort of thing, an engineering intern could do it. They'll work for about $17 an hour.

:D
 
import pandas as pd
from pandas.io.data import DataReader

symbols = ['MSFT', 'GOOG', 'AAPL']
data = dict((sym, DataReader(sym, "yahoo"))for sym in symbols)
panel = pd.Panel(data).swapaxes('items', 'minor')
close_px = panel['Close']
rets = close_px / close_px.shift(1) - 1
rets.corr()


that will be $10,000
 
Quote from rosy2:

import pandas as pd
from pandas.io.data import DataReader

symbols = ['MSFT', 'GOOG', 'AAPL']
data = dict((sym, DataReader(sym, "yahoo"))for sym in symbols)
panel = pd.Panel(data).swapaxes('items', 'minor')
close_px = panel['Close']
rets = close_px / close_px.shift(1) - 1
rets.corr()


that will be $10,000

Thanks for noting pandas, I had never heard of it. I've been looking for something to do with python to learn it.
 
Quote from Wide Tailz:

With a little intellectual exertion, this could be built in excel in a matter of a few days ... [at] about $17 an hour.
More like ten minutes.

Put your 7 10-year price series in columns A through G.
That should be about 2550 rows. Then highlight cells I1
through O7. A 7x7 matrix of cells should be highlighted.
Enter the following forumla in the top left cell in the
highlighted range:

MMULT(TRANSPOSE(LN(A2:G2550/A1:G2549)),LN(A2:G2550/A1:G2549))/2548

This is an array formula, so press [ctrl][shift][enter]
instead of just [enter] when you are done.

This is your variance-covariance matrix Actually is is
your scatter matrix, as the var-covar requires demeaned
data. However I wouldn't worry about it as the mean
of daily log returns is rarely significantly different
from zero.

Now highlight another 7x7 range in cells I9 through O15.
In the top left cell enter the following formuala:

SQRT(I1:O7*IF(ROW(OFFSET(INDIRECT("A1"),0,0,7,7))=COLUMN(OFFSET(INDIRECT("A1"),0,0,7,7)),1,0)

Again, this is an array formua so use [ctrl][shift][enter]
The numbers in the diagonal of this 7x7 matrix are the
standard deviations of each column of log returns.

Now highlight another 7x7 range in cells I17 through O23.
In the top left cell enter the following formuala:

MMULT(MINVERSE(I9:O17),MMULT(I1:O7,MINVERSE(I9:O17)))

Use [ctrl][shift][enter]
The off-diagonal elements are the pairwise correlations.

You might be more interested in the partial or conditional
correlations. However Excel, to my surprise does not have
a matrix square root function. And since they have neither
a Cholesky nor an Eigen depompostion built in, it is not
easy to roll your own. You can, if you like, derive the
partial correlations, the multiple r-squareds, and the
beta coefficients on a cell-by-cell basis from the inverse
correlation matrix. To do that highlight another 7x7 range
in cells I25 through O31. In the top left cell enter the
following formuala:

MINVERSE(I17:O23)

Use [ctrl][shift][enter]
To get the partial correlation of off-diagonal element
e(i,j) use p = -e(i,j)/sqrt(e(i,i)*e(j,j))
For r-squared on diagonal elements e(i,i) use
R = sqrt(1-1/e(i,i)
For Beta coefficients on multiple regression
of diagonal element e(i,i) vs any other e(j,j) nse
use b = -1/e(i,j)


There you go. And I charged less than $17 an hour
for it.
 
Quote from Wide Tailz:

What a guy. It pays to know how to ask.

:D

I made at least one typo in that post and it is too late to edit.

the formuaa:

MMULT(MINVERSE(I9:O17),MMULT(I1:O7,MINVERSE(I9:O17)))

should be

MMULT(MINVERSE(I9:O15),MMULT(I1:O7,MINVERSE(I9:O15)))
 
Quote from Kevin Schmit:

I made at least one typo in that post and it is too late to edit.

the formuaa:

MMULT(MINVERSE(I9:O17),MMULT(I1:O7,MINVERSE(I9:O17)))

should be

MMULT(MINVERSE(I9:O15),MMULT(I1:O7,MINVERSE(I9:O15)))

You're too good!
 
Quote from Kevin Schmit:

More like ten minutes.

Put your 7 10-year price series in columns A through G.
That should be about 2550 rows. Then highlight cells I1
through O7. A 7x7 matrix of cells should be highlighted.
Enter the following forumla in the top left cell in the
highlighted range:

MMULT(TRANSPOSE(LN(A2:G2550/A1:G2549)),LN(A2:G2550/A1:G2549))/2548

This is an array formula, so press [ctrl][shift][enter]
instead of just [enter] when you are done.

This is your variance-covariance matrix Actually is is
your scatter matrix, as the var-covar requires demeaned
data. However I wouldn't worry about it as the mean
of daily log returns is rarely significantly different
from zero.

Now highlight another 7x7 range in cells I9 through O15.
In the top left cell enter the following formuala:

SQRT(I1:O7*IF(ROW(OFFSET(INDIRECT("A1"),0,0,7,7))=COLUMN(OFFSET(INDIRECT("A1"),0,0,7,7)),1,0)

Again, this is an array formua so use [ctrl][shift][enter]
The numbers in the diagonal of this 7x7 matrix are the
standard deviations of each column of log returns.

Now highlight another 7x7 range in cells I17 through O23.
In the top left cell enter the following formuala:

MMULT(MINVERSE(I9:O17),MMULT(I1:O7,MINVERSE(I9:O17)))

Use [ctrl][shift][enter]
The off-diagonal elements are the pairwise correlations.

You might be more interested in the partial or conditional
correlations. However Excel, to my surprise does not have
a matrix square root function. And since they have neither
a Cholesky nor an Eigen depompostion built in, it is not
easy to roll your own. You can, if you like, derive the
partial correlations, the multiple r-squareds, and the
beta coefficients on a cell-by-cell basis from the inverse
correlation matrix. To do that highlight another 7x7 range
in cells I25 through O31. In the top left cell enter the
following formuala:

MINVERSE(I17:O23)

Use [ctrl][shift][enter]
To get the partial correlation of off-diagonal element
e(i,j) use p = -e(i,j)/sqrt(e(i,i)*e(j,j))
For r-squared on diagonal elements e(i,i) use
R = sqrt(1-1/e(i,i)
For Beta coefficients on multiple regression
of diagonal element e(i,i) vs any other e(j,j) nse
use b = -1/e(i,j)


There you go. And I charged less than $17 an hour
for it.

looks complicated compared to my approach which took about 3 minutes.
 
Back
Top