Hi everyone,
I am trying to replicate an ETF using as few components as possible. There are really two parts to my question: 1) how to explain the most variance, and 2) how to size the positions.*
The idea is that, if I am trying to replicate QQQ, buying the top 5 holdings may not be the best bet, since AAPL is very similar to MSFT. There might also be a stock way down the list with a vol of 100% that explains variance in QQQ and is not correlated with AAPL, MSFT, etc.
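To make the idea concrete, here is a minimal Python sketch of one way to frame both parts: greedy forward selection on in-sample R^2 for the picking, and the OLS coefficients as a first guess at sizing. Everything in it is an assumption for illustration. The data is synthetic (not the attached data.frame), `greedy_select` is a hypothetical helper, columns 0 and 1 are built to be near-duplicates in the way AAPL and MSFT might be, and column 2 is a high-vol name uncorrelated with them.

```python
import numpy as np

def greedy_select(X, y, k):
    """Pick k columns of X by forward selection on in-sample R^2 against y.

    Returns the chosen column indices, their OLS weights, and the final R^2.
    """
    chosen = []
    tss = float(np.sum((y - y.mean()) ** 2))
    best_r2 = -np.inf
    for _ in range(k):
        best_j, best_r2 = None, -np.inf
        for j in range(X.shape[1]):
            if j in chosen:
                continue
            # Regress y on an intercept plus the already chosen columns and candidate j.
            A = np.column_stack([np.ones(len(y)), X[:, chosen + [j]]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ beta
            r2 = 1.0 - float(resid @ resid) / tss
            if r2 > best_r2:
                best_j, best_r2 = j, r2
        chosen.append(best_j)
    # Refit on the final subset; the slopes are the candidate position sizes.
    A = np.column_stack([np.ones(len(y)), X[:, chosen]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return chosen, beta[1:], best_r2

# Synthetic daily returns (assumption: stand-in for the real ETF components).
rng = np.random.default_rng(0)
n_days, n_stocks = 500, 20
X = rng.normal(0.0, 0.01, size=(n_days, n_stocks))
common = rng.normal(0.0, 0.01, size=n_days)
X[:, 0] = common + rng.normal(0.0, 0.001, size=n_days)  # "AAPL-like"
X[:, 1] = common + rng.normal(0.0, 0.001, size=n_days)  # "MSFT-like", near-duplicate of column 0
X[:, 2] = rng.normal(0.0, 0.03, size=n_days)            # high-vol, uncorrelated name

# Synthetic "ETF": weighted sum of three names plus a little noise.
w_true = np.zeros(n_stocks)
w_true[[0, 1, 2]] = [0.4, 0.4, 0.2]
y = X @ w_true + rng.normal(0.0, 0.001, size=n_days)

chosen, weights, r2 = greedy_select(X, y, k=2)
print("chosen columns:", chosen)
print("OLS weights:", weights)
print("in-sample R^2:", round(r2, 4))
```

With only two slots, the selection should take one of the two near-duplicates plus the uncorrelated high-vol name rather than both duplicates, which is the behaviour I am after. Note this is in-sample and on returns, so it is only a starting point for the vol-lens version of the problem.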
I was thinking of doing a PCA regression: grab all the components of QQQ and see which loadings I should use. The issue is that there are too many loadings, so the problem is still not solved.
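A small Python sketch of what I mean by "too many loadings", again on synthetic data and under my own assumptions (a one-factor market model with made-up betas and a hypothetical 0.01 cutoff for a "negligible" loading). The first principal component explains a lot of variance, but it puts weight on every single name, so it does not by itself tell me which few stocks to hold.

```python
import numpy as np

# Synthetic returns from a one-factor model (assumption, not the attached data).
rng = np.random.default_rng(1)
n_days, n_stocks = 500, 30
market = rng.normal(0.0, 0.01, size=(n_days, 1))
betas = rng.uniform(0.5, 1.5, size=(1, n_stocks))
returns = market * betas + rng.normal(0.0, 0.01, size=(n_days, n_stocks))

# PCA via eigendecomposition of the sample covariance matrix.
cov = np.cov(returns, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]           # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pc1 = eigvecs[:, 0]                         # loadings of the first component
explained = eigvals[0] / eigvals.sum()      # share of total variance explained by PC1
n_material = int(np.sum(np.abs(pc1) > 0.01))  # loadings that are not negligible

print("PC1 explained variance share:", round(float(explained), 3))
print("stocks with a non-negligible PC1 loading:", n_material, "of", n_stocks)
```

In this setup every stock ends up with a non-negligible loading on the first component, which is the density problem I keep running into.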
Does anyone have ideas or links for modern-day dispersion trading?
*Note: I am looking at this through a vol lens, not delta-one (D1).
For a case study, I have attached a data.frame for QQQ and the components over the last 2 years. I also naively zeroed out large outliers in the dataset.
Thank you for your time
Code:
# A tibble: 6 x 101
        QQQ     AAPL     MSFT     AMZN     TSLA     GOOG       FB    GOOGL     NVDA     PYPL    CMCSA     INTC
      <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
1  -0.00434 -6.51e-3 -1.31e-2 -0.00560   0.0431 -0.00468 -0.00259 -0.00580  1.51e-2  -0.0112 -1.77e-2 -0.00414
2    0.0159  1.24e-2  2.13e-2   0.0324   0.0448   0.0196   0.0153   0.0198 -9.82e-4   0.0206  1.50e-2   0.0237
3  -0.00612 -1.54e-2 -5.82e-3 -0.00607  0.00122  0.00337 -0.00813  0.00329 -1.73e-2 -0.00991  2.31e-4 -0.00418
4   -0.0195 -2.70e-2 -2.05e-2  -0.0151  -0.0324  -0.0129  -0.0212  -0.0122 -3.75e-2  -0.0171 -1.25e-2  -0.0144
5  -0.00252  1.97e-4 -7.95e-5 -0.00168 -0.00899 -0.00667 -0.00121 -0.00685  4.68e-3  0.00110 -4.91e-3  -0.0246
6  -0.00538 -1.07e-2 -7.96e-5 -0.00933  -0.0117 -0.00334 -0.00470 -0.00240 -2.14e-2  0.00605  8.70e-3  -0.0532