You can build a big similarity matrix that has the correlation between each of n stocks in the index with every other. Then, it's a matter of solving the following problem:
- where ρ(i, j) is the correlation (similarity) between two stocks
- where x(i, j) is binary: 1 if stock i is...