There’s a neat result on matrix decompositions in a paper by Schirru et al. fresh from the arxiv (here). Take two square matrices, **A **and **B**. Assume that the matrices commute . Then they have the same eigenvectors, because if

then:

but also

This means that if **v **is an eigenvector of **A**, then **Bv** is an eigenvector of **A** too. This is only possible if **A** and **B** have the same eigenvectors, because **B **is not allowed to map **v **to something that is not an eigenvector of **A**.

What this means for computation is that once you have the eigendecomposition of **A**, you can get that of **B **for cheap.

If:

then

and we got **U** from the eigendecomposition of **A**, all we need now is:

and we now that the matrix is diagonal so that we don’t need to compute explictly every element.

Schirru et al. apply this result to the kernel matrix in Gaussian process regression, and use this to speed up inference over kernel hyperparameters. It doesn’t solve the initial problem of having to compute the eigendecomposition of a large matrix, but it is very useful nonetheless.