Making a “rank-one update” to a matrix means adding to the matrix another matrix of rank one:
This actually occurs quite a bit in statistics and optimisation – I’m interested in the topic because rank-one updates occur in certain applications of Expectation Propagation, but a perhaps more important case is in sequential regression problems. Imagine doing linear regression, but being given the datapoints one-by-one. Because the regression coefficients are given by the famous least-squares equation:
you seem to need to invert the matrix each time you get a new datapoint. This is expensive because matrix inversion is a operation. Thanks to rank-one updates, we can bring that cost down to . At each step if is the current design matrix, our new datapoint is , then the updated version of is:
so that we have a rank-one update to for each new datapoint. There are many ways we can take advantage of that, but I’ll only look at two:
- Updating the inverse of using the matrix inversion lemma (AKA the Woodbury-Sherman-Morrison formula)
- Updating the Cholesky decomposition of
For the first one, say that , and we have already computed the inverse of . The matrix inversion lemma tells us that:
(what this formula does becomes clearer if you imagine that is an eigenvector of A). From a computational point of view, the important point is that we can update the inverse using just matrix products, a division and an addition, which brings the cost down to .
The second solution is often recommended by linear algebra specialists because it has better numerical stability. It is a bit slower in practice, and I haven’t seen it do much of a difference in my own problems, but YMMV. It might become important when the matrices are poorly conditioned. Inverting a positive-definite matrix can be done via the Cholesky decomposition:
where is an upper-triangular matrix. The trick is simply that:
and inverting a triangular matrix is easy, so that the expensive part is to get in the first place.
In R I find it useful to define a cholsolve function as such:
#Compute A\b, given U=chol(A)
cholsolve <- function(U,b)
It is possible to update a Cholesky decomposition following a rank-one update in time, see for example this tech report by Matthias Seeger. A year ago I wrote my own C function for doing that, which managed to be slower than the for most practical values of (a bit of a disappointment). Fortunately, a package called SamplerCompare has a function called chud which is very fast, so that there’s no excuse anymore for not using that trick.