The standard Lanczos method for computing matrix functions has a brutal memory requirement: storing an $n \times k$ basis matrix that grows with every iteration. For a 500,000-variable problem needing 1000 iterations, that’s roughly 4 GB just for the basis.
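As a quick sanity check on that figure, here is the arithmetic: $n \times k$ double-precision entries at 8 bytes each.

```python
# Back-of-the-envelope check of the basis-storage cost:
# an n x k matrix of double-precision (8-byte) floats.
n, k = 500_000, 1_000
basis_bytes = n * k * 8
print(f"{basis_bytes / 1e9:.1f} GB")  # 4.0 GB
```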
In this post, we will explore one of the most straightforward solutions to this problem: a two-pass variant of the Lanczos algorithm that only requires $O(n)$ memory at the cost of doubling the number of matrix-vector products. The surprising part is that when implemented carefully, the two-pass version isn’t just memory-efficient: it can be faster for certain problems. We will dig into why.
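To fix intuition before diving in, here is a minimal NumPy sketch of the two-pass idea (my own illustration, not the implementation from the repository): pass one runs the three-term Lanczos recurrence keeping only three vectors and recording the scalar coefficients; pass two replays the recurrence from those stored coefficients to regenerate the basis vectors one at a time and accumulate the result. It omits breakdown checks and reorthogonalization.

```python
import numpy as np

def lanczos_two_pass(A_mul, b, k, f):
    """Approximate f(A) @ b with O(n) memory via two Lanczos passes.

    A_mul: callable computing A @ v for a Hermitian matrix A.
    f: scalar function applied to the Ritz values (eigenvalues of T).
    Simplified sketch: no breakdown handling, no reorthogonalization.
    """
    n = b.shape[0]
    beta0 = np.linalg.norm(b)
    alphas, betas = [], []

    # Pass 1: build the k x k tridiagonal T, keeping only three vectors.
    v_prev, v, beta = np.zeros(n), b / beta0, 0.0
    for _ in range(k):
        w = A_mul(v) - beta * v_prev
        alpha = np.dot(v, w)
        w -= alpha * v
        beta = np.linalg.norm(w)
        alphas.append(alpha)
        betas.append(beta)
        v_prev, v = v, w / beta
    T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)

    # Small dense problem: y = beta0 * f(T) e1 via eigendecomposition of T.
    theta, S = np.linalg.eigh(T)
    y = beta0 * (S @ (f(theta) * S[0, :]))

    # Pass 2: replay the recurrence from the stored coefficients and
    # accumulate x = V @ y one basis vector at a time.
    x = np.zeros(n)
    v_prev, v = np.zeros(n), b / beta0
    for j in range(k):
        x += y[j] * v
        if j == k - 1:
            break  # last basis vector used; skip the final matvec
        beta_prev = betas[j - 1] if j > 0 else 0.0
        w = A_mul(v) - beta_prev * v_prev - alphas[j] * v
        v_prev, v = v, w / betas[j]
    return x
```

Note the cost accounting matches the claim above: $k$ matrix-vector products in pass one and $k - 1$ in pass two, while only a handful of length-$n$ vectors are ever held in memory.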
All code is available on GitHub: two-pass-lanczos
The full technical report with proofs and additional experiments: report.pdf
Computing Matrix Functions
Let’s consider the problem of computing the action of matrix functions on a vector:
$$\mathbf{x} = f(\mathbf{A})\mathbf{b}$$
where $\mathbf{A}$ is a large sparse Hermitian matrix and $f$ is a matrix function defined on the spectrum of $\mathbf{A}$. This is a problem that appears pretty often in scientific computing: solving linear systems corresponds to $f(z) = z^{-1}$, exponential integrators for PDEs use $f(z) = \exp(tz)$, and many other problems require functions like $f(z) = z^{-1/2}$ or $f(z) = \operatorname{sign}(z)$.
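To make the definition concrete: for a Hermitian $\mathbf{A} = \mathbf{Q} \operatorname{diag}(\lambda) \mathbf{Q}^*$, we have $f(\mathbf{A})\mathbf{b} = \mathbf{Q} f(\lambda) \mathbf{Q}^* \mathbf{b}$ with $f$ applied entrywise to the eigenvalues. A small NumPy illustration (toy matrix of my own choosing):

```python
import numpy as np

# For a small Hermitian A, f(A) @ b follows from the eigendecomposition
# A = Q diag(lam) Q^T, giving f(A) @ b = Q @ (f(lam) * (Q^T @ b)).
rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2 + 5 * np.eye(5)   # real symmetric, shifted away from 0
b = rng.standard_normal(5)

lam, Q = np.linalg.eigh(A)

def f_of_A_times_b(f):
    # Apply f to the eigenvalues, not to the entries of A.
    return Q @ (f(lam) * (Q.T @ b))

# f(z) = 1/z reproduces the linear-system solution A^{-1} b:
print(np.allclose(f_of_A_times_b(lambda z: 1 / z), np.linalg.solve(A, b)))
```

This direct recipe only works when we can afford a full eigendecomposition, which is exactly what the Lanczos approach avoids for large $n$.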
Indeed, there are a lot of problems with computing $f(\mathbf{A})$ directly. First of all, even if $\mathbf{A}$ is sparse, $f(\mathbf{A})$ is generally dense, so storing it explicitly is out of the question for large problems. Even if we could store it, computing it directly would require algorithms like the Schur–Parlett method, which scale as $O(n^3)$ and are impractical for large $n$.
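The fill-in is easy to see on a classic example: the 1-D Laplacian is tridiagonal, but its inverse (i.e. $f(z) = z^{-1}$ applied to it) has no zero entries at all.

```python
import numpy as np

# A: sparse (tridiagonal) SPD matrix -- the 1-D discrete Laplacian.
n = 8
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
Ainv = np.linalg.inv(A)  # f(A) for f(z) = 1/z

print(np.count_nonzero(A), "nonzeros in A")                    # 22  (= 3n - 2)
print(np.count_nonzero(np.abs(Ainv) > 1e-12), "in inv(A)")     # 64  (= n^2, fully dense)
```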