Linear recurrence using Cayley-Hamilton theorem

If you are not interested in linear recurrences, or are already aware of Cayley-Hamilton theorem, you can probably stop reading now.

Suppose we want to solve the following recurrence:
(T[0], T[1], T[2]) = (1, 2, 3), and
T[n] = T[n - 1] + 3 T[n - 2] + 8 T[n - 3], for n >= 3.

The usual way to solve such recurrence is via matrix exponentiation. We create a matrix M

```    1 3 8
M = 1 0 0
0 1 0

v = 3 2 1
```

In order to find T[n] we compute Mn - 2, and take the dot product of the first row of M with the vector v. This gives us an O (k^3 lg n) algorithm assuming that the size of the matrix was (k x k), and we are using the most elementary algorithm for matrix multiplication.

Cayley-Hamilton Theorem: Any square matrix is a solution of its characteristic polynomial. In our example, the characteristic polynomial is:

```| (x - 1)   -3   -8 |
|   -1       x    0 |
|    0      -1    x |

= x^3 - x^2 - 3x - 8
```

Note that, the characteristic polynomial looks very similar to the recurrence we want to solve. This is no coincidence, it can be shown that the characteristic polynomial of the matrix M corresponding to a linear recurrence looks the same as the original recurrence. So we do not need to calculate the symbolic determinant of the matrix, but we can use the recurrence to compute this polynomial.

Now, according to Cayley-Hamilton theorem:
M3 - M2 - 3M - 8I = 0, i.e.,
M3 = M2 + 3M + 8I
Here I is the identity matrix.

Let’s calculate some higher powers of M.
M4 = M (M2 + 3M + 8I) = M3 + 3M2 + 8M
= (M2 + 3M + 8I) + 3M2 + 8M = 4M2 + 11M + 8I

Similarly,
M5 = 15M2 + 20M + 32I

One can observe that for any n, the matrix Mn can be represented as a linear combination of M2, M and I. Also, if we know this representation for Mn, we can easily calculate the representation for Mn+1.

Mn = aM2 + bM + cI
==> Mn + 1 = M (aM2 + bM + cI) = (a + b) M2 + (3a + c) M + 8aI

For a (k x k) matrix, the characteristic polynomial will be of degree k, and the above step can be performed in O (k) time, i.e., we can compute the representation of Mn+1 from Mn in O (k) time.

Now, let us see if we can calculate the representation of M2n from the representation of Mn.

M2n = (aM2 + bM + cI ) (aM2 + bM + cI )
= a2 M4 + 2ab M3 + (b2 + 2ac) M2 + 2bc M + c2 I

We can replace M4 and M3 by their corresponding representation that we have already calculated. This means, in order to compute M2n from Mn, we need to multiply two degree k polynomials, and then replace M2k -2, M2k - 3, …, Mk by their representation. This can be done in O (k^2) time using elementary methods.

Now, we know how to compute the representation of:

1. Mn+1 from Mn, and
2. M2n from Mn

Both steps have O (k^2) complexity. Since these are the only steps needed for binary exponentiation algorithm, we can compute the representation of Mn in O (k^2 lg n) time.

Now, to compute the actual matrix Mn, we still need to add these k matrices which are present in its representation. However, if you notice, we do not need the whole matrix Mn, but only its first row, which again can be computed in O (k^2) time.

Computing the first row of M2, M3, …, Mk - 1 is not difficult either, and can be done is O (k^2) time.

This gives us an O (k^2 lg n) algorithm to solve the linear recurrence, which is faster by a factor k, compared to matrix exponentiation method. The difference probably will not make much difference if k is small. However, for large matrices the difference is significant, e.g., project euler 258

35 Likes

Nice explanation !!!

What an amazing explanation indeed

Thank you so, so much @djdolls

This has given me motivation to re-try project euler 258

Thanks!

Also, if I have any troubles, would you mind if I asked you for some pointers?

Best,

Bruno

Wow! I always wanted to know a way to calculate recurrence relations fast.

Thanks so much.

Thanks for this well-explained editorial!

Can you please explain how to get the first rows of M^2, \cdots, M^{k-1} in O(k^2)?

Thanks a lot.

Thanks for this very good explanation! I just used this a couple of days ago to solve the hard problem in Hackerearth’s June’17 Circuits contest.

1 Like

Thanks!

Feel free to ask if something is not clear, I will be happy to help.

I got it. We can just multiply the top row of M^i with M in O(k) time instead of O(k^2) as each column only has 2 entries.

How did you find this?

1 Like

Doesnt each column have k entries? Can you please explain the complexity part again, because I think it would be O(k^{2}) for finding M^{i} , so for finding top rows of each M^{i} the complexity would become O(k^{3})