You are not logged in. Please login at to post your questions!


STARROAD - Editorial



Author: Misha Chorniy
Tester: Kevin Atienza
Translators: Sergey Kulik (Russian), Team VNOI (Vietnamese) and Hu Zecong (Mandarin)
Editorialist: Kevin Atienza




Dynamic programming, matrix exponentiation, Kirchhoff's matrix-tree theorem, Chinese remainder theorem, Garner's algorithm, big integers


There are $N$ stars, and $M$ bidirectional space roads in space that connect pairs of stars together. Travelling through one space road takes 1 day.

Chef has spaceships. Each spaceship is designated two stars, and a fixed route between these stars. A route is a path that is traversable within at most $K$ days. Roads may be traversed any number times. Also, there is a spaceship for every possible route between stars.

The Empire deployed a weapon capable of destroying exactly $X$ spaceships that travel between the same pair of stars. The Empire used the weapon as many times as possible, so after some time, there aren't enough groups of spaceships any more to destroy.

Chef decided to retire most of the surviving spaceships. Chef wants to retire as many spaceships as possible, but Chef wants to do this such that there is still a way to get from any star to another using spaceships. How many ways are there for Chef to retire spaceships so that this is satisfied?


The answer is the number of spanning trees of a related (multi)graph, obtained by the following construction:

  • Construct a graph with the same set of $N$ nodes.
  • For each pair of nodes $(i,j)$, add $P(i,j) \bmod X$ edges, where $P(i,j)$ is the number of paths of length at most $K$ between $i$ and $j$.

To compute $P(i,j)$: Let $G$ be the adjacency matrix of the input graph. Compute the matrix $M = G^K + G^{K-1} + \ldots + G^2 + G^1 + G^0$. Then $P(i,j) = M_{i,j}$. Note that $M$ can be computed in $O(\log K)$ time using a version of "fast exponentiation".

Finally, to compute the number of spanning trees of a graph, you can use Kirchhoff's matrix-tree theorem. It says that the number of spanning trees can be obtained as the determinant of a specific matrix related to the graph.

Unfortunately, the number of spanning trees can be very large, and may not fit a 64-bit variable. One idea is to use big ints and fractions to compute the determinant, but this is slow. Instead, a better solution is to compute the determinant modulo ~150 primes around $10^9$, and reconstructing the full answer using the Chinese remainder theorem. (Garner's algorithm can be used for this.)


Number of spaceships

The first step for our solution is to compute the number of spaceships between each pair of stars $(i,j)$. This is equal to the number of paths between $i$ and $j$ with length at most $K$. Let's denote this number by $P(i,j)$. We also denote by $P_k(i,j)$ the number of paths between $i$ and $j$ with length exactly $k$. Thus, we have the relationship: $$P(i,j) = P_0(i,j) + P_1(i,j) + P_2(i,j) + \ldots + P_K(i,j)$$ Now, what is $P_k(i,j)$? We can use dynamic programming to figure this one. Consider a path of length $k$ from $i$ to $j$. Let $t$ be the second-to-last node visited. Then the path consists of two parts:

  • The path of length $k-1$ from $i$ to $t$.
  • The path of length $1$ from $t$ to $j$. In other words, $t$ and $j$ must be adjacent.

Thus, the number of paths of length $k$ from $i$ to $j$ whose second-to-last node is $t$ is equal to $P_{k-1}(i,t)$! Summing across all possible values of $t$, we get: $$P_k(i,j) = \sum_{\text{$t$ adjacent to $j$}} P_{k-1}(i,t)$$

For the base case, we can have $k = 0$. In this case, the path is of length $0$, so $P_0(i,j) = 1$ if $i = j$, and $P_0(i,j) = 0$ if $i \not= j$.

Thus, we now have a dynamic programming way to compute $P_k(i,j)$ for every $k$! Unfortunately, $K$ is very large, so this solution isn't feasible.

We can restate the recurrence above in a different way. Let $G$ be the $N\times N$ adjacency matrix of the graph, i.e., $G(i,j) = 1$ if $i$ and $j$ are adjacent, and $G(i,j) = 0$ otherwise. Then the recurrence above is equivalent to: $$P_k(i,j) = \sum_{t=1}^N P_{k-1}(i,t)G(t,j)$$ However, if we view each $P_k$ as an $N\times N$ matrix itself, then this is just the equation of matrix multiplication! In other words, the above is equivalent to $$P_k = P_{k-1}\cdot G$$ where $\cdot$ denotes matrix multiplication. Also, based on the above, $P_0$ must be the $N\times N$ identity matrix (denoted $I_N$). Thus, we have:

  • $P_0 = I_N$
  • $P_1 = P_0 \cdot G = G$
  • $P_2 = P_1 \cdot G = G\cdot G = G^2$
  • $P_3 = P_2 \cdot G = G^2\cdot G = G^3$
  • $P_4 = P_3 \cdot G = G^3\cdot G = G^4$
  • $P_5 = P_4 \cdot G = G^4\cdot G = G^5$
  • etc.

More generally, we have $P_k = G^k$, where $G^k$ denotes the matrix $G$ raised to the power $k$! This is certainly an elegant formulation of the above, but more importantly, it gives us a faster way of computing $P(i,j)$. Notice that if we also view $P$ as an $N\times N$ matrix, then the relationship above is equivalent to $$\begin{align*} P &= P_0 + P_1 + P_2 + \ldots + P_K \\\ &= G^0 + G^1 + G^2 + \ldots + G^K \end{align*}$$ This sum looks very much like a geometric series, so it seems possible that a fast formula for it exists. Unfortunately, the standard formula $1 + x + x^2 + \ldots + x^k = \frac{x^{k+1} - 1}{x - 1}$ doesn't work, because we will have to define how to divide matrices, and even if it is possible, $G - I_N$ might not be invertible. This sucks because with such a formula, we could possibly compute the answer using only $O(\log k)$ multiplications with fast exponentiation. Thankfully, there is another way to compute this series using $O(\log k)$ multiplications without doing any division.

Let's define $F(k) = G^0 + G^1 + G^2 + \ldots + G^{k-1}$. Thus, $P = F(K+1)$, so we can focus on computing $F(k)$. Recursively, this can be computed as: $$\begin{align*} F(k) &= G^0 + G^1 + G^2 + \ldots + G^{k-1} \\\ &= G^0 + G(G^0 + G^1 + \ldots + G^{k-2}) \\\ &= I_N + G\cdot F(k-1) \end{align*}$$ But the more important insight is the following relationship between $F(2l)$ and $F(l)$: $$\begin{align*} F(2l) &= G^0 + G^1 + G^2 + \ldots + G^{2l-1} \\\ &= G^0 + G^1 + G^2 + \ldots + G^{l-1} + G^l + G^{l+1} + \ldots + G^{2l-1} \\\ &= (G^0 + G^1 + G^2 + \ldots + G^{l-1}) + G^l(G^0 + G^1 + \ldots + G^{l-1}) \\\ &= (I_N + G^l)(G^0 + G^1 + G^2 + \ldots + G^{l-1}) \\\ &= (I_N + G^l)\cdot F(l) \end{align*}$$ This relation gives us the following "fast exponentiation" analog algorithm to compute $F(k)$:

  • If $k = 0$, return $O_N$. (the zero matrix)
  • If $k$ is odd, compute $F(k) = I_N + G\cdot F(k-1)$.
  • If $k$ is even, compute $F(k) = (I_N + G^{k/2})\cdot F(k/2)$.

Unfortunately, this doesn't quite run in $O(\log k)$ multiplications because we will need to compute $G^{k/2}$ every time. Computing $G^{k/2}$ using normal fast exponentiation every time will give us $O(\log^2 k)$ multiplications. But it can actually be reduced to just $O(\log k)$ by computing $G^k$ alongside $F(k)$. More specifically, instead of computing just $F(k)$, we compute $F(k)$ and $G^k$:

  • If $k = 0$, then $F(k) = O_N$ and $G^k = I_N$.
  • If $k$ is odd, first compute $F(k-1)$ and $G^{k-1}$. Then $F(k) = I_N + G\cdot F(k-1)$ and $G^k = G\cdot G^{k-1}$.
  • If $k$ is even, first compute $F(k/2)$ and $G^{k/2}$. Then $F(k) = (I_N + G^{k/2})\cdot F(k/2)$ and $G^k = G^{k/2}\cdot G^{k/2}$.

This now uses just $O(\log k)$ multiplications!

Since naïve matrix multiplication runs in $O(N^3)$ time, the overall running time of computing $P := F(K+1)$ is then $O(N^3 \log K)$.

Number of ways to retire spaceships

After building the spaceships, the Empire's weapon now destroys most of them. Specifically, every group of $X$ spaceships travelling between the same pair of stars will be destroyed. This simply means you should reduce each $P(i,j)$ modulo $X$!

At this point, Chef now wants to retire as many spaceships as possible such that it is still possible to get from any star to any other star.

Given a multigraph, what is the maximum number of edges you can remove so that it is still connected? We can rephrase this as: Given a multigraph, what is the minimum number of edges you can keep so that the graph is connected? The latter version is much easier to answer: The answer is simply $N-1$, and the remaining edges must form a tree, actually a spanning tree. Thus, the answer to this problems is the number of spanning trees of the multigraph where there are $P(i,j) \bmod X$ edges between every pair of nodes $(i,j)$!

Counting spanning trees is multigraphs is a standard problem which can be solved by the well-known Kirchhoff's matrix-tree theorem. It says that the number of spanning trees can be obtained as a determinant of a specific matrix:

  • Construct a new $N\times N$ matrix $L$, called the Laplacian matrix, as follows: $L_{i,i}$ is the degree of node $i$, and if there are $k$ edges between $i$ and $j$, then $L_{i,j} = -k$.
  • Remove the last row and column of the matrix, leaving an $(N-1)\times (N-1)$ matrix.
  • The number of spanning trees is the determinant of this matrix.

This is quite elegant in my opinion, and its proof is actually accessible/understandablem. A proof outline is given in its Wikipedia page.

Converting this into an algorithm is quite straightforward. $L$ itself is quite easy to construct. The real challenge lies in computing the determinant. One can usually do this in $O(N^3)$ with Gaussian elimination. However, we soon encounter a few problems when doing Gaussian elimination:

  • The answer can be very large. Indeed, It can be proven that the maximum possible answer is $(X-1)^{N-1}\cdot N^{N-2}$. (Note that $N^{N-2}$ is the number of spanning trees in a complete graph of $N$ nodes.) It means we will need to represent the coefficients using big ints.
  • What's more, the coefficients are not guaranteed to be integers! This means we may have to represent the coefficients as fractions of big ints. Normal Gaussian elimination avoids this by representing the coefficients as floating-point numbers, but in this problem, we need the exact answer, so we can't do that.

This means that the running time is not just $O(N^3)$, but $O(N^5)$ (assuming $N$ and $X$ fits in a machine word), because our intermediate numbers will have $O(N)$ digits, and we will need to multiply them in $O(N^2)$ time. This might be very slow for the final subtask. I tried implementing it but I didn't get it accepted no matter how hard I tried. Maybe you were able to get this method accepted, but instead we will show you a faster way.

Computing the big answer

Note that the problem will be much easier if the problem only asked for the answer modulo some prime $p$, say the usual $10^9 + 7$. We can perform Gaussian elimination in the world "modulo $p$" just fine, and what we get will be the determinant of the matrix modulo $p$. This works because division by a nonzero number modulo $p$ is possible and well-defined. This is not true when the modulus is not prime. Another advantage of only asking the answer modulo $10^9 + 7$, is that we don't need bigints to compute the answer!

Now, the problem asks for the full answer. But suppose we know that the answer is less than $10^9$. Note that Gaussian multiplication doesn't guarantee that intermediate values are smaller than the final answer, so you'll probably still need big ints and fractions to do it. But instead, we can take advantage of the fact that if $x < 10^9$, then $x \bmod (10^9 + 7) = x$, so we can work modulo $10^9 + 7$ instead! This saves us the need to use big ints and fractions.

Note that we can adapt this method as long as we can put an upper bound to the answer. Specifically, suppose we know that the answer is less than some upper bound, $U$. Then we can select any modulus $M > U$ (not necessarily prime) and just work modulo $M$, and the answer will still be correct! Unfortunately, in the current problem, the upper bound can be very, very large. Our best upper bound so far is $7777776^{99}\cdot 100^{98}$, and it has $879$ digits. Sure, we can find a prime number with more than $879$ digits and work modulo that prime, but this is still slow, because multiplying numbers that large is slow! (And actually, the running time is still $O(N^5)$.)

The solution is to instead work modulo many small primes $p_1, p_2, \ldots, p_m$ and then combine the results using Chinese remainder theorem. Specifically,

  • Select a bunch of primes $p_1, p_2, p_3, \ldots, p_m$ such that the product $p_1\cdot p_2\cdot p_3\cdots p_m$ has more than $879$ digits. Ideally, each $p_i$ must be around $10^9$ so we can work modulo $p_i$ without needing big ints.
  • Compute the determinant of the matrix modulo each $p_i$.
  • Use the Chinese remainder theorem to compute the determinant modulo $p_1\cdot p_2\cdot p_3\cdots p_m$. Since this modulus is greater than our upper bound, the result is now the full determinant!

Well, this works, but we still need to outline how to do that last step, which will inevitably require bigints. We will describe one way to do it in $O(m^2)$ time later on.

Now, how fast does this run? Note that we're not multiplying big numbers anymore, except in the last step, so each determinant is computed in actual $O(N^3)$ time. Thus, the running time is $O(mN^3 + m^2)$. But since the final result has $O(N)$ digits, we can select the primes such that $m = O(N)$. Thus, the whole running time is $O(N^4)$, which is at least an order of magnitude faster than $O(N^5)$!

All that remains is to describe that last step.

Chinese remainder theorem

We have $m$ congruence relations, where the $i$th relation is $$x \equiv r_i \pmod{p_i}$$ where the $p_i$s are distinct primes. What we want is to compute $x$ modulo $p_1\cdot p_2\cdot p_3\cdots p_m$. The Chinese remainder theorem guarantees the uniqueness of this value.

The basic version of the Chinese remainder theorem involves only two congruences. Suppose we want to find $x$ modulo $m_1$ and $m_2$, given that $$x \equiv a_1 \pmod{m_1}$$ and $$x \equiv a_2 \pmod{m_2}$$ and $m_1$ and $m_2$ are coprime. Then the theorem says that: $$x \equiv a_1 n_2 m_2 + a_2 n_1 m_1 \pmod{m_1m_2}$$ where $n_2$ is the inverse of $m_2$ modulo $m_1$, and $n_1$ is the inverse of $m_1$ modulo $m_2$. For more than two congruences, repeatedly apply this version.

But since we're talking about lots of primes $p_i$, this means that if we apply the theorem naïvely we will need to compute inverses of numbers modulo some big integers, and that sounds quite difficult. Fortunately, there's another way to combine congruences without this inverse computation. This is called Garner's algorithm, and we'll show how it works for the case $m = 4$, i.e. there are four congruence relations, and we want to compute $x$ modulo $p_1p_2p_3p_4$.

Notice that we can write $x$ modulo $p_1p_2p_3p_4$ uniquely in the form $$x \equiv x_1 + p_1x_2 + p_1p_2x_3 + p_1p_2p_3x_4 \pmod{p_1p_2p_3p_4},$$ where $0 \le x_i < p_i$ for every $i$. Thus, our goal now is to compute these $x_i$s. We can compute them one by one:

  • Reducing the congruence modulo $p_1$, we find that $r_1 \equiv x_1 \pmod{p_1}$, so we now have $x_1$.
  • Reducing the congruence modulo $p_2$, we find that $r_2 \equiv x_1 + p_1x_2 \pmod{p_2}$, or $x_2 \equiv (r_2 - x_1)p_1^{-1} \pmod{p_2}$, so we now have $x_2$.
  • Reducing the congruence modulo $p_3$, we find that $r_3 \equiv x_1 + p_1x_2 + p_1p_2x_3 \pmod{p_3}$, or $x_3 \equiv (r_3 - x_1 - p_1x_2)(p_1p_2)^{-1} \pmod{p_3}$, so we now have $x_3$.
  • Reducing the congruence modulo $p_4$, we find that $r_4 \equiv x_1 + p_1x_2 + p_1p_2x_3 + p_1p_2p_3x_4 \pmod{p_4}$, or $x_4 \equiv (r_4 - x_1 - p_1x_2 - p_1p_2x_3)(p_1p_2p_3)^{-1} \pmod{p_4}$, so we now have $x_4$.

Notice that nowhere in this solution did we need to use big ints whatsoever! We will only need big ints to compute $x$ itself from $$x \equiv x_1 + p_1x_2 + p_1p_2x_3 + p_1p_2p_3x_4 \pmod{p_1p_2p_3p_4},$$ but here, the most complex operation we need is just multiplication! We don't even need to implement full-fledged big-int multiplication, because each multiplicand is around $10^9$, so we just need to implement multiplication by a "small" number! This is the main advantage of Garner's algorithm.

The running time of this algorithm is $O(m^2)$, as long as we write the Garner relation as $$x \equiv x_1 + p_1(x_2 + p_2(x_3 + p_3(x_4)))$$

Time Complexity:

$O(N^3 \log K + N^4)$



This question is marked "community wiki".

asked 10 May '16, 05:32

kevinsogo's gravatar image

accept rate: 11%

edited 17 May '16, 23:34

"But since we're talking about lots of primes pipi, this means that we will need to compute inverses of numbers modulo some big integers"

No, the inverses are computed modulo the individual primes, which are around $10^9$, so it's easy to do without bigints. The bigints are only needed for final multiplication/sum/moduling, and moduling can be done by almost brute force, since the result of multiplication/sum isn't much larger than the modulo / product of primes.


answered 17 May '16, 04:17

xellos0's gravatar image

accept rate: 10%


Good point. I modified the text slightly by adding "if we apply the theorem naïvely".

(17 May '16, 23:35) kevinsogo7★

It's not related to the topic, but something strange happened with May Challenge 2016 rankings.. I solved 8 problems, but still got rating loss..


answered 17 May '16, 15:25

meirambek's gravatar image

accept rate: 0%

@admin tagging admin

(18 May '16, 16:23) mightymercado4★
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text]( "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported
  • mathemetical formulas in Latex between $ symbol

Question tags:


question asked: 10 May '16, 05:32

question was seen: 2,229 times

last updated: 18 May '16, 16:23