### PROBLEM LINK:

**Author:** Sergey Kulik

**Tester:** Hiroto Sekido

**Editorialist:** Kevin Atienza

### DIFFICULTY:

HARD

### PREREQUISITES:

Tree rotations, Möbius inversion, union-find, sieve

### PROBLEM:

Given an *weighted* unrooted tree with N nodes, find the number of unordered pairs (S,T) of nodes such that the greatest common divisor (GCD) of the weights of the edges in the (unique) path from S to T is equal to 1.

Also, given Q queries, where each query changes the weight of some edge, determine the answer after each query.

### QUICK EXPLANATION:

Let F(t) be the number unique paths whose weights are divisible by t, and let G(t) be the number of unique paths whose weights have GCD equal to t (so that we are looking for G(1)). Then we have:

By Möbius inversion, we get:

In particular,

Thus, we only need to compute F(g) for 1 \le g \le 10^6. Each F(g) can be computed using union-find, and while there are 10^6 F's to compute, it can be shown that the number of *union* operations involved isn’t that large.

To handle the queries, we need to compute the changes in the $F(g)$s. We first collect which edges are not affected by any query, and *unify* them (again using union-find). Then, for the $i$th query, we keep track which among the remaining edges have weights divisible by g, and then perform union-find to compute the change in F(g).

### EXPLANATION:

Let L be the upper bound for the weights of the edges (in the problem, L = 10^6).

Let G be the number of unique paths whose weights have GCD equal to 1. To begin with, one can calculate G in by traversing the tree from some node i and keeping track of the GCD of the weights of edges up to i for every node. Doing this for all nodes i will take O(N) such traversals and therefore O(N^2 \log L) time (The O(\log L) part is for computing GCDs). For the first subtask, this is acceptable.

However, for the queries, one cannot simply do this Q times because it will be slow. Instead, we should find a way to quickly compute the change in G when an edge weight is updated. They key here is to remember that the only paths that will be affected are those that pass through the updated edge. Thus, one only needs to calculate the number of paths passing through the updated edge whose path GCD is 1 *before* and *after* the update. The change in G is the difference between the two. Thus, we only need to compute the number of edges that pass through a given edge with path GCD equal to 1.

Now, given an edge (a,b) with weight w, how many paths pass through the edge with GCD equal to 1? Here are a few observations:

- Every path between two nodes (S,T) is composed of three parts: a path from S to a, the edge (a,b), and a path from b to T.
- The GCD of every path from any S to a can be computed in O(N \log L) time by a similar traversal as described above. The same is true for paths from b to any T.
- The GCD of every path is a divisor of w.

The last observation is particularly important, because it reduces the number of *distinct* GCDs of paths. Indeed, if C_a(d) be the number of paths from S to a whose GCD is d, and C_b(d) be the number of paths from b to T whose GCD is d, then the number of paths passing through (a,b) with GCD equal to 1 is the sum of all C_a(d)C_b(d') where (d,d') ranges across all pairs of *coprime* divisors of w. This should be fast, because there shouldn’t be very many such pairs! So overall, we now have an O(N \log L + \sqrt{w} + k) time algorithm to compute the updated answer, where O(\sqrt{w}) is for factoring w, k is the number of pairs (d,d') above. Doing this for every update should pass subtask 1!

The overall running time is O(N^2 \log L + Q(N \log L + \sqrt{w} + k)).

# Optimizations

Here we describe a few optimizations to the approach, before we describe the solution for subtask 2.

First, we can improve on enumerating the divisors of w by performing a sieve that runs in O(L \log \log L) which computes a prime factor of each number from 1 to L. After that, we can enumerate the divisors of every number in linear time in the number of divisors!

Next, notice that we don’t really care about exponents of prime factors, since we only care about whether the GCD is equal to 1 or not. For example, any number is coprime to 2\cdot 3\cdot 7 if and only if it is coprime to 2^4\cdot 3^5\cdot 7^{11}. Thus, we can simply replace all weights by their radicals, including the updated weights in the queries. The radicals can also be precomputed in O(L \log \log L) using a sieve.

There are advantages of doing so. One is that there are much fewer divisors to consider. Indeed, if c is the maximum number of distinct primes whose product is \le L, then there are at most 2^c divisors! When L = 10^6, we have c = 7, so the maximum number of divisors is 128. We can now also explicitly calculate k above, which is the number of pairs of coprime divisors (d,d'): it is simply 3^c. Finally, it is also possible to compute the GCD of two numbers (a,b) in O(c) time instead of O(\log L) using the following algorithm:

- Enumerate all prime divisors of a.
- For each such prime divisor, check if it also divides b.
- Return the product of all such primes.

Finally, when enumerating the pairs (d,d') above, it is possible to avoid calling the GCD subroutine algorithm if one enumerates the divisors of w in a canonical way: For example, if the p_1, \ldots p_c are the prime divisors of w, then the $i$th divisor, 0 \le i < 2^c, should be the product of the primes p_j where the $j$th bit of i is on. For example, the canonical order of the divisors of 70 is (1,2,5,10,7,14,35,70). Using this ordering, the $i$th and $j$th divisor are coprime if and only if the bitwise AND of i and j is zero!

Using these, the running time reduces to O(cN^2 + Q(cN + 3^c)).

# Computing the initial G in O(2^c N \log N) time.

Let us return our attention to the computation of the initial G. Remember that it is the number of paths whose GCD is equal to 1.

Let’s root a tree at some node r. Then there are three types of paths in the tree:

- Paths where one endpoint is r.
- Paths whose endpoints belong to the same child of r.
- Paths whose endpoints belong to two different children of r.

Let’s try to enumerate each type separately. First, paths where one endpoint is r can be computed quickly using a similar traversal as above, so it should take O(cN) time. Next, paths whose endpoints belong to the same child of r can be computed by recursively doing this on every child of r. So all that remains are paths whose endpoints belong to two different children of r.

Each such path (S,T) consists of two parts: a path from S to r and a path from r to T, so we can precompute all GCDs from r to all the other nodes in O(cN) time as above. Specifically, for each node i, we associate a number g_i, which is the GCD of the path from node i to r. And then we group the numbers from the same child of r together. The problem now becomes: find the number of coprime pairs (g_i,g_j) that belong to different groups!

Suppose we have a method to compute the number of coprime pairs (g_i,g_j), regardless of whether they are in different groups or not. Then we can use this method multiple times to compute the number of such coprime pairs in *different* groups. To do so, first compute the number of coprime pairs (g_i,g_j) regardless of group (by using the method), and then subtract all the coprime pairs that belong to the same group (by using the method on every group). Thus, what remains is to describe the method itself.

An inclusion-exclusion argument can now be used:

- At first, we count all the pairs.
- Next, we subtract those pairs whose path GCDs are divisible by at least one prime. By doing so, the only pairs that remain counted are those whose GCD is 1.
- Unfortunately, by doing so, we have
*double-subtracted*the pairs whose GCD are divisible by at least*two*primes! Therefore, we need to add those back. - Unfortunately, by doing so, we have again
*counted*the pairs whose GCD are divisible by at least three primes. Therefore, we need to subtract those. - And so on.

Thus, we need to be able to compute, for each number d, the number of $g_i$s divisible by d. This can be done by factorizing each g_i and placing them into buckets corresponding to each of its divisors.

Since each number has at most 2^c divisors, this should take only O(2^c N) time. However, this still might take quadratic time due to recursion in case the tree is degenerate. The trick here is to *select the root* of the tree as the node that minimizes the size of the maximum child! Such a node can be found in O(N) time by traversing the tree and keeping track of the sizes of each neighboring subtree, and when such a node is selected as root, it can be shown that the size of the children is at most half the size of the original tree! Therefore, the recursion is only at most \log N levels deep, and the overall algorithm takes O(2^c N \log N) time!

Finally, one can sometimes speed up this step if one can find find an *edge* that splits the tree roughly evenly (say, in a ratio between 1/3 and 2/3). In this case, there are only two types of paths: those that are on one side of the edge, and those that pass through the edge. The former can be computed recursively, while the latter has already been described in the previous section and runs in O(cN + 3^c) time. This is usually faster than an O(2^c N) step, but unfortunately we can’t guarantee the existence of such an edge (for example, consider a *star graph*), so sometimes one doesn’t have a choice but to revert to the O(2^c N)-time step above.

Finally, if all these are implemented well, one might be able to *squeeze* their program into the time limit! However, the best implementation of the editorialist takes 1.92 seconds in the worst case, so this is probably too close for comfort. Instead, we will describe an elegant solution below.

# Inclusion-exclusion, again

To obtain a good fast solution, we will expand on the inclusion-exclusion argument above on the whole tree. In general, this technique is usually applicable whenever you encounter a counting problem involving gcd’s.

Let F(g) be the number unique paths whose weights are divisible by g. Then one can express G in terms of the F(g) as follows (proceeding similarly as above):

- At first, we count all the paths. There are F(1) of them.
- Next, we subtract those pairs whose path GCDs are divisible by at least one prime. This is equivalent to F(2) + F(3) + F(5) + \cdots. By doing so, the only paths that remain counted are those whose GCD is 1.
- Unfortunately, by doing so, we have
*double-subtracted*the paths whose GCD are divisible by at least*two*primes! Therefore, we need to add those back. - Unfortunately, by doing so, we have again
*counted*the paths whose GCD are divisible by at least three primes. Therefore, we need to subtract those. - And so on.

Thus, we will need to compute the F(g) for all 1 \le g \le L, and then afterwards G can be computed quickly! We will describe that below, but first we mention that the above can be written more concisely as the following:

where \mu(g) is the Möbius function. The $\mu$s can be computed in O(L \log \log L) time again using a sieve, so all that remains really is to compute all the $F(g)$s.

# Computing F(g)

For a fixed g, one can easily compute F(g) by means of a union-find data structure. First, consider all edges whose weights are divisible by g, and consider the graph X_g that contains only these edges. Then F(g) is simply the number of pairs of nodes in X_g that belong to the same connected component in this graph! Thus, if one uses a union-find data structure to keep track of the sizes of all connected components, then one can compute F(g) easily.

However, remember that there are a total of L $F(g)$s to compute, and there are N nodes, so implementing this naïvely will take O(LN\alpha(N)) time, so it’s not acceptable. The trick here is that the number of divisors of the weight of any edge is at most 2^c, and so each edge will only be considered in at most 2^c $X_g$s. Furthermore, connected components of size 1 are not important for our purposes, therefore the number of relevant nodes for each X_g is at most twice the number of edges in it. Since there are O(2^c N) edges in total, therefore the number of *make set* and *union* operations is at most O(2^c N) time, and it is possible to finish the task in O(2^c N \alpha(N)) time!

Finally, we still need to be able to update the answer after each query. We will describe that below.

# Updating the $F(g)$s

First, note that when we update the weight of an edge, it will possibly be added in some graph $X_g$s and removed in some others. But the union-find data structure only supports fast insertions, not deletions, so we will need to do something else.

The trick is to notice that there are only a few queries, Q, so there are only at most Q edges that will be changed. All the others will remain in their respective $X_g$s. Let’s call those Q edges (at most) the *volatile edges*, and the rest *tame edges*. Thus, the first thing we can do is to preprocess the tame edges with union-find, since they will not change as we process the queries. This leaves us with graphs Y_g, 1 \le g \le L, which are similar to X_g but involves only the tame edges.

The next thing to do is to create a new graph, C_g, which has a single node for every connected component of Y_g. By definition, C_g will contain no edges, but soon it will have some edges as we process the queries. Each node of C_g must also remember the size of the component it’s representing.

And now we process the queries. For the $i$th query, we consider all the queries up to the $i$th query to compute the weights of all the volatile edges at that point. After that, we perform union-find on each of the C_g graphs by processing all the volatile edges. After doing so, we should be able to compute the F(g) after the $i$th query. Finally, to compute the new G, we only consider those $F(g)$s that have changed, since there are only at most O(2^c Q) of those!

How fast does this run? First, note that there are only at most 2^c\cdot 2Q *relevant* nodes we need from all the C_g, i.e. only those that are adjacent to volatile nodes. Therefore, doing union-find on all of those takes O(2^c\cdot Q \alpha(N)) time!

Overall, the algorithm just described runs in O(2^c(N+Q^2)\alpha(N)) time, and the solution now passes comfortably within the time limit of the problem!

### Time Complexity:

O(L \log \log L + 2^c\cdot(N + Q^2)\alpha(N)), where:

- \alpha is the inverse Ackermann function
- L is the upper bound for the weights of the edges.
- c is the maximum number of distinct primes whose product is \le L. For L = 10^6, we have c = 7.