PROBLEM LINK:

Author: Sergey Kulik

DIFFICULTY: HARD

PREREQUISITES: Tree rotations, Möbius inversion, union-find, sieve

PROBLEM: Given a weighted unrooted tree with $N$ nodes, find the number of unordered pairs $(S,T)$ of nodes such that the greatest common divisor (GCD) of the weights of the edges on the (unique) path from $S$ to $T$ is equal to $1$. Also, given $Q$ queries, where each query changes the weight of some edge, determine the answer after each query.

QUICK EXPLANATION: Let $F(t)$ be the number of unique paths whose weights are all divisible by $t$, and let $G(t)$ be the number of unique paths whose weights have GCD equal to $t$ (so that we are looking for $G(1)$). Then we have: $$F(t) = \sum_{g\ge 1} G(gt)$$ By Möbius inversion, we get: $$G(t) = \sum_{g\ge 1} \mu(g) F(gt)$$ In particular, $$G(1) = \sum_{g\ge 1} \mu(g) F(g)$$ Thus, we only need to compute $F(g)$ for $1 \le g \le 10^6$. Each $F(g)$ can be computed using union-find, and while there are $10^6$ values of $F$ to compute, it can be shown that the total number of union operations involved isn't that large. To handle the queries, we need to compute the changes in the $F(g)$s. We first collect the edges that are not affected by any query and unify them (again using union-find). Then, for the $i$th query, we keep track of which among the remaining edges have weights divisible by $g$, and perform union-find to compute the change in $F(g)$.

EXPLANATION: Let $L$ be the upper bound on the weights of the edges (in the problem, $L = 10^6$). Let $G$ be the number of unique paths whose weights have GCD equal to $1$. To begin with, one can calculate $G$ by traversing the tree from some node $i$ and keeping track of the GCD of the edge weights from $i$ to every other node. Doing this for all nodes $i$ takes $O(N)$ such traversals and therefore $O(N^2 \log L)$ time (the $O(\log L)$ factor is for computing GCDs). For the first subtask, this is acceptable.
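As a concrete reference, here is a minimal Python sketch of this $O(N^2 \log L)$ brute force (not the setter's code; the function name is mine). It runs an iterative DFS from every node, maintaining the running GCD along the path:

```python
from math import gcd

def count_coprime_paths(n, edges):
    """Brute force for the first subtask: from every root, DFS while
    maintaining the GCD of edge weights along the path, and count the
    paths whose GCD is 1.  O(N^2 log L) overall."""
    adj = [[] for _ in range(n)]
    for a, b, w in edges:
        adj[a].append((b, w))
        adj[b].append((a, w))
    total = 0
    for root in range(n):
        stack = [(root, -1, 0)]  # (node, parent, gcd of path from root)
        while stack:
            u, p, g = stack.pop()
            if u != root and g == 1:
                total += 1
            for v, w in adj[u]:
                if v != p:
                    stack.append((v, u, gcd(g, w)))
    return total // 2  # each unordered pair is counted from both endpoints
```

For example, on the path 0—1—2 with weights 2 and 3, only the path 0—2 has GCD $\gcd(2,3)=1$.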
However, for the queries, one cannot simply do this $Q$ times, because it would be too slow. Instead, we should find a way to quickly compute the change in $G$ when an edge weight is updated. The key here is to remember that the only paths affected are those that pass through the updated edge. Thus, one only needs to calculate the number of paths passing through the updated edge whose path GCD is $1$, both before and after the update; the change in $G$ is the difference between the two. Thus, we only need to compute the number of paths that pass through a given edge with path GCD equal to $1$. Now, given an edge $(a,b)$ with weight $w$, how many paths pass through it with GCD equal to $1$? Here are a few observations:

- Any such path runs from some node $S$ on $a$'s side to some node $T$ on $b$'s side, and its GCD is $\gcd(g_S, w, g_T)$, where $g_S$ is the GCD of the part from $S$ to $a$ and $g_T$ is the GCD of the part from $b$ to $T$.
- Since $w$ participates in this GCD, only $\gcd(g_S, w)$ and $\gcd(g_T, w)$ matter, and both of these are divisors of $w$.
The last observation is particularly important, because it reduces the number of distinct GCDs of paths. Indeed, if $C_a(d)$ is the number of paths from $S$ to $a$ whose GCD (combined with $w$) is $d$, and $C_b(d')$ is the number of paths from $b$ to $T$ whose GCD is $d'$, then the number of paths passing through $(a,b)$ with GCD equal to $1$ is the sum of all $C_a(d)C_b(d')$ where $(d,d')$ ranges across all pairs of coprime divisors of $w$. This should be fast, because there shouldn't be very many such pairs! So overall, we now have an $O(N \log L + \sqrt{w} + k)$-time algorithm to compute the updated answer, where $O(\sqrt{w})$ is for factoring $w$ and $k$ is the number of pairs $(d,d')$ above. Doing this for every update should pass subtask 1! The overall running time is $O(N^2 \log L + Q(N \log L + \sqrt{w} + k))$.

Optimizations

Here we describe a few optimizations to this approach, before describing the solution for subtask 2. First, we can improve on enumerating the divisors of $w$ by performing a sieve that runs in $O(L \log \log L)$ time and computes a prime factor of each number from $1$ to $L$. After that, we can enumerate the divisors of any number in time linear in its number of divisors! Next, notice that we don't really care about the exponents of the prime factors, since we only care about whether the GCD is equal to $1$ or not. For example, a number is coprime to $2\cdot 3\cdot 7$ if and only if it is coprime to $2^4\cdot 3^5\cdot 7^{11}$. Thus, we can simply replace all weights by their radicals (the products of their distinct prime factors), including the updated weights in the queries. The radicals can also be precomputed in $O(L \log \log L)$ time using a sieve. There are advantages to doing so. One is that there are far fewer divisors to consider. Indeed, if $c$ is the maximum number of distinct primes whose product is $\le L$, then there are at most $2^c$ divisors! When $L = 10^6$, we have $c = 7$, so the maximum number of divisors is $128$.
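The two sieve-based ideas above can be sketched as follows (a Python sketch with my own function names, not the setter's code): a smallest-prime-factor sieve, divisor enumeration that is linear in the number of divisors, and radical computation.

```python
def build_spf(limit):
    # Smallest-prime-factor sieve: spf[m] is the smallest prime dividing m.
    spf = list(range(limit + 1))
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:  # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    return spf

def divisors(w, spf):
    # Enumerate all divisors of w in time linear in their count,
    # peeling off one prime power at a time via spf.
    divs = [1]
    while w > 1:
        p, e = spf[w], 0
        while w % p == 0:
            w //= p
            e += 1
        divs = [d * p ** k for d in divs for k in range(e + 1)]
    return divs

def radical(w, spf):
    # Product of the distinct prime factors of w.
    r = 1
    while w > 1:
        p = spf[w]
        r *= p
        while w % p == 0:
            w //= p
    return r
```

For instance, `radical(72, spf)` is $2 \cdot 3 = 6$, and a number is coprime to $72$ exactly when it is coprime to $6$.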
We can now also explicitly calculate $k$ above, the number of pairs of coprime divisors $(d,d')$: it is simply $3^c$, since each of the $c$ primes can divide $d$, divide $d'$, or divide neither. Finally, it is also possible to compute the GCD of two numbers $a$ and $b$ in $O(c)$ time instead of $O(\log L)$ by working directly with their (precomputed) prime factorizations rather than with the numbers themselves:
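The editorial's step-by-step description of this GCD routine did not survive; here is one plausible realization (a sketch under my own naming, not necessarily the setter's exact method): keep each radical as its sorted list of distinct primes, and merge the two lists with two pointers in $O(c)$ comparisons. The same prime-list representation also gives the canonical bitmask ordering of divisors used below.

```python
def gcd_of_radicals(primes_a, primes_b):
    # primes_a, primes_b: sorted lists of distinct primes whose products
    # are the two squarefree numbers.  A two-pointer merge collects the
    # common primes in O(c) comparisons.
    i = j = 0
    g = 1
    while i < len(primes_a) and j < len(primes_b):
        if primes_a[i] == primes_b[j]:
            g *= primes_a[i]
            i += 1
            j += 1
        elif primes_a[i] < primes_b[j]:
            i += 1
        else:
            j += 1
    return g

def canonical_divisors(primes):
    # primes: the distinct prime divisors p_1, ..., p_c of w, in order.
    # The i-th divisor is the product of the p_j with bit j of i set.
    c = len(primes)
    divs = [1] * (1 << c)
    for i in range(1, 1 << c):
        low = i & -i  # lowest set bit of i
        divs[i] = divs[i ^ low] * primes[low.bit_length() - 1]
    return divs
```

With this ordering, divisors at indices $i$ and $j$ are coprime exactly when `i & j == 0`, so no GCD call is needed when pairing them.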
Finally, when enumerating the pairs $(d,d')$ above, it is possible to avoid calling the GCD subroutine altogether if one enumerates the divisors of $w$ in a canonical way: if $p_1, \ldots, p_c$ are the prime divisors of $w$, then the $i$th divisor, $0 \le i < 2^c$, should be the product of the primes $p_j$ where the $j$th bit of $i$ is on. For example, the canonical order of the divisors of $70$ is $(1,2,5,10,7,14,35,70)$. Using this ordering, the $i$th and $j$th divisors are coprime if and only if the bitwise AND of $i$ and $j$ is zero! Using these optimizations, the running time reduces to $O(cN^2 + Q(cN + 3^c))$.

Computing the initial $G$ in $O(2^c N \log N)$ time

Let us return our attention to the computation of the initial $G$. Remember that it is the number of paths whose GCD is equal to $1$. Let's root the tree at some node $r$. Then there are three types of paths in the tree:

- paths with $r$ as one endpoint,
- paths whose endpoints belong to the subtree of the same child of $r$, and
- paths whose endpoints belong to the subtrees of two different children of $r$.
Let's try to enumerate each type separately. First, paths with $r$ as an endpoint can be computed quickly using a traversal similar to the one above, so this takes $O(cN)$ time. Next, paths whose endpoints belong to the same child of $r$ can be computed by recursing on every child of $r$. So all that remains are the paths whose endpoints belong to two different children of $r$. Each such path $(S,T)$ consists of two parts: a path from $S$ to $r$ and a path from $r$ to $T$, so we can precompute the GCDs from $r$ to all other nodes in $O(cN)$ time as above. Specifically, for each node $i$, we associate a number $g_i$, which is the GCD of the path from node $i$ to $r$, and we group together the numbers coming from the same child of $r$. The problem now becomes: find the number of coprime pairs $(g_i,g_j)$ that belong to different groups! Suppose we have a method to compute the number of coprime pairs $(g_i,g_j)$, regardless of whether they are in different groups or not. Then we can use this method multiple times to compute the number of coprime pairs lying in different groups: first compute the number of coprime pairs $(g_i,g_j)$ regardless of group, and then subtract the coprime pairs that belong to the same group (by applying the method to every group separately). Thus, what remains is to describe the method itself. An inclusion-exclusion argument can now be used: start with all pairs; for every prime $p$, subtract the pairs in which both numbers are divisible by $p$; for every product of two distinct primes, add those pairs back; and so on, alternating the sign according to the number of distinct primes.
Thus, we need to be able to compute, for each number $d$, the number of $g_i$s divisible by $d$. This can be done by factorizing each $g_i$ and placing it into a bucket for each of its divisors. Since each number has at most $2^c$ divisors, this takes only $O(2^c N)$ time. However, the recursion might still take quadratic time in case the tree is degenerate. The trick here is to select as root the node that minimizes the size of its largest child! Such a node can be found in $O(N)$ time by traversing the tree and keeping track of the sizes of the neighboring subtrees, and when it is selected as root, it can be shown that each child subtree has at most half the size of the original tree! Therefore, the recursion is only $O(\log N)$ levels deep, and the overall algorithm takes $O(2^c N \log N)$ time!

Finally, one can sometimes speed up this step if one can find an edge that splits the tree roughly evenly (say, in a ratio between $1/3$ and $2/3$). In this case, there are only two types of paths: those that stay on one side of the edge, and those that pass through it. The former can be computed recursively, while the latter has already been handled in the previous section and takes $O(cN + 3^c)$ time. This is usually faster than an $O(2^c N)$ step, but unfortunately we can't guarantee the existence of such an edge (consider a star graph), so sometimes one has no choice but to fall back to the $O(2^c N)$ step above. If all of this is implemented well, one might be able to squeeze the program into the time limit! However, the best implementation of the editorialist takes 1.92 seconds in the worst case, which is probably too close for comfort. Instead, we will describe an elegant solution below.

Inclusion-exclusion, again

To obtain a fast solution, we will expand the inclusion-exclusion argument above to the whole tree.
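The method just described can be sketched in a few lines (a Python sketch with hypothetical names; `mu` and `divisors_of` are the Möbius values and divisor enumeration assumed to be precomputed as discussed elsewhere in this editorial):

```python
from math import comb

def coprime_pairs(gs, mu, divisors_of):
    # Inclusion-exclusion: the number of unordered pairs (g_i, g_j) with
    # gcd(g_i, g_j) = 1 equals sum over d of mu(d) * C(cnt[d], 2), where
    # cnt[d] counts how many of the g_i are divisible by d.
    cnt = {}
    for g in gs:
        for d in divisors_of(g):
            cnt[d] = cnt.get(d, 0) + 1
    return sum(mu[d] * comb(m, 2) for d, m in cnt.items())

def cross_group_coprime_pairs(groups, mu, divisors_of):
    # Coprime pairs in different groups = all coprime pairs
    # minus the coprime pairs within each group.
    everything = [g for grp in groups for g in grp]
    total = coprime_pairs(everything, mu, divisors_of)
    return total - sum(coprime_pairs(grp, mu, divisors_of) for grp in groups)
```

Non-squarefree divisors cost nothing here because their Möbius value is zero.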
In general, this technique is applicable whenever you encounter a counting problem involving GCDs. Let $F(g)$ be the number of unique paths whose weights are all divisible by $g$. Then one can express $G$ in terms of the $F(g)$s as follows (proceeding similarly as above): $$G = F(1) - F(2) - F(3) - F(5) + F(6) - F(7) + F(10) - \cdots$$ where $F(d)$ is subtracted or added according to whether $d$ has an odd or even number of distinct prime factors, and the $d$s divisible by a square are skipped.
Thus, we will need to compute $F(g)$ for all $1 \le g \le L$, after which $G$ can be computed quickly! We will describe that below, but first we mention that the above can be written more concisely as: $$G = \sum_{g\ge 1} \mu(g) F(g)$$ where $\mu(g)$ is the Möbius function. The $\mu$s can be computed in $O(L \log \log L)$ time, again using a sieve, so all that really remains is to compute the $F(g)$s.

Computing $F(g)$

For a fixed $g$, one can easily compute $F(g)$ by means of a union-find data structure. First, consider all edges whose weights are divisible by $g$, and let $X_g$ be the graph containing only these edges. Then $F(g)$ is simply the number of pairs of nodes that belong to the same connected component of $X_g$! Thus, if one uses a union-find data structure that keeps track of the sizes of all connected components, then one can compute $F(g)$ easily. However, remember that there are $L$ values $F(g)$ to compute and $N$ nodes, so implementing this naïvely takes $O(LN\alpha(N))$ time, which is not acceptable. The trick here is that the weight of any edge has at most $2^c$ divisors, so each edge appears in at most $2^c$ of the $X_g$s. Furthermore, connected components of size $1$ contribute nothing, so the number of relevant nodes in each $X_g$ is at most twice its number of edges. Since there are $O(2^c N)$ edges in total, the number of make-set and union operations is at most $O(2^c N)$, and it is possible to finish the task in $O(2^c N \alpha(N))$ time!

Finally, we still need to be able to update the answer after each query. We will describe that below.

Updating the $F(g)$s

First, note that when we update the weight of an edge, the edge may be added to some graphs $X_g$ and removed from others. But the union-find data structure only supports fast insertions, not deletions, so we will need to do something else.
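The whole pipeline up to this point — Möbius sieve, per-$g$ union-find, and the final sum — can be sketched as follows (Python, my own naming; `divisors_of` stands for the divisor enumeration described earlier and is passed in as a parameter):

```python
from math import comb

def mobius_sieve(limit):
    # mu[n] starts at 1; each prime p flips the sign of its multiples,
    # and multiples of p^2 are zeroed out.
    mu = [1] * (limit + 1)
    prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    prime[m] = False
                mu[m] = -mu[m]
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0
    return mu

class DSU:
    # Union-find with component sizes; nodes are added lazily, so only
    # nodes incident to an edge of X_g are ever created.
    def __init__(self):
        self.parent, self.size = {}, {}
    def add(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

def compute_F(edges, divisors_of):
    # Bucket each edge under every divisor g of its weight, then run
    # union-find on each X_g; F(g) = sum of C(size, 2) over components.
    by_g = {}
    for a, b, w in edges:
        for g in divisors_of(w):
            by_g.setdefault(g, []).append((a, b))
    F = {}
    for g, es in by_g.items():
        dsu = DSU()
        for a, b in es:
            dsu.add(a); dsu.add(b)
            dsu.union(a, b)
        roots = {dsu.find(x) for x in dsu.parent}
        F[g] = sum(comb(dsu.size[r], 2) for r in roots)
    return F

def initial_G(edges, divisors_of, mu):
    # G = sum over g of mu(g) * F(g); g's with F(g) = 0 are absent.
    return sum(mu[g] * f for g, f in compute_F(edges, divisors_of).items())
```

On the path 0—1—2—3 with weights $2, 3, 6$, this gives $F(1)=6$, $F(2)=2$, $F(3)=3$, $F(6)=1$, hence $G = 6 - 2 - 3 + 1 = 2$, matching the two coprime paths 0—2 and 0—3.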
The trick is to notice that there are only a few queries, $Q$, so at most $Q$ edges will ever change; all the others remain in their respective $X_g$s. Let's call those (at most $Q$) edges the volatile edges, and the rest the tame edges. Thus, the first thing we can do is preprocess the tame edges with union-find, since they do not change as we process the queries. This leaves us with graphs $Y_g$, $1 \le g \le L$, which are like the $X_g$ but contain only the tame edges. The next thing to do is to create a new graph $C_g$, which has a single node for every connected component of $Y_g$. Initially $C_g$ contains no edges, but it will acquire some as we process the queries. Each node of $C_g$ must also remember the size of the component it represents.

And now we process the queries. For the $i$th query, we consider all the queries up to the $i$th to determine the current weights of all the volatile edges. After that, we run union-find on each relevant $C_g$, processing the volatile edges whose current weights are divisible by $g$. Having done so, we can compute $F(g)$ after the $i$th query. Finally, to compute the new $G$, we only need to consider the $F(g)$s that have changed, and there are only $O(2^c Q)$ of those!

How fast does this run? First, note that across all the $C_g$ there are only at most $2^c\cdot 2Q$ relevant nodes, namely those incident to volatile edges. Therefore, running union-find on all of them takes $O(2^c Q \alpha(N))$ time per query. Overall, the algorithm just described runs in $O(2^c(N+Q^2)\alpha(N))$ time, and the solution now passes comfortably within the time limit of the problem!

Time Complexity: $O(L \log \log L + 2^c\cdot(N + Q^2)\alpha(N))$, where:

- $L$ is the upper bound on the edge weights ($10^6$ in the problem),
- $c$ is the maximum number of distinct primes whose product is at most $L$ ($c = 7$ for $L = 10^6$), and
- $\alpha$ is the inverse Ackermann function.
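The tame/volatile scheme above can be sketched end to end as follows (a simplified Python sketch with my own naming, not the setter's code; `divisors_of` and `mu` are the helpers assumed earlier). It preprocesses the tame edges once, and for each query rebuilds only the tiny graphs $C_g$ touched by volatile edges, accumulating the change in $G$ via the Möbius signs:

```python
from math import comb

def find(parent, x):
    # Union-find "find" with path halving; unseen nodes are singletons.
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(parent, size, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra
        size[ra] = size.get(ra, 1) + size.get(rb, 1)

def answer_queries(edges, queries, divisors_of, mu):
    # edges: list of (a, b, w); queries: list of (edge_index, new_w).
    volatile = {i for i, _ in queries}

    # Preprocess the tame edges: per-g union-find (the graphs Y_g).
    parent, size = {}, {}
    for i, (a, b, w) in enumerate(edges):
        if i in volatile:
            continue
        for g in divisors_of(w):
            union(parent.setdefault(g, {}), size.setdefault(g, {}), a, b)
    baseG = 0  # G restricted to the tame edges
    for g in parent:
        roots = {find(parent[g], x) for x in list(parent[g])}
        baseG += mu[g] * sum(comb(size[g].get(r, 1), 2) for r in roots)

    answers, weights = [], {i: edges[i][2] for i in volatile}
    for idx, new_w in queries:
        weights[idx] = new_w
        # Per g, the volatile edges whose current weight g divides.
        affected = {}
        for i in volatile:
            a, b, _ = edges[i]
            for g in divisors_of(weights[i]):
                affected.setdefault(g, []).append((a, b))
        Gq = baseG
        for g, es in affected.items():
            tp, ts = parent.get(g, {}), size.get(g, {})
            p2, s2, reps, d = {}, {}, set(), 0
            for a, b in es:  # C_g: one node per touched tame component
                ra, rb = find(tp, a), find(tp, b)
                for r in (ra, rb):
                    if r not in reps:
                        reps.add(r)
                        s2[r] = ts.get(r, 1)
                        d -= comb(s2[r], 2)  # old contribution to F(g)
                union(p2, s2, ra, rb)
            for r in {find(p2, x) for x in reps}:
                d += comb(s2[r], 2)          # merged contribution
            Gq += mu[g] * d
        answers.append(Gq)
    return answers
```

This sketch recomputes the volatile-edge weights from scratch per query, which already matches the editorial's $O(2^c(N+Q^2)\alpha(N))$ bound.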
AUTHOR'S AND TESTER'S SOLUTIONS:
asked 15 Mar '15, 09:37

I used centroid decomposition along with Möbius inversion to calculate the initial tree's total paths, and then handled the queries in the way described in this tutorial: find the gcd $= 1$ paths that include the changed edge, subtract that from the last answer, and add the gcd $= 1$ paths that include the changed edge's new value. answered 17 Mar '15, 14:21

Setter's solution can't be downloaded. Please fix this issue! answered 19 Mar '15, 03:32
