EDGEST - Editorial






Author: Denis Anischenko

Tester: Suchan Park

Editorialist: Suchan Park




Segment tree, Heavy-Light Decomposition


There are two trees T_{1} and T_{2} whose vertices are the same. For each edge e_1 of T_1, count the number of edges e_2 of T_2 such that:

  • If we remove e_1 and add e_2 to T_1, it’s a tree
  • If we remove e_2 and add e_1 to T_2, it’s a tree


If we fix e_1, the problem becomes counting edges e_2 such that both endpoints are at different trees of T_1 - \{e_1\}. This condition can be determined in O(1) by storing preorder visiting time of T_1. If we use Heavy-Light Decomposition, we can think we have to count on 1D arrays. Since e_1 is changing, we use segment trees to help counting edges efficiently. To minimize the number of changes of an array, we use small to large technique. (Maybe this is hard to understand, it’s really hard to summarize the concept)


As the problem statement suggests, let’s fix e_1 = (u, v) and think what e_2 = (a, b) \in T_2 should satisfy in order to be counted:

  • In T_1's perspective: e_2 connects two trees formed by removing e_1
  • In T_2's perspective: e_2 lies on the path connecting two endpoints of e_1

The first condition is equivalent to: exactly one of a and b is in the subtree of v. (From here, we suppose u is a parent node of v in T_1) This can be easily checked by doing a preorder traversal on T_1. By ordering the vertices by this method, the subtree of vertex x can be expressed as an interval \left[ exttt{in}[x], exttt{out}[x]\right] where x is the exttt{in}[x]-th visited vertex. To find whether y is inside the subtree of x, we can just check the condition exttt{in}[x] \le exttt{in}[y] \le exttt{out}[x]. So, if exactly one of a and b satisfies this inequality, this edge can be counted.

Now, it seems natural to combine this with the second condition. For all edges along the path u \rightarrow v, whether each edge satisfies the first condition, and increment the answer variable by 1. This idea could be used for subtask #1, but for subtask #2 this is too slow. It seems we have to use some kind of data structures to improve the time complexity.

If e_1 is fixed, it is also determined whether each vertex is inside the subtree of v or not. For each node x, assign 1 if this is inside the subtree of v, 0 otherwise. Then, let’s take a look at the path from u to v of T_2.

For e_2 = (a, b) to be counted, both 0 and 1 should have been assigned on nodes a and b, where the order doesn’t matter. If we think the path as a string of bits (in this case 10010), the number of counted edges are number of adjacent pairs such that their bits are different.

0s and 1s can vary as e_1 change, so we need a data structure that handles update quickly and counts the number of adjacent different pair of bits. Because paths are hard to be dealt with, let’s use Heavy-Light Decomposition to change each path into O(\log n) intervals. (By the style of your implementation, you might need to separately consider edges connecting endpoints of those intervals) Then, the data structure could effectively do these kinds of operations:

  • Update: Change the i-th bit into b
  • Query: Given an interval [l, r], write all bits from position l to r,and count number of adjacent different pair of bits.

This can be handled by a simple segment tree which stores leftmost bit lb, rightmost bit rb and the answer (number of adjacent different pair of bits) s. The exact formula for updating is:

  • lb = left_child.lb
  • rb = right_child.rb
  • s = left_child.s + right_child.s + (1 if left_child.lb != right.child.rb, 0 otherwise)

Now, the remaining thing is to minimize the number of update queries. Since each bit is assigned to 1 if and only if it is contained in the subtree of node v, it is tempting to fix v in preorder traversal. In pseudocode, we can implement the code like this:

fun solve(v) { // v: current root node. We are going to fix e_1 = (parent_1[v], v)
  for each child w of v {
  for each node x inside subtree of v {
    update_data_structure(x, 1)
  for each path p-q in the heavy-light decomposition of T_2 path parent_1[v] ~ v {
    answer for (parent_1[v], v) += number_of_adjacent_different_bits(p, q)
  for each node x inside subtree of v {
    update_data_structure(x, 0)

But this may update the data structure at most O(N^2) times, which will obviously get TLE. However, one thing we can notice is that, when when we call solve for the the last child of v, in the last loop the algorithm updates 0 for the whole subtree of that node. After it returned, in the second loop, the algorithm updates 1 for the whole subtree of v. Obviously, it seems like a waste of precious time, since we are updating the whole subtree of the last child of v twice, meaninglessly:

So, what we can do is to add one more parameter remove, that denotes whether to update 0 to all nodes in the subtree of v. In general, remove will be true, but in the case when the node is the last-visiting child, remove should be false. Also, it would be best to apply this operation to the child whose subtree size is the largest (let’s call this as the heavy child, all other children are light childs). Now, the code becomes:

fun solve(v, remove) { 
  // v: current root node. We are going to fix e_1 = (parent_1[v], v)
  // remove: true if we will remove all initial updates, false otherwise
  for each light child w of v {
    solve(v, true)
  solve(heavy child of v, false)
  for each node x inside subtree of v, but not inside subtree of heavy child of v {
    update_data_structure(x, 1)
  for each path p-q in the heavy-light decomposition of T_2 path parent_1[v] ~ v {
    answer for (parent_1[v], v) += number_of_adjacent_different_bits(p, q)
  if(remove == true) {
    for each node x inside subtree of v {
      update_data_structure(x, 0)

And actually, it turns out this is enough! Why? If the size of the subtree of $v$ is $n$, the size of the subtree of each light child is at most $n / 2$. After being merged, the total size becomes $n$. So, the size became at least doubled, which means there can be at most $O(\log n)$ moves caused by light nodes, for each node. So the total number of updates is $O(n \log n)$, which gives us $O(n \log^2 n)$ time complexity on updating.

Since querying is also $O(n \log^2 n)$ (there are $O(n)$ edges, and for each edge there are $O(\log n)$ path, and for each path it takes $O(\log n)$ time to compute the answer), the total time complexity is $O(n \log^2 n)$.


Author's solution can be found [here][333].

Tester's solution can be found [here][444].

Editorialist's solution can be found [here][555].

[333]: http://www.codechef.com/download/Solutions/MAY18/setter/EDGEST.cpp
[444]: http://www.codechef.com/download/Solutions/MAY18/tester/EDGEST.cpp
[555]: http://www.codechef.com/download/Solutions/MAY18/editorialist/EDGEST.cpp


Shouldn’t we call solve(w, true) in the second line of fun solve instead of solve(v, true)?


Can someone please provide code with comments. I am finding it difficult to understand since segment tree part.