PROBLEM LINK:
Author: Utkarsh Lath
Tester: Tom Chen
Editorialist: Ajay K. Verma
DIFFICULTY:
Medium
PREREQUISITES:
Heavy light decomposition, Binary index tree, Segment tree, Caterpillar tree
PROBLEM:
We are given a rooted tree, where each node is assigned a unique number in the beginning (however after some operations on the tree, two nodes may have the same number). The weight of an edge is either 0 or 1 depending on whether the end points of the edge are assigned the same number.
We need to support two kind of queries on this tree: The first kind of query takes a node x and assigns all the nodes in the path from root to x the same number y which was not present in the tree (y can take any value which is not present in the tree). The second kind of query takes a node x and returns the average distance from root to a node in the subtree rooted at x (the distance of a node from root is the sum of weights of the edges in the path from root to the node).
QUICK EXPLANATION:
The average distance from root to a node in the subtree rooted at x is the sum of two values:
 distance of x from root, and
 average distance from x to a node in the subtree rooted at x.
Hence, the second kind of query can be split into two separate queries computing (1) and (2).
All the three queries (update query, and the above two queries) can be answered in O (lg N) time in a balanced tree as well as in a caterpillar tree using a combination of binary index tree and segment tree. Hence, an algorithm based on heavylight decomposition can be used to handle the queries in a general tree in O (lg^{2} N) time.
EXPLANATION:
Let us denote the root of the tree by r. The weight of an edge between two nodes a and b is given by c(a, b).
c(a, b) = 1, if a and b have different numbers
= 0, otherwise.
The distance between two nodes a and b is denoted by d(a, b) which is the sum of the weights of all edges in the path from a to b.
If the nodes in the subtree rooted at x are x_{1}, x_{2}, â€¦, x_{m}, then the average distance from root to a node in the subtree can be calculated as follows:
avg = 1/m * [d(r, x_{1}) + d(r, x_{2}) + â€¦+ d(r, x_{m})]
For each node x_{i}, the path from root to x_{i} must pass through x.
Hence d(r, x_{i}) = d(r, x) + d(x, x_{i}). Using this we can simplify the avg as follows:
avg = 1/m * [(d(r, x) + d(x, x_{1})) + (d(r, x) + d(x, x_{2})) + â€¦+ (d(r, x) + d(x, x_{m}))]
avg = d(r, x) + 1/m * [d(x, x_{1}) + d(x, x_{2}) + â€¦+ d(x, x_{m})]
Hence, the computation of average can be done in two steps: In the first step we compute d(r, x), and in the second step we calculate the average distance between x and a node in the subtree rooted at x.
In a more general version of the problem we can assume that we need to support three kind of queries:
 Update query: Assign all node in the path from root to x a new unique number
 Distance Query: Calculate the distance between root r and x
 Average Query (Sum Query): Calculate the average distance between x and a node in the subtree rooted at x.
Since the update query does not affect the topology of the tree, for each node x we can precompute the size of the subtree rooted at x. This will reduce the third query to compute the sum of distances from node x to the nodes in its subtree. From now onwards, we will assume that the third query corresponds to computing sum of distances.
Let us denote the sum of distances in the subtree rooted at x by S(x)
S(x) = d(x, x_{1}) + d(x, x_{2}) + â€¦ + d(x, x_{m})
Now let us see how many times an edge between nodes x_{i} and x_{j} (if exists) is considered in the above sum. The edge (x_{i}, x_{j}) will come in all the paths from x to x_{k}, where x_{k} lies in the subtree rooted at x_{j}.
Hence, S(x) can be written as
S(x) = \sum sz(b) * c(a, b)
here sz(b) denotes the number of nodes in the subtree rooted at b, and c(a, b) is the weight of the edge (a, b). In an abuse of notation, lets call the quantity sz(b) * c(a, b) as the generalized weight of edge (a, b). The above definition of S(x) is more useful as it allows to compute the S(x) of a node by using the S() of its children. Therefore the S() of all nodes can be computed using a single traversal of the nodes in reverse topological order.
If we define T(x) as
T(x) = S(x) + sz(x) * c(parent(x), x), where x is not root
T(x) = S(x), if x is root
Then, we can see that if the children of x are y1 , y2 â€¦yk, then
T(x) = sz(x) * c(parent(x), x) + T(y1) + T(y2) + â€¦+ T(yk)
It can be seen that the update query changes the S(y) for only those y which lie on the path from root to x.
An O(h) algorithm:
In this section, we discuss an algorithm to answer the above queries in O (h) time, where h is the height of the tree, i.e., the length of the longest path from root to a node in the tree.
At each node x, we store the three values:
 sz(x), number of nodes in the subtree rooted at x,
 weight of the edge (parent(x), x), except when x = r,
 T(x), as defined in the previous section, and
 I(x), where I(x) is a children of x such that weight of edge (x, I(x)) is 0, if no such such child exist, then I(x) = 1
Note that, for any x there can be at most one y such that c(x, y) = 0. Hence, I(x) is well defined. This is because in the update query only nodes along a path are assigned the same number, and each update query assigns a unique number. Since the two children of a node x cannot be the part of a single path from root to some node, both of them cannot have the same number as that of node x.
Now, we will discuss how to handle the queries in O(h) time using the above information.
1) Distance query (Computing d(r, x)):
We just need to follow the path from x to root using parent pointers, and for each node y in the way compute the sum of c(parent(y), y), which will give the distance. Since the maximum length of the path is h, this query can be handled in O (h) time.
2) Subtree distance sum query (Computing S(x)):
We can use the simple formula
S (x) = T(x)  sz(x) * c(parent(x), x).
This will take O(1) time.
3) Update query:
Assume that the path from x to root is chosen.
For simplicity we update the metadata in two passes from x to root, although the same can be done in a single pass.
In the first pass we remove the convert the weight of edges from 0 to 1, and update the values (e.g., T(), I()) caused by this, while in the second pass we convert the weight from 1 to 0, and values.
First pass:
delta := 0 curr := x while (true) { y := I(curr) // Since we are assigning a new number to curr, // the weight of the edge (curr, y) should become 1. // Note that, it is possible that y also lies in the // path from root to x, in which case the edge weight // c(curr, y) still remains zero. However, if this is // the case, the edge weight will be changed to zero // in the second pass. if (y != 1) { T(y) += sz(y) // sz(y) * c(curr, y) delta += sz(y) c(curr, y) = 1 I(curr) = 1 } T(curr) += delta if (curr == r) break curr = parent(curr) }
second pass (almost the dual of first pass):
delta := 0 curr := x prev := 1 while (true) { // The earlier pass ensures that all the edges // encountered in this pass have weight 1, and should // be converted to zero. if (prev != 1) { T(prev) = sz(prev) // sz(prev) * c(curr, prev) delta = size(prev) c(curr, prev) = 0 I(curr) = prev } T(curr) += delta if (curr == r) break prev = curr curr = parent(curr) }
Easy to see that complexity of this operation is O(h)
This means that if we have a balanced tree where h = O (lg N), then we can support all operations in O (lg N) time.
Next, we consider the highly unbalanced tree, caterpillar tree, where height is O (N), and show how to handle the queries in O (lg N) time, and later we will combine the two to give an algorithm for general trees.
An O (lg N) algorithm on a caterpillar tree:
A caterpillar tree is a tree, which has a central path, and each node is within distance 1 from the central path. An example is shown below.
Clearly, the height of a caterpillar tree could be O (N), hence the algorithm of previous section will not be efficient here.
We use a variant of binary index tree to perform the three kind of queries in O (lg N) time in this tree. We use the following data structures to achieve the same.
For each node x which is not part of the central path, we store the same information as in the previous section, i.e.,
 sz(x) = 1,
 weight of the edge (parent(x), x),
 T(x), as defined in the previous section, and
 I(x) = 1
Only information (2) and (3) is nontrivial, and in fact both of them are the same. However, for general trees they will be different.
For the central path, we store the following information:

Segment tree EdgeWeightTree, storing the edge weights of the links in the central path. In the above example this segment tree will corresponds to the array
{e0, e1, e2, e3, e4, e5, e6} 
Segment tree GeneralizedEdgeWeightTree, storing the generalized edge weights of links. The generalized edge weight of an edge (a, b) is c(a,b) * sz(b). In the above example this segment tree will corresponds to the array
{e0 * sz(n0), e1 * sz(n1), â€¦, e6 * sz(n6)} 
Binary index tree NonChainT, storing the T(x)â€™s of the nonchain nodes, i.e., the kth element of the array will correspond to the sum of T() values of all nonchain nodes which are the children of kth node. In the above example this binary index tree corresponds to the array
{0, 0, T0 + T1 + T2, 0, T3, T4 + T5, T6, 0} 
ZeroWtEdges, vector of nonchain nodes in decreasing order of height, which have zero weight edge with their parents. Initially, this list will be empty.
Note that we can combine some of the above information into a single segment tree, but for clarity we store them separately.
A binary index tree takes an array and supports querying subarray sums and updating a single value in the array. The reference shows how this can be implemented. Since we are storing the T() values of nonchain nodes in the binary index tree NonChainT, we can easily update any of them, or computing the sum of T() value of nonchain nodes whose whose parents form a subchain of the central path.
A segment tree takes an array, and along with the queries supported by binary index tree, it also supports setting all values in a subarray to be zero. We will show later how to support this extra kind of query. For the time being let us assume that we have a black box data structure which performs these queries in O (lg N) time.
Now lets use see how to support the three kind of queries on the caterpillar tree using these data structures.
Distance query:
The first segment tree EdgeWeightTree, which stores the edge weights can be used to find the path weight of any subchain in O (lg N) time. If the chosen node x is a chainnode, then we need to return the path weight of the subchain [r, x]. Otherwise, let y be the parent of x, y must be a chainnode. We need to return the path weight of subchain [root, y] + c(y, x).
Subtree distance sum query:
If the chosen node x, is a nonchain node, then we can use the formula
S(x) = T(x)  sz(x) * c(parent(x), x)
to computes its S(x).
Now, assume that the node x lies in the central path. We can compute c(parent(x), x) using the segment tree storing edge weights, hence if we can compute T(x) we are done. T(x) can be computed by iterating over all edges of the subtree rooted at x, and adding their generalized weights together. This means
T(x) = sum of T() of all nonchain nodes whose parents are in the subchain [x, tail] + sum of generalized weights of all links in the subchain [x, tail]
Here tail is the last node of the chain. The first part of the sum can be computed using the binary index tree NonChainT, storing T() values of nonchain nodes, and the second part of the sum can be computed using the segment tree GeneralizedEdgeWeightTree storing generalized weights. Both will take O (lg N) time.
Update query:
Once again this can be done in two steps: first step will convert the edge weights from 0 to 1, and the second step will convert the edge weights from 1 to 0.
First pass:
If the chosen node is a nonchain node, then we move to its parent, which must be a chain node, and run the following snippet on it. On the other hand, if this node is a chain node, we directly run the following snippet on it. Let us say that x is the chosen chain node.
// Convert the weight of edges from chain nodes to nonchain nodes from 0 to 1 // These edges are stored in ZeroWtEdges in decreasing order of height while (!ZeroWtEdges.empty()) { y := ZeroWtEdges.back() z := parent(y) if (height(z) > height[x]) break // Since we are assigning a new number to all nodes in the chain [r, x], // the weight the edge (z, y) is going to be changed to 1. c(z, y) = 1 T(y) += sz(y) NonChainT.increment(z, sz(y)) ZeroWtEdges.pop_back() } // The chain link joining x and its successor in the chain, // this edge weight is also going to changed to 1, if it was zero if (x != tail) { y := chain_succ(x) EdgeWeightTree.set(y, 1) GeneralizedEdgeWeightTree.set(y, sz(y)) }
Since the size of vector ZeroWtEdges can be O (N), a single query make take up to O (N lg N) time. However, the amortized time remains O (lg N). This is because each update query adds at most one element in the vector, hence after k queries there can be at most k elements. Hence, we can perform at most k pop operations on this list.
Second pass:
If x is a nonchain node, then we have a new entry in the vector ZeroWtEdges corresponding to the edge (parent(x), x). Since after the first pass all elements of the vector have height greater than that of x, we can just push back x into the vector. Note that, we also need to update the T(x), and hence the entry in NonChainT.
Next, we need to set the edge weights, and generalized edge weights of all links in the subchain [r, x] to zero, where x is the chain node closest to the one chosen in update query. This can be done by the special operation of our segment tree, which allows setting all values in a subarray to zero in O (lg N) time.
This completes the implementation of the three kind of queries. Now let us see how to implement the segment tree which support our desired operations:
Segment tree supporting subarray nulling:
A segment tree is a balanced binary tree where each node stores information about a subarray. If the subarray corresponding to a node x contains the subarray corresponding to a node y, then y must be a successor of x in the segment tree.
In our case each node stores the sum of the corresponding subarray.
In addition to the sum of the subarray, we also store a new boolean is_null at each node, which is true if all elements in the subarray are zero, and false otherwise. Using this variable we can update the segment tree lazily in case of a subarray nulling operation.
// sets the value at index idx to val (nonzero) // The variable in_null_zone is true, if the underlying tree // corresponds to a subarray whose all elements are zero. void set(tree *t, int idx, int val, bool in_null_zone) { if (!t.contains(idx)) { // We are splitting a null zone into two, hence we need // to pass the null information to children. if (in_null_zone) { t>sum = 0 t>is_null = true } return } // debunking a nullzone as a nonzero value is being put in this subarray. if (t>is_null) { in_null_zone = true t>is_null = false } if (t>is_leaf()) { t>sum = val return } set_val(t>left, idx, val, in_null_zone) set_val(t>right, idx, val, in_null_zone) t>sum = t>left>sum + t>right>sum } // Sets all the elements in the subarray [s, e] to zero void create_null_zone(tree *t, int s, int e) { if (t>is_null) return if (!t>overlaps(s, e)) return // The whole subarray becomes zero. if (t>is_contained(s, e)) { t>sum = 0 t>is_null = true return } // Partial overlap, create null zones in children, // and then merge (if possible) create_null_zone(t>left, s, e) create_null_zone(t>right, s, e) t>sum = t>left>sum + t>right>sum t>is_null = t>left>is_null && t>right>is_null } // Computes the prefix sum of the subarray [0, idx] int prefix_sum(tree *t, int idx) { if (!t>overlaps(0, idx)) return 0 if (t>is_zero) return 0 if (t>is_contained(0, idx)) return t>sum return prefix_sum(t>left, idx) + prefix_sum(t>right, idx) }
General tree:
We have seen that if our tree is balanced, or is a caterpillar tree, then we can handle the three kind of queries in O (lg N) time. These approaches can be combined together in such a way that the queries on the general tree can be answered in O (lg^{2} N) time. This is achieved via heavylight decomposition of the tree.
The main idea in the heavy light decomposition is to partition the tree into disjoint chains such that the path from any node to root passes through at most O (lg N) chains. Since each chain can be visualized as a caterpillar tree, the problem then becomes equivalent of solving the problem on O (lg N) caterpillar tree resulting in O(lg^{2} N) complexity.
As an example in the above figure if we have 4 nonsingleton chains C0, C1, C2, C3 whose corresponding trees are caterpillar trees. If we want to pick a node x in C0, and want to calculate its distance from root r, we need to traverse the chains which lie in the path from x to root, and these are {x}, C0, C1, C3. Next, we consider the part of each chain which lie on the path, and compute its weight by solving the problem on that caterpillar tree, and finally combine them together to get the path weight d(r3, x)
d (x, x) + d(r0, parent(x)) + d(r1, parent(r0)) + d(r3, parent(r1)) +
c(parent(x), x) + c(parent(r0), r0) + c(parent(r1), r1)
Note that all the distance queries are on a single caterpillar tree, and all the edge weight queries are for the edges between root of a caterpillar tree and its parent.
In a similar way, the update query can be split among the chains overlapping the path from root to x. We need to perform the following updates on the caterpillar trees:
C0: as if the path from r0 to x is chosen
C1: as if the path from r1 to r0 is chosen
C3: as if the path from r3 to r1 is chosen
For the distance sum query, we need to consider only the chain where x lies, and use the same methods as that in the case of a caterpillar tree to compute the T(x), and then S(x) using the formula mentioned above.
So now the questions is how do we partition the tree into chains such that no more than O (lg N) chains lie on any path from root to some node. The splitting into chains with such property is called heavylight decomposition, and is described here. For a node x, if one of its children y has more than half of the nodes (i.e., sz(y) > 0.5 * sz(x)), then x and y are in the same chain, otherwise they belongs to different chains. Since there can be at most one such child of a node, the chain structure is well defined. Moreover, each nonchain edge (x, y) means the size of the subtree at x is more that twice the size of subtree at y. Hence in any path there can be at most O (lg N) nonchain edges, i.e., at most O (lg N) chains.
As a side note, sometimes the query time on a caterpillar tree depends on the degree of the caterpillar tree as well (e.g., O (d + lg N), O(d * lg N) etc.). Most often, â€śbinarizingâ€ť the tree as mentioned here helps reducing the time complexity to O (lg N) in such cases. However, in this problem such tricks were not required.
TIME COMPLEXITY:
O (lg^{2} N) per query
RELATED PROBLEM:
In the above problem also support the query which takes a node x and returns the average distance between two nodes in the subtree rooted at x.
AUTHORâ€™S AND TESTERâ€™S SOLUTIONS:
Authorâ€™s solution: admins, please upload.
Testerâ€™s solution: admins, please upload.