EDITORIAL - TRDST (Unofficial)

This was my approach for this problem; I reckon it may not be the most optimal way of solving it, but it worked nevertheless.

If we have the n distances from a node (including the distance 0 to itself), we can calculate the Ki-th largest distance. That number minus 1 is our answer for that node.

Solution:

We fix a root node (say node 1) and calculate the distances of all nodes from it. To obtain the distances measured from one of its children instead, observe that every distance inside that child’s subtree decreases by 1, while every distance outside the subtree increases by 1.

We observe that if we store the preorder traversal of the tree in an array, each subtree occupies a contiguous range in that array. (A postorder traversal would work just as well; let’s use preorder for the sake of this editorial.)

Now the task simplifies to this:

  1. Create an array from the preorder traversal of the tree and store the start and end array indices for each node of the tree - DFS - O(N)

  2. Calculate the distances of all the nodes from the root node (including itself, at distance 0) - DFS - O(N). (A sketch combining steps 1 and 2 follows this list.)

  3. Using DFS, whenever we go from a parent node to a child node, subtract 1 from the range corresponding to the child’s subtree and add 1 to the remaining positions (the ranges to its left and right). We will do this using sqrt decomposition on the array, discussed below.

  4. Calculate the Ki th largest value in the complete array. Data structure and algorithm is discussed below.
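
As a minimal sketch of steps 1 and 2 (the names tin, tout and distRoot are mine, not from the actual solution), one DFS can compute both the preorder ranges and the root distances; the array used in step 3 is then just a[tin[v]] = distRoot[v]:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Steps 1 and 2 in a single DFS: tin[v]/tout[v] give the (inclusive) preorder
// range of v's subtree, and distRoot[v] is v's distance from the root.
int timer_ = 0;
vector<vector<int>> adj;
vector<int> tin, tout, distRoot;

void dfs(int v, int parent, int d) {
    tin[v] = timer_++;
    distRoot[v] = d;
    for (int to : adj[v])
        if (to != parent) dfs(to, v, d + 1);
    tout[v] = timer_ - 1;          // subtree of v occupies preorder indices [tin[v], tout[v]]
}

int main() {
    int n = 5;                     // tiny example tree: edges 0-1, 0-2, 1-3, 1-4
    adj.assign(n, {});
    vector<pair<int,int>> edges = {{0,1},{0,2},{1,3},{1,4}};
    for (auto [u, v] : edges) { adj[u].push_back(v); adj[v].push_back(u); }
    tin.assign(n, 0); tout.assign(n, 0); distRoot.assign(n, 0);
    dfs(0, -1, 0);
    for (int v = 0; v < n; v++)
        cout << v << ": range [" << tin[v] << ", " << tout[v] << "], dist " << distRoot[v] << "\n";
    return 0;
}
```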

Data Structure:

We will be working on the array that we have created. Let’s take a bucket size b and partition the array into buckets of b elements each (the last bucket may be smaller).

For each bucket we store two things: (1) a sorted list of the values in that bucket (a C++ vector), and (2) a lazy value, initially 0; we’ll discuss its use soon.

Let’s look at the range update. Exactly like sqrt decomposition, for every bucket completely inside the range we just add or subtract 1 on the lazy value of that bucket. For a bucket only partially covered by the range, we rebuild it: we apply the update to the affected values brute force and re-sort the bucket. Note that the lazy value is not changed in this case.

Time Complexity - O(N/b + b log(b)) (at most N/b buckets get a lazy update, and at most two partial buckets are rebuilt in O(b log(b)) each)
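
Assuming this layout, the bucket structure and the range update could look roughly like the sketch below; the struct name Buckets and all member names are mine, purely for illustration.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Sketch only: one possible shape of the bucket structure and the range update.
struct Buckets {
    int n, b;
    vector<int> a;                 // the preorder array of distances
    vector<int> lazy;              // pending addition shared by a whole bucket
    vector<vector<int>> sorted_;   // sorted copy of the stored values of each bucket

    Buckets(const vector<int>& init, int bucketSize)
        : n(init.size()), b(bucketSize), a(init),
          lazy((n + b - 1) / b, 0), sorted_((n + b - 1) / b) {
        for (int k = 0; k < (int)sorted_.size(); k++) rebuild(k);
    }
    void rebuild(int k) {          // re-sort one bucket after a partial (brute-force) update
        int lo = k * b, hi = min(n, lo + b);
        sorted_[k].assign(a.begin() + lo, a.begin() + hi);
        sort(sorted_[k].begin(), sorted_[k].end());
    }
    // Add delta to positions l..r (inclusive). Fully covered buckets only get their
    // lazy tag changed; the (at most two) partial buckets are rebuilt brute force.
    void rangeAdd(int l, int r, int delta) {
        if (l > r) return;
        int kl = l / b, kr = r / b;
        if (kl == kr) {
            for (int i = l; i <= r; i++) a[i] += delta;
            rebuild(kl);
            return;
        }
        for (int i = l; i < (kl + 1) * b; i++) a[i] += delta;   // left partial bucket
        rebuild(kl);
        for (int k = kl + 1; k < kr; k++) lazy[k] += delta;     // fully covered buckets
        for (int i = kr * b; i <= r; i++) a[i] += delta;        // right partial bucket
        rebuild(kr);
    }
};
```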

For querying the kth largest in the current array, we do the following. Assume for the time being that all lazy values are 0. I’ll describe a binary-search approach to get the kth largest element. Define a function lessthanx that returns the number of items in the whole array less than x. We can compute this simply by calling lower_bound on every bucket, which takes O((N/b) * log(b)).
Now remove the restriction on the lazy values. The lazy value doesn’t alter the relative order of the values inside a bucket, so while answering lessthanx we can simply call lower_bound with x - lazy instead.

We can then binary search on x to find the value (a distance) that is the kth largest element in the whole array.
Time Complexity - O(log(N) * (N/b) * log(b))
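
Continuing the hypothetical Buckets sketch from above, lessthanx and the binary search for the kth largest could be written roughly as follows; distances always lie in [0, N], so an integer binary search over that range suffices.

```cpp
// Additional members for the Buckets sketch above (again, names are mine).

// Number of stored values whose true value (stored + lazy) is < x.
// The lazy tag shifts a whole bucket uniformly, so we can search for x - lazy[k]
// in the bucket's sorted list instead of touching the values themselves.
int lessThanX(int x) const {
    int cnt = 0;
    for (int k = 0; k < (int)sorted_.size(); k++)
        cnt += lower_bound(sorted_[k].begin(), sorted_[k].end(), x - lazy[k]) - sorted_[k].begin();
    return cnt;
}

// k-th largest value (1-indexed): the largest x such that at least k values are >= x.
int kthLargest(int k) const {
    int lo = 0, hi = n, ans = 0;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (n - lessThanX(mid) >= k) { ans = mid; lo = mid + 1; }
        else hi = mid - 1;
    }
    return ans;
}
```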

We have to do all of the following in our main DFS:

  1. update: the subtree range -1 and rest of the ranges +1

  2. query on the current array for the required kth largest element

  3. un-update: the subtree range +1 and rest of the ranges -1 (a sketch of this DFS follows the list)
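
Putting the sketches together, the main DFS could look roughly like this. It reuses the hypothetical Buckets structure and the tin/tout ranges from above, plus two assumed arrays: K (the required rank for each node) and answer (the per-node results).

```cpp
// Sketch only. bs initially holds distRoot in preorder order, i.e. the distances
// of every node from the root; K[v] is the required rank and answer[v] the output.
void solveDfs(int v, int parent, Buckets& bs,
              const vector<vector<int>>& adj, const vector<int>& tin,
              const vector<int>& tout, const vector<int>& K, vector<int>& answer) {
    // Invariant: on entry, bs stores the distance of every node from v.
    answer[v] = bs.kthLargest(K[v]) - 1;
    for (int to : adj[v]) {
        if (to == parent) continue;
        // Re-root from v to the child `to`: its subtree gets 1 closer, everything else 1 farther.
        bs.rangeAdd(tin[to], tout[to], -1);
        bs.rangeAdd(0, tin[to] - 1, +1);
        bs.rangeAdd(tout[to] + 1, bs.n - 1, +1);
        solveDfs(to, v, bs, adj, tin, tout, K, answer);
        // Un-update before moving on to the next child.
        bs.rangeAdd(tin[to], tout[to], +1);
        bs.rangeAdd(0, tin[to] - 1, -1);
        bs.rangeAdd(tout[to] + 1, bs.n - 1, -1);
    }
}
```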

Total time complexity - N * (N/b + b log(b) + log(N) * (N/b) * log(b))
= O(N * (b log(b) + log(N) * (N/b) * log(b)))

We can choose the value of b to minimize the time complexity; balancing the two terms inside the parentheses suggests b around sqrt(N log(N)).

Hope this helps! Suggestions are welcome!


Did anyone else do this with centroid decomposition+persistent seg tree?

Centroid decomposition can be used without any complex data structures. Some sorted lists and binary search will do.

I solved it by making a centroid decomposition of the tree. For each centroid node, we store a sorted list of the distances from that centroid to the nodes in its component; call this the centroid list. Also, each node directly connected to the centroid is the root of some subtree within that centroid’s component; for each of these, store a sorted list of the distances from the centroid to the nodes of that subtree. Call these the subtree lists.

Now, for any query node, we can find the number of nodes whose distance from it is > X for a given X. This is done by taking the O(log n) centroids the query node lies under in the centroid tree, binary searching for X in each centroid list, and then removing the over-counted nodes from the subtree containing the query node by binary searching in the corresponding subtree list.

Finally, we binary search on X for each node using this counting routine to find its answer. This gives a complexity of O(n * log^3(n)), with quite good constant factors.

Fractional cascading can be applied to achieve a complexity of O(n log^2(n)). However, this is not required.

I have passed over minor details but this explanation may be sufficiently detailed if one is familiar with centroid decomposition already.
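
In case it helps, below is a rough, self-contained sketch of the counting structure described above. It is not the poster’s actual code; every name is illustrative, and the final per-node binary search over X is left out.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Rough sketch of the counting structure described above (centroid list plus
// per-branch subtree lists). All names are illustrative.
struct CentroidCounter {
    int n;
    vector<vector<int>> adj;
    vector<int> sz;
    vector<bool> removed;
    // For every node u, one record per centroid "ancestor" c of u:
    // {c, dist(c, u), index of the branch of c containing u, or -1 if u == c}.
    vector<vector<array<int, 3>>> recs;
    vector<vector<int>> centList;          // sorted distances from c to its whole component
    vector<vector<vector<int>>> subList;   // subList[c][branch]: sorted distances within one branch

    CentroidCounter(const vector<vector<int>>& g)
        : n(g.size()), adj(g), sz(n), removed(n, false),
          recs(n), centList(n), subList(n) { build(0); }

    int calcSize(int v, int p) {
        sz[v] = 1;
        for (int to : adj[v]) if (to != p && !removed[to]) sz[v] += calcSize(to, v);
        return sz[v];
    }
    int findCentroid(int v, int p, int total) {
        for (int to : adj[v])
            if (to != p && !removed[to] && sz[to] > total / 2)
                return findCentroid(to, v, total);
        return v;
    }
    void collect(int v, int p, int d, int c, int branch) {
        recs[v].push_back({c, d, branch});
        centList[c].push_back(d);
        subList[c][branch].push_back(d);
        for (int to : adj[v]) if (to != p && !removed[to]) collect(to, v, d + 1, c, branch);
    }
    void build(int v) {
        int c = findCentroid(v, -1, calcSize(v, -1));
        recs[c].push_back({c, 0, -1});
        centList[c].push_back(0);
        for (int to : adj[c]) {
            if (removed[to]) continue;
            subList[c].push_back({});
            collect(to, c, 1, c, (int)subList[c].size() - 1);
        }
        sort(centList[c].begin(), centList[c].end());
        for (auto& s : subList[c]) sort(s.begin(), s.end());
        removed[c] = true;
        for (int to : adj[c]) if (!removed[to]) build(to);
    }
    // Number of nodes (including u itself) at distance <= x from u.
    long long countWithin(int u, int x) const {
        long long cnt = 0;
        for (const auto& r : recs[u]) {
            int c = r[0], rem = x - r[1], branch = r[2];
            if (rem < 0) continue;
            cnt += upper_bound(centList[c].begin(), centList[c].end(), rem) - centList[c].begin();
            if (branch >= 0)
                cnt -= upper_bound(subList[c][branch].begin(), subList[c][branch].end(), rem)
                       - subList[c][branch].begin();
        }
        return cnt;
    }
};
```

To get a node’s final answer, one would then binary search on X using this count; the number of nodes farther than X from u is simply n - countWithin(u, X), which matches the O(n * log^3(n)) bound above.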

My soln -

Read the problem statement.

Realize that I have solved a similar problem before that was almost the same except for the initial binary search.

Code it and after a few implementation mistakes get it AC.

BTW, here’s that problem: https://www.hackerearth.com/challenges/competitive/november-easy-18/problems/

I thought of the exact same solution but didn’t implement it because of one issue: the optimal value of the bucket size (b). How did you find it?

@aryanc403 That’s a lot of searching :smiley: I’ll have a look at the video, seems interesting. Thanks!

Google Search.

wtf do you google search bro. Plz tell :stuck_out_tongue:

Don’t try to dodge the question :stuck_out_tongue:

Please add the word “Unofficial” to the title. People may believe that this is the official editorial just like I did.


It never occurred to me. Thanks!

I also got this question from Tanuj Khattar’s Quora blog.

@aryanc403 can u explain your approach?

Thanks for sharing this approach! This makes the problem much more approachable.


You’re welcome. I concur that this isn’t the most optimal way of solving it, but my aim was to encourage people to explain their approaches. And fair enough, I now know what Centroid Decomposition is :smiley:

Well, you can estimate the complexity of your code (along with constants) and minimize that function of b. Or, you can ask the online judge if your bucket size is fine :wink:

“Asking the online judge” is just the perfect way to achieve it. XD

To achieve O(n log^2(n)), you can simply store the distance from a node u to all its ancestors before the binary search …

I’m sorry I didn’t understand what you mean by asking the online judge. I see people often refer to the online judge; could you please tell me what it is?

It just means submitting your solution with different bucket sizes. One of them will surely be the optimal one. XD