CLOSEFAR - Editorial

animesh_f · July 30, 2016, 10:13pm

PROBLEM LINK:

Practice

Contest

Author: Animesh Fatehpuria

Tester: Pushkar Mishra

Editorialist: Animesh Fatehpuria

DIFFICULTY:

Medium

PREREQUISITES:

Mo’s Algorithm, Segment Trees, Lowest Common Ancestor

PROBLEM:

You are given a Tree with N weighted nodes and you have to deal with two types of queries efficiently. The first query is to report the closest two values in a path between two nodes, and the second query is to report the farthest two values in a path between two nodes. There are total Q queries and N, Q \le 35000.

QUICK EXPLANATION

Linearize the tree by doing a dfs traversal of the tree. Note that we’ll do this a little differently to record the enter and exit time of each node i.e each node will appear twice in the linearized array. Now apply Mo’s Algorithm to sort the queries in a desired order. While expanding or contracting the window, we can check the number of times the node has been seen in this window and correspondingly add or subtract 1 in the particular position of the segment tree. The segment tree just needs to report the maximum difference between any two adjacent on values.

EXPLANATION

Note that for subtask 1 the solution is simple brute force.

For Subtask 2, we can build an lca like data structure precomputing 2^i ancestors of each node, and the maximum/minimum values in the path from each node to this ancestor. Then we can decompose each query (u, v) into u to lca(u, v) and v to lca(u, v) and compute the answer.

It is clear that type F queries can be easily handled.

For Type C queries, let us first look at a brute force solution.

First of all compress all the values stored at the nodes to a range of [1,N].

For each query of the form u,v do a dfs from node u and maintain a boolean array A such that A[i] = 1, if there exists a node having value i on the path from u to current node. On reaching the node v during dfs, the boolean array would represent what all values are present on the path from u to v. Now we can just iterate over the boolean array and for every consecutive i, j such that A[i] and A[j] are both 1, we can keep min of (j - i).

We can fasten up the above solution by maintaining the boolean array A using segment tree. We could use the operations set/unset an element using point update of 1/0 and range max of difference of consecutive set elements. This can easily be done by maintaining the leftmost and rightmost set element in the range of each node.

Now, let’s try to imagine that the queries are on subarrays and not tree-paths. Let us first do a dfs ordering of the given tree (rooted at any arbitrary node) such that each node comes twice in the array, once while entering the dfs and other while exiting the dfs. Now, a path from u to v can be represented as consecutive elements in this array. More on this here.

Hence, if we can implement a function to add a value to a range and a function to remove a value from a range, along with maintaining the answer of the current range, we can easily use MO’s algorithm to solve the problem. To implement the add and remove functions, we could use the segment tree idea described above in the brute force approach. Hence we could use a segment tree along with MO’s algorithm to solve the problem.

Final Complexity: O(Q\sqrt{N}\log{N})

Note : The constraints were such that some brute force solutions passed during the contest, although we had done some testing to ensure that it doesn’t. We had no option but to make the TL very strict, at 3.5 seconds, in order to cut out the brute force solutions. Note that solutions using sets might not pass, as sets and multisets have a large constant factor. We deeply regret any inconvenience caused.

AUTHOR’S SOLUTION

Setter

xellos0 · July 31, 2016, 12:13am

My solution (C queries):

sort the vertices by value, look at differences given by pairs of vertices at distances \le 100 in this array, take the smallest 100; O(100N)
for each query:
- find the LCA
- look at those 100 pairs first, pick the smallest one that’s inside the path (that’s easy to check after the inorder traversal / linearisation); O(100) per query
- if it didn’t find the solution, find all values on the path in O(pathlength), sort them and find the answer as the minimum of adjacent values; O(pathlength\log{pathlength}) per query

This passes in less than 1 second :D. Seriously, after I saw the constraints, I told myself “what, is the intended solution an optimised bruteforce?”.

vsp4 · July 31, 2016, 3:34am

This was my solution during contest: CodeChef: Practical coding for everyone using BST.
I had block size to 180 (almost sqrt 35000), now i saw setters code and changed it to 500 and it passes in practice? CodeChef: Practical coding for everyone Why is this so? I always thought its best to use sqrt n as block size with ((N/M)N + MQ)

Second, i am unable to get it ac with segment tree approach (with either of the block size): CodeChef: Practical coding for everyone Is there any optimization i am missing out?

khatribru · July 31, 2016, 4:00pm

Setters Solution Fail to pass if block size = sqrt(N) .
WhY the setters panel change TL of Problem?

vijayiota77 · July 31, 2016, 5:31pm

sir please check my code ,its not working for 3rd subtask (highly redable code)

{
int c=0;
for(all x in arr)
{

if( x is even)
{
    if(x can be made with only 2's)
     {
        q=q-(x/2);
         c++;
      }
     else if(x can be made with combination of 1 and 2)
     {
          remain=x-2*q;
          if(remain <=p)
           {
             q=0;
             p=p-remain;
             c++;
            }
      }
}
if(x is odd)
{
   int r=(arr[i]-1)/2;
						
if(arr[i]-1<=2*q  && p!=0) //  single one used
  {
c++;
   q=q-r;
    p--;
   }

else
{

       if(arr[i]-(2*q)<=p)//use all q and remaining with use of required p
    {
     c++;
	q=0;
    p=p-(arr[i]-(2*q));
									
}

}
code @
https://www.codechef.com/viewsolution/10980356

krishnkant · July 31, 2016, 10:19pm

Could somebody please help me what optimisations can I do in my solution:

1.)Assume node 1 to be the root of tree.

2.)Apply bfs on the tree and mark parent of each node in an array parent[].

3.)for each Query of u,v find LCA (Lowest Common Ancestor) : if u>v LCA(parent[u],v) , if v>u LCA(parent[v],u) , else u.

4.)get all nodes from u to lca and lca to v (lca inclusive) in an array A.

5.)sort the array A.

6.)if Query is F type answer is A[last]-A[0]. else ans is minimum diference between any two consecutive elements of array A.

also I would be very grateful if I could get a link to understand Mo’s Algorithm (learning it first time),
and its application in this problem .

kishu_invain_1 · July 31, 2016, 10:46pm

I haven’t even tried to solve this problem…
But i’m very excited to ask whether we could save any time by reordering the queries using MO’s algorithm in addition to segment trees??
Segment trees alone couldn’t save us from encountering TLE…??
MO’s algorithm is a must here… ??

pushkarmishra · August 4, 2016, 5:39pm

There has been intense discussion about the testing of this problem, and as the contest admin, I think I need to mention some things. I believe that it is important for the contestants to know what exactly was the issue that the panel faced.

So when we decided to include this question in the problem set, 5 days were left for the contest and we needed a problem of this type, i.e., data structures. We didn’t have many options so we decided to go with this one. But we had this realization at the back of our minds that N\log N*\sqrt N is quite close to N^2 specially when the constants in the former are high (as were in this case). Nevertheless, we went on with this thinking that we will make large data sets and keep the TL high. But then the point came in that we can’t really keep the TL above 8s because that would jam the grader. Hence, we decided to go with TL of 7.5 even when author’s own solution passed in 3.5. This was because author’s solution used segment tree instead of normal sets and that increased the coding by a substantial amount. So we decided that we should keep the limits such that sets/multisets can also pass.

In all this, when I tested the data, I tested it with an N^2 solution which TLEd. I tried modifying that N^2 to do some amount of pruning, etc. but still could not get the solution to pass. I did not test the problem with an intended N\log N*\sqrt N solution (that is why tester solution is not there). However, some amount of time also went in actually coding the same solutions in Java as we did in C++ because we didn’t want people to take advantage of higher TL.

Then in the contest, when @LeBron submitted a solution that was based on heuristics, we panicked. We desperately tried adding cases that could break that solution but he managed to get around it each time. So we finally tried to lower the time limit to 3.5s (which was the bare minimum we could go to since author’s solution would not pass below that). But LeBron and @xellos0 managed to get around that too.

This was the whole issue that we faced regarding the test data. While I believe that whatever testing I did managed to ensure that wrong solutions didn’t pass, but I do accept that my testing could have been more “informative” for the author in terms of TL, types of cases, etc. However, in 2 years of being admin, tester, editorialist, I have realized it is sometimes really tough to get around or predict heuristic approaches. And in fact it wasn’t easy to break the solutions even when we had seen them, so otherwise would have been really tough.

All in all, even though this problem eventually turned out to be fine – it didn’t get as much hacked as we thought we do – I do want to take this opportunity to apologize for not being able to suggest a proper TL to the author via more rigorous testing. I understand that it is really hurtful to see your problem getting hacked, so I must also apologize to the author for skipping some rigorous testing in the face of time scarcity that we had. We had some heated debates over that issue and it is a good time to say sorry, learn from mistakes, and move on.

Thank you.

ahmadreza1994 · August 14, 2016, 7:01pm

how we can get min|Ap-Aq| after create MOS.algo and DFS with segment tree???!! how create segment with lineariz array and how store in segment and query in this question???!

inarito · March 19, 2017, 3:12pm

Link to setter’s solution is broken, please fix.

Thanks.

Edit: here’s a partially commented solution by bluemmb:

CodeChef: Practical coding for everyone

animesh_f · July 31, 2016, 12:18am

Thank you for sharing your solution!

Actually, the brute force solution coded by our tester was taking > 10 seconds. So we had kept a nice relaxed time limit initially. We did not expect brute to pass in 4 seconds.

As far as your solution is concerned, we do agree that our test data was weak and we apologize for it.

animesh_f · July 31, 2016, 3:39am

Actually, there are 2 copies of each node in the linearized array, so there are 2N nodes, which is why I kept it at 500, although theoretically the best block size should be around 350. (doesn’t make a huge difference in my code) Note that the TL is a bit strict, so block sizes might affect your solution if it’s anyway on the borderline. Try making functions inline() in your segment tree approach and submit it.

rajat1603 · July 31, 2016, 5:23pm

the size of the array on which MO’s algo is done is 2*N and not N , so blocksize should be around sqrt(2N) , in practice , higher block size works better in MO’s algo on tree problems.

meteora · July 31, 2016, 7:37pm

@ tester Can you please provide your code also?

ankit_btech · June 28, 2019, 5:46pm

Hey wait! Does this mean that the first two test cases have simply linear trees?