DIFVAL - Editorial

rsrikkanth · June 15, 2020, 3:32pm

I first had the idea explained here, but I could not get it to pass within the time limit, even with O3 optimization. Maybe I’m very bad at constant optimization :(, I even read input character by character to reduce IO time. Also I find it hard to believe that an \mathcal{O}(n \lg^2 n + q \lg n) solution passes, especially since there are 5 test cases with n = 2 . 10^5 each (I checked with assert).

My \mathcal{O}((n + q)\lg n) solution :

Firstly re-number the vertices in dfs order, now every sub-tree is in a continuous range. Specifically sub-tree at X contains nodes (re-numbered) [l, r] where l is the dfs “intime” (number of nodes visited when you enter dfs at node X) and r is dfs “extime” (number of nodes visited by dfs when you leave X for the last time).

The second crucial observation is that given a query X, D, (say depth of X is d), we need to consider all nodes whose depths are between d and d + D and inside the range [l, r] that sub-tree at X represents. We should ignore nodes having depth greater than d+D, however we can leave nodes with depth less than d since they won’t lie in [l, r]. So we shall consider the tree to be empty initially and update node values in bfs order. Once we have inserted all nodes until depth d+D, we are ready for our query. To answer queries online, use persistent structures.

It is really hard to answer "number of distinct values in range [l, r]" in general. But in this case we can use some tricks. Say we have value v at node X. We need to consider v only for the queries concerning the ancestors of v that do not have any other node with value v already inserted before. This ancestor is the one among lca(v_{pred}, v) and lca(v, v_{succ}) with greater depth (where v_{pred} and v_{succ} are predecessor and successor to v in dfs order). We will add 1 to node v and subtract 1 from the ancestor. Now sub-tree sum equals number of distinct values.

We maintain only the versions of the tree after inserting all nodes at depth d for each d.
To answer the query lookup version \min\{d+D, d_{max}\} and calculate range [l, r] sum for that tree.

agron_x · June 15, 2020, 3:34pm

Hi,

I just wanted to know if O(q.log^2n) solution would pass for subtask 2 where the tree is a bamboo. I don’t know why my solution is timing out for subtask 2

https://www.codechef.com/viewsolution/34395961

carre · June 15, 2020, 3:44pm

I struggled to implement this same approach for two days, but I couldn’t get over the idea that it might happen that with every update you need to rebuild the entire persistent tree. What if at each depth d you need to upgrade a large tin node and a low tin node?

carre · June 15, 2020, 3:58pm

The problem was mainly in your Merge function. Also you should avoid to flush the output each time. Check your modified solution and see if you can find the changes

agron_x · June 15, 2020, 4:12pm

Hey thank you so much I just saw the changes (:

carre · June 15, 2020, 4:17pm

did you know that the std library has a merge function? If you have to merge to sorted vectors v1 and v2 and output that in vector v just write this

vector<int> v(v1.size()+v2.size());
std::merge(v1.begin(),v1.end(),v2.begin(),v2.end(),v.begin());

rajarshi_basu · June 15, 2020, 4:33pm

Nop

rajarshi_basu · June 15, 2020, 4:41pm

Thanks for the suggestion. From next time, I’ll include the straightforward version as well at the beginning.
I just write it in this format because often I have felt that the “how to think/motivation” part is as important as the solution. But what you said also makes sense, since experienced coders often might not want all those details .

lakshayv1 · June 15, 2020, 5:30pm

Hi, I went for subtask-2 in this problem. I tried to do it with a segment tree of sets but it gave TLE ( CodeChef: Practical coding for everyone )
I also tried with a segment tree of unordered-set but it also gave TLE (CodeChef: Practical coding for everyone)
Please tell me why it is better to use a segment tree of vectors than sets.
@rajarshi_basu @rahulmysuru7 Can you please have a look at my solution.

aman1776 · June 15, 2020, 5:34pm

Its too long to read,i guess point to point ,direct solution(traditional approach) would make it better.
Moreover we are going to have video editorials ,there we can discuss our first thought process and shortcomings.

l_returns · June 15, 2020, 6:23pm

I had googled this but couldn’t find it, Thanks for sharing

rajarshi_basu · June 15, 2020, 6:27pm

Yep, I realised this might be boring for experienced coders. I will include a more direct solution from next time as well.

wickedknight · June 15, 2020, 6:30pm

in online queries, the queries are known all at once
on the other hand, in offline queries, they are stored in buffer area or something (not read by online judge/grader) and each of them has to be answered to go to the next step

i read about this in geeksforgeeks site sometime back, not too sure

catastrophic25 · June 15, 2020, 7:04pm

Hello, can anyone please provide the links for where to read the topic small to large merging as mentioned in the prerequisites?

rahulmysuru7 · June 15, 2020, 7:21pm

#define endl “\n” did the work for me for subtask 2 .It went from tle to 1.65

afaxnraner · June 15, 2020, 8:00pm

exactly the other way around

wickedknight · June 15, 2020, 8:02pm

ufff
as i said…i read this sometime back. Sorry for the wrong info

abhinav4047 · June 15, 2020, 8:48pm

Can anyone note if i am making any mistake-
Firstly i thought of making a bfs order for the tree and make the persistent seg tree on this bfs order array i.e i mean if the ith index of pst array is 1 it indicates the last occurence of the value
in the node which has i as bfs order index-
now i am intended to build pst for every node and for each query i will query on the given nodes pst
Then just applied the dsu on trees trick that is firstly recursively build the pst s for small child nodes with keep=0 that is when it will return it will not have any effect of the small child nodes on the pst array then similarly i will call the dfs on the big child with keep=1; then will now go to the small child nodes and make updates due to these nodes on the pst of the bigchild. By update i mean if the current node (which i am using to update) value’s previous occurence is somewhere greater than the bfs order index of current node then i will update the current node’s bfs order index to 1 in the pst and set the prev occurence index value in pst to 0 , similarly if the current value has’nt occured previously then will just update current node bfs order index in pst to 1.
This way after all the updates the current version will be the pst for the processing node.

Now the thing is how will i query for different heights of a given node-
Well for this what i thought of doing is to make a sum query call on the pst of the given node from 0 to number of nodes till height( query node’s height+given query height)'th index

Wont it be correct?
I cant find any mistake with this approach

rsrikkanth · June 16, 2020, 3:25am

We need maximum two updates to add a node (say v). One is "add 1 to v". After this update, only ancestors of v that already have another node with same value as v will be incorrect because they will count this node’s value twice. If such a node exists we need another update. If the closest ancestor to v that has another node with same value is a, "add -1 to a". This will correct all the incorrect ancestors because for each we subtracted 1, all other subtrees are unaffected.

rsrikkanth · June 16, 2020, 3:41am

I prefer these kinds of editorials, I could easily follow the idea and I don’t think it can be any more concise and be so. Also if I hadn’t solved it, I would have taken some of the hints and tried to solve it from there. (Actually in this case my O(n \lg^2 n + q \lg n) didn’t pass so I would have been irritated and not read the editorial at all, but otherwise I would have done that)