PROBLEM LINK:Author: Ramazan Rakhmatullin DIFFICULTY:Medium PREREQUISITES:Trees, dynamic programming, DFS PROBLEM:The problem can be reformulated as follows: There is given a tree $T$ with $N$ nodes numbered from $0$ to $N1$, where the $i$th node has a color $color(i)$ and level $level(i)$. The goal is to find for each node $v$ the number of nonempty sequences of nodes: $v_0, v_1, \ldots, v_k$ such that:
We call each such sequence a good sequence. EXPLANATION:First of all, in the input, there is given an upper limit on maximum level, but we're going to ignore completely this variable. Subtask 1In the first subtask, the sum of $N$ over all test cases is at most $5000$. Also, there is an upper limit of $20000$ for the sum of numbers in the output, but we won't use it in the below solution. The general idea is to use dynamic programming on the tree. Just for a minute let's forget about efficiency and just defined the most straightforward recursive equation for the tree. Let $dp[v][c][k]$ be the number of good sequences of length $k$, color $c$ in $v$'s subtree. Let's fix node $v$, and let $result[v]$ be the number of good sequences starting in $v$, i.e. the result for $v$, and $count(c)$ be the number of nodes with color $c$.. Then both $result[v]$ and $dp[v][c][k]$ for each $c$ and $k$ can be computed in the following steps:
Ok, but what are the time and memory complexity of this method. Well, there are $N$ nodes, at most $N$ colors and the maximum length of a sequence can be at most $N$, so it looks like the $dp$ table has to have $N^3$ entries. However, for a fixed color $c$, the maximum length of a sequence of color $c$ is at most $count(c)$. This is true because each two consecutive elements of a sequence have to have consecutive levels and the last node has to have level $0$. Thus, the $dp$ array, in fact, has at most $O(N^2)$ entries. If this approach is implemented with a DFS, then each entry $dp[v][c][k]$ is passed only a constant number of times to its parent, and because there are $O(N^2)$ such entries, the total time complexity is $O(N^2)$. Subtask 2The approach for the first subtask given in this editorial should also pass for the second subtask if only rather than passing everywhere $dp[v][c][k]$ for every color $c$, we do it only for colors occurring in the input. Constraints for this subtasks are pretty much similar, so perhaps there exist other slight improvements over the method from the first subtask. Subtask 3In the third subtask, $N$ is at most $10^5$ and sum over $N$ in all test cases is at most $5 \cdot 10^5$. These constraints are large enough to prevent any quadratic time approaches. One idea for a solution passing them, which should be too slow to pass the final subtask, is to create for each color $c$ a compressed tree containing only nodes of color $c$. A compressed tree here means that if there are $count(c)$ nodes of color $c$, then such a tree should contain $O(count(c))$ nodes and for any two nodes $u$ and $v$ of colors $c$ if $u$ is an ancestor of $v$ in the original tree then it's also an ancestor of $v$ in the compressed tree. All such compressed trees can be created using LCA algorithm in $O(N \cdot \log(N))$ time. After that, dynamic programming approaches can be used to find the results separately for each compressed tree. Subtask 4In the last subtask, $N$ can be up to $5 \cdot 10^5$ and sum of $N$ over all test cases is at most $10^6$, which suggests that this subtask requires a really fast solution, especially considering that the time limit for this problem is 1 second. Fortunately, a really beautiful and easy to implement solution exists. The general idea is to improve the complexity of the method described in the solution for the first subtask (which I suggest read first). Let $count(c)$ be the number of nodes of color $c$ in the tree and $result[v]$ be the number of good sequences starting in $v$, i.e. the final result for $v$. Now, let's assume that we run DFS over the tree staring in the root and that $dp[c][k]$ is the number of good sequences of color $c$ starting in nodes with level $k$ found so far in the tree, and we've just entered node $v$. Let's save the current value of $dp[color(v)][level(v)1]$ into variable $s$ or assign $s := 0$ if $level(v) = 0$. Notice that $s$ now represents the value of $dp[color(v)][level(v)1]$ before entering the $v$'s subtree. Next, let's run $DFS$ for all children of $v$ recursively. After all these recursive calls complete, results for all nodes in $v$'s subtree except $v$ are already computed. The only thing left to do for this subtree is to compute the result for $v$. There are a few cases to consider:
The total time complexity of this method is $O(N)$ because $dp$ table contains $O(N)$ entries and while DFS visits a single node it performs a constant number of operations not counting iterating over children of the current node and recursive calls. For implementation details of this exact approach please refer to Marek's solution and Pawel's (mine) solution using the same variable names as the editorial. AUTHOR'S AND TESTER'S SOLUTIONS:
This question is marked "community wiki".
asked 24 Jun '17, 10:58

Even though I wasn't able to solve this problem during the contest  I was really confident that the idea from subtask 3 could solve this problem. With a lot of submissions, small optimizations, I've passed the tests. I think that some tests could TLE with my approach and I challenge anyone with time to find a good testcase :D. What I did: my_solution
I have to admit that the "good" solution is amazing and this post is just to confirm that the idea from subtask 3 could pass the current tests. Can anyone provide a test case that will TLE? Thanks in advance. answered 27 Jun '17, 03:46
Same here, also did merging of the DP in O(n log n) which luckily passed the last case (link). Interested to find an input that hacks this for a TLE as well.
(27 Jun '17, 16:30)

Someone please explain the use of variable s, from pkacprzak's solution, I didn't get that. answered 25 Jun '17, 10:41

@tapanr97 I will try to explain in my words. First of all everything starts from the explanation of the first subtask. The result is obtained by filling up dp[v][c][k] with the appropriate values. This logic is really good but you need as explained in the editorial O(N^2). Now lets think of a node. While you run dfs when you get to a node you need the values for the children of level  1. After you completed a node you can easily see that the values in the dp table for children is not need anymore. Now getting back to your question the solution relies on a way of getting rid of this extra memory. Let's assume that you get to a node where you have that dp[c][k] computed.
This is what I understood. Please correct me if I am wrong. answered 28 Jun '17, 01:41

Can someone please tell me why my solution is giving TLE? I wrote the exact same code as @pkacprzak. answered 25 Jun '17, 06:39
1
In fact, the real reason is that you used std::endl to output new lines. Here is a submission with using explicitly "\n" new line character: https://www.codechef.com/submit/complete/14338613 You can read about the difference between these methods here: https://stackoverflow.com/a/213977/948918 In this problem the output is quite large and time limit very strict, so it's not a good idea to flush the buffer frequently.
(25 Jun '17, 06:54)
1
Thank you for pointing out. I will keep it in mind.
(25 Jun '17, 07:10)

Can any one help me to find why i get TLE my code is here: http://ideone.com/s9OWbY my approach: dfs with inTime, outTime for every node then sort queries by (club, level) then get the sum using segment tree answered 25 Jun '17, 11:55

@chandyshot variable s is used to substract the the number of sequences having color c and level k which have not started with v answered 26 Jun '17, 14:20

can you please explain the role of variable s? Why we have minus it from the dp array to get the result? answered 26 Jun '17, 20:55

I couldn't understand the use of variable s. Why are we subtracting s from result[v]? Can anyone give proper explaination? answered 27 Jun '17, 23:06

If level(v)≥count(color(v))level(v)≥count(color(v)) then result[v]=0result[v]=0 because there are not enough nodes of color color(v)color(v) to form a good sequence starting in v..can anyone please explain this statement. answered 28 Jun '17, 13:26

If level(v) ≥ count(color(v)) then result[v]=0 because there are not enough nodes of color color(v) to form a good sequence starting in v. can anyone please explain thsi statement? answered 28 Jun '17, 13:27

Editorialist's solution is simple and understandable