PROBLEM LINK:Author: Kevin Charles Atienza DIFFICULTY:hard PREREQUISITES:tree dp, merging smaller component with largest component PROBLEM:You are given a graph network consisting of $n$ nodes and $m$ edges. $i$th person uses color $c_i$. For each vertex $i$, you have to answer following quantity  assume If we remove $i$th vertex, how many pairs of vertices with same colors will get disconnected. We call such pairs sad pairs. QUICK EXPLANATION:Solve the problem on a tree EXPLANATION:Solve on a tree Let us root the tree at any vertex (say vertex 1). Let us understand what will be number of sad pairs if we removing vertex 1. Let there be $child$ children of vertex 1. Consider a color $c$, Let $cnt_1$, $cnt_2, \dots cnt_{child}$ denote the number of colors $c$ all the children subtrees of vertex 1. Now number of sad pairs for this color will be sum of $cnt_i * cnt_j$ for $i < j$ childs. Directly iterating over $i, j$ will take $O({child}^2)$ time. Instead, for each $j$, we can find sum of terms multiplied with $cnt_j$ efficiently. Essentially it will be $cnt_1, cnt_2, \dots, cnt_{j  1}$. You can compute this quantity by maintaining a prefix sum. Pseudocode follows.
So, now we know to how to find sad pairs for a given color. Finally total sad pairs will be sum of sad pairs for each distinct color. Let us maintain a Now a we do a dfs from starting node $1$. In the dfs, we will maintain a map as described above. See the pseudo code
Time complexity of this solution will be $O(n^2)$ and similarly space complexity will also e $O(n^2)$. We need to optimize both space and time complexity. Now, we will use a trick of merging smaller subtrees into the largest one at each node. For a very clear understanding of this trick, please see this nice tutorial. If possible, also solve the problems provided in the practice section. For each node, let us call the child whose subtree has largest number of nodes as heavy child (if there are more than one childs, pick any of those). Call the remaining children as lighter children. While computing map for a node, we will merge the maps of lighter children into the map of heavy child. Now, the map corresponding to the current node will be same as that of heavy child.
Let us consider an edge of the tree $(u, v)$  $u$ is parent of $v$. We will call the edge heavy if $v$ is a heavy child of $u$, light otherwise. Let us estimate time and space complexity of this approach. The key, value pair maintained for a color in vertex $v$ is moved into the heavy subtree of parent $u$, if $u$ is light child, i.e. the movement of $(key, value)$ pair is happending across the light edge. Number of light edges in any path from root $1$ to any other vertex can be at max $\mathcal{O}(log$ $ n)$. Hence, it means that each $key$ will be moved at most $log$ $n$ times. Hence, it means that time complexity of this approach will be $\mathcal{O} (n log n) * log n \text{(for operations on map)}$. Note that space complexity of this approach is $\mathcal{O}(n log n)$ too. Please note the very crucial point that when you have merged the ligher subtrees into heavy subtree, then you don't have to create an entirely new map having the content of merged subtree. You can just reuse the memory that you had allocated for heavy subtree. You can implement this by either using references or pointers etc. Please see various solutions for understanding details. Solve the problem on a graph Please note that I have skipped some details on how to compute the sad pairs using the maps created for each subtree. These details should be relatively easier to figure out. Time Complexity:$\mathcal{O}(n + m)$ (for finding blockcut tree) + $\mathcal{O}(n * {log n}^2)$ for solving the problem on created tree. AUTHOR'S AND TESTER'S SOLUTIONS:
This question is marked "community wiki".
asked 15 Jun '16, 11:55

You can solve the question without block cut tree with just DFS in $O(N\sqrt N\log(N))$. Complexity Proof answered 15 Jun '16, 15:50

I have not read block cut tree. But this question can be simply solved by dfs (iterative due to stack over flow) and using John Hopcroft and Robert Tarjan articulation point algorithm's low and depth variable to decide if child node values should contribute to its parent or be deferred to be merged at end after all contributing nodes are merged to its parent. answered 15 Jun '16, 23:30
So basically you do similar Dp as in editorial but on DFStree instead of blockcut tree. right ?
(16 Jun '16, 15:31)
No, it is not a dp. As it does not have overlapping sub problems. A simple dfs is enough.
(17 Jun '16, 01:02)
If you are using map to store frequency count of a node and building it using maps of children, by definition it is dp even if subproblems are not overlapping
(17 Jun '16, 09:11)
Yes you are you are right.
(18 Jun '16, 03:31)

@pb_ds : "virtual tree" is called "auxiliary tree" in this editorial. Here is the code to create an auxiliary tree for a given set of points (from the editorial above):
My approach for the problem : First use tarjan and construct the dfs tree. For every color $c$ find all vertices of color $c$, and create their auxiliary tree. Suppose there are $k$ vertices of color $c$. We are going to consider this color's contribution to every vertex(in the original tree). First calculate subtree sizes. Let's denote this number as $siz[x]$: how many vertices in subtree $x$ in auxiliary tree has color $c$. 1) For a vertex $x$, if it's in the auxiliary tree, it's answer is $\binom{k}{2}\sum_a \binom{s(a)}{2}$, where $a$ is a connected block if $x$ is removed, and $s(a)$ is its size(how many vertices of color $c$ is in $a$). To calculate this value, we need to iterate all children $y$ of $x$. Let $z$ be the vertex in the original tree that is $x$'s child and $y$'s ancestor, if $low[z] \ge dfn[x]$ then minus the answer by $\binom{siz[y]}{2}$. At last let $s_{out}=k\sum_{low[z]\ge dfn[x]}siz[y]$ which is the size of the connected block where root locates, minus the answer by $\binom{s_{out}}{2}$. 2) For a vertex $x$, if it lies on an edge of the auxiliary tree, let's see if removing the edge between $x$ and its parent (in the original tree) makes the graph disconnected. That means $low[x] \ge dfn[parent[x]]$. If it does, remove the edge in the auxiliary tree and it breaks into 2 connected components of size $a$ and $b$. Then the answer is $\binom{a+b}{2}\binom{a}{2}\binom{b}{2}$. 3) For a vertex $x$, if none above, the answer is $0$. Consider how to do 2) efficiently. Let's assign a value for each edge in the original tree. For an edge $(p, c)$($p$ is the parent) in the auxiliary tree calc the value $Value = \binom{k}{2}\binom{siz[c]}{2}\binom{ksiz[c]}{2}$, and add the value of all edges in the original tree $Value$, if this edge fulfills($p'$ is the parent of the edge and $c'$ is the child):
It can be done in linear time, because we only perform add operations and at last query for all edges' values. The overall complexity is $O(m+n\log n)$, because creating auxiliary tree requires a log. answered 17 Jun '16, 06:30

I used Tarjan's algorithm and virtual trees to get $O(n + m) + O(n \log n)$. It runs as quickly as I want. :) The elicitation method of merging runs very quickly too, unlike other $O(\log^2 n)$ algorithms. answered 16 Jun '16, 13:21

Hi I have understood that each key can be copied at most log(n) times in any path.However I haven't understood how that contributes to nlog(n)^2 in the overall time complexity.I think it should rather have been O(n log(n)).(Since there are n distinct keys each copied at most log(n) times).Is there any tutorial which explains this analysis in more detail? answered 02 Apr '17, 13:31

"We call such pairs such pairs". Correct it.. should be sad pairs