PROBLEM LINK:Author: Istvan Nagy DIFFICULTY:EasyMedium PREREQUISITES:Minimum spanning tree, Kruskal's algorithm, unionfind PROBLEM:You are given a weighted undirected tree. Two nodes are randomly chosen to be connected with an edge of $0$ cost. Find the expected cost of the MST of the resulting graph. QUICK EXPLANATION:We want the value $C_{MST}  \frac{S}{T}$ where
$S$ is the only nontrivial thing to compute. Perform Kruskal's algorithm, but for each connected component, also keep track of its size. Then, $S$ is the sum of $\mathrm{size}(a)\cdot \mathrm{size}(b)\cdot c$ for every step $(a,b,c)$ of the Kruskal algorithm. EXPLANATION:Updating the MSTSuppose we want to connect nodes $i$ and $j$ with an edge of $0$ cost. How will the MST be updated? Here's one equivalent characterization of a (finite) tree: A tree is an acyclic graph with exactly $N1$ edges. This gives us some clear requirements on how to turn the graph back into a tree again. Specifically, upon adding the edge $(i,j,0)$:
Together, this means that we must remove one edge from that cycle. Which edge? Well, we want the resulting graph to have the smallest possible cost, so naturally we want to remove the edge with the largest cost! To summarize, by adding the edge $(i,j,0)$, the cost of the MST reduces by exactly the largest edge weight in the path from $i$ to $j$! Expected value of the new MSTTo answer the question, we want to find the expected cost of the new MST. Let $C_{MST}$ be the cost of the original tree. Then from the above, and from the fact that $(i,j)$ is uniformly chosen, we see that this expected value is simply $C_{MST}  \frac{S}{T}$, where
$T$ is pretty easy to compute, (just be careful with overflow!) so all that remains is to compute $S$. Unfortunately, naïve ways of computing this would be too slow, because there are $O(N^2)$ paths! So instead of computing the sum across paths, let's try to compute the sum across all edges, and just figure out how many paths have that edge as the maximumcost edge. In other words, $$S = \sum_{\text{$(a,b,c)$ is an edge}} c\cdot F(a,b,c)$$ where $F(a,b,c)$ is the number of paths whose largestcost edge is $(a,b,c)$. How do we compute $F(a,b,c)$? Let's consider some path $x \leftrightarrow y$. This path has $(a,b,c)$ as its largestcost edge if and only if the following two conditions are satisfied:
(Note that it might be the other way around, i.e. $x$ is connected to $b$ and $y$ is connected to $a$, but then again, paths are symmetric, so $x \leftrightarrow y$ should be considered the same as $y \leftrightarrow x$.) So we find that $F(a,b,c)$ is simply $R(a,c)\cdot R(b,c)$, where $R(x,c)$ is the number of nodes reachable from $x$ with edges of cost $< c$. But how do we compute $R(x,c)$? Amazingly, Kruskal's algorithm can help us here. Remember that Kruskal's algorithm considers the edges in increasing order and unites the nodes into components in that order. This means that, during the step where we process the edge $(a,b,c)$, $R(a,c)$ and $R(b,c)$ are simply the sizes of the components currently containing $a$ and $b$! This gives us the following solution: Perform Kruskal's algorithm, but for each connected component, also keep track of its size. Then, $S$ is the sum of $\mathrm{size}(a)\cdot \mathrm{size}(b)\cdot c$ for every step $(a,b,c)$ of the Kruskal algorithm! In pseudocode:
Time Complexity:$O(N \log N)$ AUTHOR'S AND TESTER'S SOLUTIONS:
This question is marked "community wiki".
asked 20 Nov '16, 21:43

Nice problem with really nice editorial.Learned a new way to approach problems. answered 21 Nov '16, 09:00

I came up with the solution during contest time but was struggling to find how many paths have that edge as the maximumcost edge. I was thinking of precomputing something with DFS so that all queries can be answered quickly but could not come up with so. My bad I could not think that Kruskal algorithm can help here. answered 21 Nov '16, 10:58

This is what I had did in contest time: link text and this is after contest : link text Differences: second last line Moral: Dont use IDE which shows 0.0000000 if u use printf and if u use cout for double , dont forget to write cout<<fixed before setprecision !! answered 21 Nov '16, 08:52

I'm unable to understand the editorial.
link
This answer is marked "community wiki".
answered 22 Nov '16, 18:44

I think setter's code is not meant for this problem.