Setter: Bogdan Ciobanu
Tester: Misha Chorniy
Editorialist: Taranpreet Singh




Probability, Expectation, Dynamic Programming and Combinatorics.


Given a tree with N nodes numbered through 1 to N, each having a color, initially colored white. Find the expected number of steps the following process will last.

In each step, choose an unordered pair (a, b) of vertices at random, if there’s any node colored black on the path between a and b, terminate the process immediately. Otherwise, color all vertices on the path between a and b black and move to the next step.


  • Reduce Expectation to by writing E(X) = P(X \geq 1) + P(X \geq 2) + \dots P(X \geq N), where P(X \geq T) means Probability of performing at least T steps. We can see, that If NumPath is same as Number of ways to choose at least Y paths, then P(X \geq Y) is numPath*K!/((N*(N+1))/2)^K
  • We can maintain a DP table for every node, counting the number of ways to make X paths in the subtree of the node, with another state, telling whether the current path can be connected to paths coming from the parent of current node.
  • To calculate this DP from child to its parent, we need to used another DP table, dp2[X][Y][ST], which counts number of ways to make Y paths using first X nodes such that ST = 0 means current node is not used at all, ST = 1 means the Current node is included in a path which has one end at current node.


So, Time for another combinatorics problem.

First of all, If X is a random variable, Expection is calculated as 1*P(X==1)+2*P(X==2)+3*P(X==3) \dots N*P(X==N). But, we can also rewrite it as

E(X) = P(X \geq 1)+P(X \geq 2) +P(X \geq 3) \dots +P(X \geq N).

Now, see what P(X \geq K) means. It means The number of ways to select at least K non-intersecting paths, in a given order. Now, since Order of selecting path is irrelevant as all paths do not intersect, We can write it as K!*NumWays where NumWays is the number of ways to select at least K paths. Total number of possible paths selected (including intersecting ones) is ((N*(N+1)/2)^K, Hence, P(X \geq K) = NumWays*K!/(N*(N+1)/2)^K.

The problems get reduced to calculate NumWays, and for that, we resort to Dynamic Programming.

First thing, We can have at most sub[u] non-intersecting paths in the subtree of u, where sub[u] is the number of nodes in the subtree of u. The reason is, that suppose, each path cover exactly one node. Even then, the number of Non-intersecting paths is sub[u] and it can be seen, that we can only decrease Number of non-intersecting paths.

Now, we will calculate for each node u, Number of Ways to select X nodes for 1 \leq X \leq SZ[u]. To calculate this, we also need information whether we can connect the current node with its parent in a path.

There are two types of paths in the subtree of node u which include u.

  • Paths coming from the parent cannot be connected to current node/subtree of the current node.
  • Paths coming from the parent can be connected to current node/subtree of current node. Call it Special Path

See, that special type can also be extended toward the parent of u, thus, needs to be handled separately. Also note that the path of the second type may choose to end at the current node, hence, will always be included automatically in paths of type 1.

Now, suppose we have it calculated for every child v of u, the number of ways to select at least k paths in the subtree of v for 1 \leq k \leq sub[v]. We need to compute the same for the current node.

For computing this, we can use another DP for each node, call it dp2[childUsed][NumPaths][0/1/2] which calculate the number of ways we can select NumPaths in children of first childNodes children. The last state represents the number of special paths. We can assume Path of type 1 to be 2 special paths, to simplify our implementation.

We can see, that in no case can a node have move than 2 special paths, since Any path is the shortest distance between two nodes, and such path cannot visit more than 2 subtrees of any node.

Basically, we can fill the whole table using the recurrence.

dp2[child+1][mxPaths+curPaths][st1+st2] += dp2[childUsed][mxPaths][st1]*dp[child][curPaths][st2] if st1+st2 \leq 2

Let us try all combinations of (st1, st2) for understanding.

  • (0,0) This combination counts the number of ways of selecting paths, that child has no special path, and earlier child also do not have any special paths.
  • (0,1) This combination counts the number of ways of selecting paths, that child has a special path, while the earlier child had no special paths
  • (1,0) This combination counts the number of ways of selecting paths, that child has no special path, and earlier child have one special path.
  • (1,1) This combination counts the number of ways of selecting paths, that child has a special path, and earlier child have one special path.
  • (2,0) This combination counts the number of ways of selecting paths, that child has no special path, and earlier child have two special paths.

Combination (2,1) cannot be considered because we cannot have more than 2 special paths coming to a node. Also, For any child, only 1 special path can be extended to its parent. Because if there are two special paths which are extended to parent, it cannot be the part of any shortest path between two vertices.

The reason is, that if two special paths are there, The only shortest path, the current node can be a part of, it the one starting somewhere in the subtree of one child of the current node and ending in a different child’s subtree.

Hence, Only above five combinations need to be considered.

Now, we can see, that by the Fundamental principle of Counting, these are dependent events, and hence, are multiplied.

This way, we will obtain, after computing internal DP, the number of ways to select K paths out of all children of the current node such that the number of special paths up to children level is T (T \in [0,2]).

Now, we need to consider cases as to update the original DP using the values of this internal DP.

The thing is, that we cannot have two special paths extend toward its parent. So, we need to carefully count Number of ways to select paths for each count of special paths which can move forward.

We can see, that we can choose not to extend any path, extend one path to parent as well as make a path at the current node. You need to handle these cases to get Required Number of ways for the current node.

For implementation, please refer the tester’s solution below.

Related Problem

A very recent problem Standard Tree Task from August Cook-Off which uses the precisely same type of Dynamic Programming, which you may refer here.

Time Complexity

Time complexity is O(N^2) per test. Seemingly, Each DFS takes O(N^2) time, but due to Number of nodes being bound to N, Overall Time Complexity is amortized to O(N^2).


Setter’s solution
Tester’s solution
Editorialist’s solution

Feel free to Share your approach, If it differs. Suggestions are always welcomed. :slight_smile:


Could you please add more details on proving time complexity of the solution?

I agree that it is N^2, and I even see how to prove it relatively easily along the lines of “For every pair of vertices, you’ll add 1 to the number of operations performed in their LCA, and there are N^2 pairs of vertices in a tree”, but to me it seems a bit too hard to get there from your one-line explanation.

1 Like

in SUPER QUICK EXPLANATION your E[X] is false.


1 Like

Hello, sorry for the delay, was inactive on discuss.

My proof (not concrete) was actually in terms of outdegree of nodes being bounded by Number of nodes and amortization arising due to that. Didn’t know about proof based on LCA of nodes.

Can you elaborate your proof a bit more? It’ll be helpful.

You can think about any operation done when updating the knapsack in particular node as “Take one vertex from the left, take one vertex from the right”, which can be turned other way around to say that for each pair of vertices you have one additional operation to be performed in the LCA.

Thanksa lot.

I found a similar explanation in book looking for a challenge too.