PROBLEM LINK:
Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4
Author: pols_agyi_pols
Tester: kingmessi
Editorialist: iceknight1093
DIFFICULTY:
Medium
PREREQUISITES:
DFS, dynamic programming
PROBLEM:
You are given a tree with N vertices.
Vertex i has the value V_i.
Construct an array B as follows:
- Let B = [V_1] and S = \{1\} initially.
- Repeat this N-1 times:
  - Choose a vertex x such that x \not\in S but there is some vertex y\in S such that the edge (x, y) exists.
  - Insert x into S, and append V_x to B.
Find the lexicographically smallest array B that can be obtained this way.
EXPLANATION:
Let’s first understand the given process.
We start at vertex 1, and then must repeatedly choose a vertex adjacent to a previously chosen one, until all N vertices have been chosen.
This is essentially a traversal of the tree - with DFS and BFS being a couple of special cases of the algorithm.
Our aim is now to lexicographically minimize the sequence of values obtained during the traversal.
With this in mind, a somewhat natural greedy algorithm should come to mind, as follows:
- Start at vertex 1.
- The available choices for the second vertex are now all children of 1.
Since we’re aiming to lexicographically minimize the sequence, it’s clearly optimal to choose whichever child has the least value.
Let this chosen vertex be v_1.
- Once the choice is made, the available choices for the third vertex are all children of 1 other than v_1, as well as all the children of v_1.
Again, the natural greedy choice is to choose whichever among the available vertices has the least value; then remove the chosen vertex from the available set and instead insert all its children.
- This process can be continued until all vertices are visited (see the sketch below).
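For concreteness, here is a minimal sketch of this greedy in Python (the same language as the code at the end of this editorial). The function name greedy_traversal, the 0-indexed input format, and the tie-break by vertex index are my own illustrative choices; the tie-break in particular is arbitrary, which is exactly the gap discussed next.

import heapq

def greedy_traversal(adj, V):
    # Greedy frontier traversal rooted at vertex 0 (0-indexed).
    # At each step, pop the smallest-valued frontier vertex; ties are broken
    # by vertex index, which is arbitrary and not always optimal.
    n = len(V)
    seen = [False] * n
    seen[0] = True
    frontier = []                      # min-heap of (value, vertex)
    for v in adj[0]:
        heapq.heappush(frontier, (V[v], v))
    order = [V[0]]
    while frontier:
        val, u = heapq.heappop(frontier)
        # In a tree, each vertex enters the frontier at most once.
        seen[u] = True
        order.append(val)
        for w in adj[u]:
            if not seen[w]:
                heapq.heappush(frontier, (V[w], w))
    return order

When all vertex values are distinct, the choice at every step is forced, so this sketch already handles that special case; the interesting part is breaking ties.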
There’s exactly one issue with this greedy algorithm: how do we break ties?
That is, suppose at some point there are two vertices in the active set that both have the same value. We need to be able to decide which one of them to choose.
Since the values themselves don’t let us distinguish the two options, the decision has to come from deeper within the tree. Our task now is to figure out exactly how to make this choice.
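For instance (an illustrative example of my own): suppose vertex 1 has value 5 and two children, both with value 2; one of those children has a child with value 1, the other a child with value 9. Breaking the tie towards the child whose subtree contains the 1 leads to the sequence [5, 2, 1, 2, 9], while the other choice can do no better than [5, 2, 2, 1, 9], so the correct decision really does depend on values further down the tree.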
To make things a bit cleaner, we’ll change our perspective a bit.
Rather than build the answer top-down from the root, we’ll instead try to build it bottom-up from the leaves.
Specifically, let’s define \text{ans}[u] to be the optimal sequence of values when considering only the subtree of vertex u.
The answer we want is \text{ans}[1].
Suppose the children of u are vertices c_1, c_2, \ldots, c_k.
Observe that for each i, \text{ans}[c_i] will appear as a subsequence of \text{ans}[u].
This is simply because the subsequences corresponding to different children just don’t interact with each other at all in terms of connectivity - so for each child subtree the best we can do is take its lexicographically smallest sequence anyway.
This means \text{ans}[u] can be obtained by merging \text{ans}[c_1], \text{ans}[c_2], \ldots, \text{ans}[c_k] together somehow (without changing the internal order of any of them), and then inserting V_u, the value of vertex u, right at the start.
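As a small worked example (an illustrative instance of my own, not from the statement): take a tree rooted at 1 with children 2 and 3, where 2 additionally has a child 4, and values V = (1, 3, 2, 1). Then \text{ans}[4] = [1], \text{ans}[3] = [2], \text{ans}[2] = [3, 1], and \text{ans}[1] is [1] followed by the merge of [3, 1] and [2], i.e. [1, 2, 3, 1], which matches the traversal 1 \to 3 \to 2 \to 4.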
Now we need to figure out how to merge the answers for children subtrees.
Let’s take the simplest non-trivial case: there are exactly two children, so we have \text{ans}[c_1] and \text{ans}[c_2], which we want to merge together to obtain the lexicographically smallest array we can.
So, we now have the following subproblem: we have two arrays A and B, which we’d like to merge into a single large array while lexicographically minimizing the result.
It’s not too hard to do this in quadratic time, i.e. \mathcal{O}(|A|\cdot |B|).
Solution
Let’s look at A_1 and B_1, the first two elements available to us.
- If A_1 \lt B_1, it’s clearly best to choose A_1.
- If A_1 \gt B_1, it’s clearly best to choose B_1.
- If A_1 = B_1, we need to look further to decide.
In the last case, let’s look at A_2 and B_2 next to decide, since those are the next available elements.
Again, similar casework can be done:
- If A_2 \lt B_2, it’s best to choose A_1 (so that we reach A_2 faster).
- If A_2 \gt B_2, it’s best to choose B_1 (so that we reach B_2 faster).
- If A_2 = B_2, again we need to look further to decide.
In general, let i be the first index such that A_i \ne B_i.
Then, if A_i \lt B_i it’s optimal for the first element to be A_1; otherwise, it’s optimal for the first element to be B_1.
There’s one edge case, however: when A is a prefix of B (or vice versa), there’s no index where they differ at all.
To deal with this case: if A = B it clearly doesn’t matter whether we start with A_1 or B_1, so suppose A is a proper prefix of B.
It’s then clearly optimal to pick B_1: taking the common prefix from B leaves us with more options afterward (all of A plus whatever remains of B), whereas taking all of A first leaves us with no choices at all for the remainder.
Now, observe that the above already gives us a complexity of \mathcal{O}((|A| + |B|)\cdot \min(|A|, |B|)): the merged array has |A| + |B| elements, and for each of them we perform a prefix comparison that takes \mathcal{O}(\min(|A|, |B|)) time at worst.
This is in fact just \mathcal{O}(|A|\cdot |B|), because if |A|\le |B| then (|A| + |B|)\cdot |A| = |A|^2 + |A|\cdot |B| \le 2\cdot |A|\cdot |B|.
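As a quick worked example (my own numbers): let A = [2, 1] and B = [2, 3]. The first elements tie, and the lookahead finds A_2 = 1 \lt B_2 = 3, so we take A_1 first; continuing the same process yields [2, 1, 2, 3]. Had we started with B_1 instead, the best possible continuation would have been [2, 2, 1, 3], which is lexicographically larger.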
Once we know how to solve this for two arrays, the solution can be extended to many arrays by just merging them two at a time.
If we analyze the time complexity of this, we see that merging k arrays with sizes s_1, s_2, \ldots, s_k takes \mathcal{O}\left(\sum_{i \lt j} s_i\cdot s_j\right) time, which is at most \mathcal{O}\left((s_1 + s_2 + \ldots + s_k)^2\right).
The above complexity appears quite slow - it’s seemingly quadratic in (s_1 + \ldots + s_k), and that’s just to compute the answer for a single vertex.
(For our purposes, the s_i are just the subtree sizes of the children of u).
Applying that to every vertex gives an extra linear factor, which means the overall complexity is \mathcal{O}(N^3)… or is it?
It turns out that directly implementing this gives us a complexity of only \mathcal{O}(N^2), even when considered across all vertices!
A brief idea of the proof is that if you consider any pair of vertices (u, v), they’ll contribute to the overall summation of costs exactly once: when processing \text{lca}(u, v).
For a more detailed look at why (including images!), have a look at point 7 of this post.
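To spell out the counting (using the cost model above, where merging arrays of sizes p and q costs \mathcal{O}(p\cdot q)): for a fixed vertex u with children subtree sizes s_1, \ldots, s_k, the quantity \sum_{i \lt j} s_i\cdot s_j counts exactly the pairs of vertices (x, y) lying in different child subtrees of u, i.e. the pairs with \text{lca}(x, y) = u and u \notin \{x, y\}. Since every pair of vertices has exactly one LCA, the sum across all vertices u is at most \binom{N}{2} = \mathcal{O}(N^2).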
TIME COMPLEXITY:
\mathcal{O}(N^2) per testcase.
CODE:
Editorialist's code (PyPy3)
def merge(a, b):
    # Merge two sequences into the lexicographically smallest interleaving,
    # in O(len(a) * len(b)) time.
    n, m = len(a), len(b)
    ans = []
    i, j = 0, 0
    while i < n and j < m:
        # Look ahead from positions i and j for the first differing element.
        k, done = 0, 0
        while i+k < n and j+k < m:
            if a[i+k] < b[j+k]:
                ans.append(a[i])
                i += 1
                done = 1
                break
            if a[i+k] > b[j+k]:
                ans.append(b[j])
                j += 1
                done = 1
                break
            k += 1
        if done == 0:
            # One remaining part is a prefix of the other: take from the longer one.
            if i+k == n:
                ans.append(b[j])
                j += 1
            else:
                ans.append(a[i])
                i += 1
    ans.extend(a[i:])
    ans.extend(b[j:])
    return ans

import sys
sys.setrecursionlimit(10**5)

for _ in range(int(input())):
    n = int(input())
    a = list(map(int, input().split()))
    adj = [ [] for _ in range(n) ]
    for i in range(n-1):
        u, v = map(int, input().split())
        adj[u-1].append(v-1)
        adj[v-1].append(u-1)

    ans = [ [] for _ in range(n) ]
    def dfs(u, par):
        # ans[u] = lexicographically smallest sequence for the subtree of u.
        for v in adj[u]:
            if v == par: continue
            dfs(v, u)
            ans[u] = merge(ans[u], ans[v])
        ans[u] = [a[u]] + ans[u]
    dfs(0, 0)
    print(*ans[0])