P4235 - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: pols_agyi_pols
Tester: kingmessi
Editorialist: iceknight1093

DIFFICULTY:

Simple

PREREQUISITES:

Sorting

PROBLEM:

You have a string S.
You must perform the following operation once:

  • Partition S into two subsequences (each of which is allowed to be empty.)
  • Then sort each subsequence in-place in ascending order.

Find the lexicographical maximum final string S.

EXPLANATION:

First, observe that while our objective is to find the lexicographically maximum final string, the operation of sorting a subsequence in-place goes against that (as in, it helps toward lexicographic minimization instead.)

In particular, we can prove the following fact:
Let X be some string obtained by performing the given operation.
Then, for any index i (1 \le i \le N) we must have:

X_i \le \max(S_1, S_2, \ldots, S_i)

That is, no matter how we perform our operation, the i-th character of the resulting string cannot exceed the largest character before it.

This is not too hard to see from the operation itself.

Proof

Suppose the split is (A, B), the final string is X, and without loss of generality let i \in A.
Also let M = \max(S_1, \ldots, S_i) be the largest character in this prefix.

We sort the characters at the indices of A in non-decreasing order.
Let the indices of A that don’t exceed i be x_1, x_2, \ldots, x_k.
Then, we know that each S_{x_i} is surely \le M.
So, the subsequence given by A surely has at least k characters that don’t exceed M.
This means even after sorting, the first k characters of A won’t exceed M either.

The same logic applies to the indices of B in [1, i] as well.

But, since both sets of indices don’t exceed M in [1, i], the eventual value of X_i also cannot exceed M, thus proving our claim.


Let’s use this observation to formulate a solution.

We know that X_i \le \max(S_1, \ldots, S_i).
In particular, if S_i is itself equal to \max(S_1, \ldots, S_i) (that is, S_i is the largest character on the prefix of length i), then it’s clearly ideal to have X_i = S_i; because if we didn’t then we’re forced to have a smaller character instead (which is bad for our goal.)

Let’s call an index i good if S_i = \max(S_1, \ldots, S_i) and bad otherwise.
Suppose the good indices are i_1, i_2, \ldots, i_k.

Observe that we definitely have S_{i_1} \le S_{i_2} \le\ldots\le S_{i_k}.
So, if we take all these indices into the same set in the partition (without taking anything in between them), then after sorting none of them will change since they’re already sorted anyway - which is exactly what we want in the first place.

So, we have a natural greedy solution:

  • Choose one set, say A, to be all the good indices in [1, N].
  • Choose B to be all the other indices.
  • Perform the operation with this choice.

This is indeed optimal.

Proof

Let (A, B) be the partition given by our greedy algorithm.
Let (A', B') be any other partition.

Suppose A \ne A', so they differ somewhere.
Let i be the smallest index at which they differ.
There are two possibilities.

First, it’s possible that i \not\in A but i \in A'.
In this case, S_i is not a prefix maximum; and so is smaller than something before it.
Since this was the first difference, the result when we sort is that S_i will move to the left and overwrite something at an earlier index; which can only reduce its value or keep it the same.
So, there’s no benefit to this case.

Next, it’s possible that i \in A but i \not \in A'.
This is also not beneficial; since the greedy algorithm already guarantees that i \in A \implies X_i = \max(S_1, \ldots, S_i) which is the best we can do.
So, removing i from the set cannot improve the prefix ending here anyway.

Thus, in either case we see no improvement in (A', B').
Thus, (A, B) is optimal.

Knowing that the greedy algorithm is optimal, the final solution is quite simple: just find all good indices in S (i.e. the prefix maximums), and then directly perform the operation using them and print the result, which can be done in \mathcal{O}(N\log N) time.

TIME COMPLEXITY:

\mathcal{O}(N\log N) per testcase.

CODE:

Editorialist's code (PyPy3)
for _ in range(int(input())):
    n = int(input())
    s = input()
    
    inds, chars = [], []
    prv = s[0]
    ans = ['']*n
    ans[0] = s[0]
    for i in range(1, n):
        if s[i] >= prv:
            prv = s[i]
            ans[i] = s[i]
        else:
            inds.append(i)
            chars.append(s[i])
    
    chars.sort()
    for i in range(len(inds)):
        ans[inds[i]] = chars[i]
    print(''.join(ans))
1 Like