SPLIT7 - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: raysh07
Tester: sushil2006
Editorialist: iceknight1093

DIFFICULTY:

Simple

PREREQUISITES:

Binary search

PROBLEM:

For a binary string S, define f(S) to be the length of the longest non-decreasing subsequence of S.
Given a binary string S and a parameter K, compute the maximum possible value of
\min(f(S[1, x_1]), f(S[x_1+1, x_2]), \ldots, f(S[x_{k-1}+1, N]))
across all ways of partitioning S into K substrings.

EXPLANATION:

Very often, when our aim is of the form “maximize the minimum of several values”, it’s instead helpful to ask the question: “for a fixed parameter x, can I make all of these values at least x?”

If this question can be answered quickly, the overall answer would then be the largest value of x for which it’s true - and this can be found using binary search, since the (boolean) answer to the above question is clearly monotonic in x.

So, in our case, we want to answer the following question: is there a way to split the string into K substrings, such that all of them have the lengths of their longest non-decreasing subsequence be at least x?


If such a split exists, it can surely be found greedily: that is, find the smallest index i_1 such that f(S[1, i_1]) \geq x, then find the smallest index i_2 \gt i_1 such that f(S[i_1+1, i_2]) \geq x, and so on till we’ve reached the end of the string.
Note that there might be some leftover suffix which has value \lt x, but this can just be merged with the last chosen substring anyway.

Why is greedy correct?

A simple exchange argument works here.

Suppose we have some split where each substring has LNDS length \geq x.
Let the first substring among them be [1, y], and the second be [y+1, z].
Now, since i_1 was defined to be the smallest index such that f(S[1, i_1]) \geq x, surely we must have y \geq i_1.

But then we could replace [1, y], [y+1, z] with [1, i_1], [i_1+1, z] and still have both segments have LNDS \geq x (note that [y+1, z] is contained inside [i_1+1, z], so the latter’s LNDS can’t be smaller.)

This is why greedily choosing i_1 still works to give an optimal solution.
After choosing i_1, the argument can be repeated for choosing [i_1+1, i_2], and so on.

You might note that the greedy algorithm won’t necessarily give us exactly K substrings: it will however give us the maximum number of substrings that a split can have (where every substring T in the split has f(T) \geq x).

Now, if we obtain \lt K substrings from the greedy algorithm, it’s definitely impossible to obtain a split of size K.
On the other hand, if we obtain \geq K substrings from the greedy algorithm, it’s always possible to obtain exactly K as well: a simple construction is to just merge all the substrings after the K-th one, into the K-th one.
Since appending more characters to a string can’t decrease its longest non-decreasing subsequence, we’ll end up with K substrings all having a value that’s \geq x.


The only thing remaining is: how do we actually compute this maximal split quickly?

For that, let’s look at how we would compute the actual longest non-decreasing subsequence of a binary string.
The algorithm for that is as follows:

  • Let \text{ans} = 0, \text{zeros} = 0.
  • For each i from 1 to N in order,
    • If S_i = 1, increment \text{ans} by 1.
    • If S_i = 0, increment \text{zeros} by 1, and then set \text{ans} = \max(\text{ans}, \text{zeros}).

The idea here is that, as we move from left to right, if we encounter a 1 we can just append it to the best existing non-decreasing subsequence, so the answer always increases by 1.
If we instead encounter a 0, the only new possibility for a non-decreasing subsequence is to take all the existing zeros, so we account for that too by keeping track of their count.

Since this algorithm works from left to right anyway, and always keeps track of the length, it’s not too hard to modify it for our purposes, i.e. to repeatedly find the smallest remaining prefix with LNDS size at least x.
We can do the following:

  • Iterate i from 1 to N, and update the values of \text{ans} and \text{zeros} using S_i, as noted above.
    Also store \text{count}, the number of substrings found so far.
  • If \text{ans} = x, add 1 to \text{count}, and then just reset \text{ans} and \text{zeros} both to 0, since we’re basically starting anew.
  • If \text{count} \geq K in the end we’re good.

For a fixed x the check runs in linear time, so throwing a binary search on top gives \mathcal{O}(N\log N).

TIME COMPLEXITY:

\mathcal{O}(N\log N) per testcase.

CODE:

Editorialist's code (PyPy3)
for _ in range(int(input())):
    n, k = map(int, input().split())
    s = input()
    
    lo, hi = 1, n
    while lo < hi:
        mid = (lo + hi + 1) // 2
        
        cur, ct, zer = 0, 0, 0
        for c in s:
            if c == '1': cur += 1
            else:
                zer += 1
                cur = max(cur, zer)
            
            if cur == mid:
                ct += 1
                cur = zer = 0
        
        if ct < k: hi = mid - 1
        else: lo = mid
    print(lo)
2 Likes

i just want to know can we find longest non-decreasing subsequence of given binary string using given algorithm if problem only ask for finding that only…