BREAKSTR - Editorial

PROBLEM LINK:

Practice

DIFFICULTY:

EASY-MEDIUM

PREREQUISITES:

Greedy Prefix Sums

PROBLEM:

Given a binary string S of length N, divide it into contiguous substrings, such that:

  • the absolute difference between the number of 0's and 1's, in each substring, is \le K,
  • the number of substrings is minimised.

EXPLANATION:

Let an optimal partitioning of S (from left to right) be \{s_1,s_2,\dots,s_m\}.
Let the cost of a string be the number of 1's minus the number of 0's (not absolute difference).

Observation: The cost of s_1,s_2,\dots,s_m is all positive or all negative.

Proof

We assume the contrary.
Then there exists some 1\le i < m, such that the parity of the cost’s of s_i and s_{i+1} is different. WLOG, let the costs of the strings be +a and -b respectively. From the definition, we know that 0\le a,b\le K.

Now, we can replace these two strings with a single string s_i+s_{i+1} and the partitioning will still be valid.
This is because the cost of the new string will be a+(-b) and by the above inequality, it is clear that -K\le a-b\le +K.

We have found a better solution (with one less substring), contradicting the optimality of \{s_1,\dots,s_m\}.
Contradiction and we are done.

WLOG, let the cost of all substrings be positive (the negative case can be handled similarly).

Observation: There exists an optimal partitioning such the cost of s_1,s_2,\dots,s_{m-1} is exactly +K.

Proof

It is obvious that cost of s_i + cost of s_{i+1} should be greater than +K (and \le 2*K), else we can merge the two strings to get a better solution.

We prove by recursion. Consider an optimal partitioning \{s_1,\dots,s_m\}.

Now, while there exists some 1\le i< m such that, the cost of s_i is not equal to K, we can remove the first element of s_{i+1} and append it to the end of s_i. This works because:

  1. The partitioning is still valid (s_i and s_{i+1} are still valid, disjoint substrings). Also, when the cost of s_i is exactly K, 0< cost of s_{i+1} \le K - a direct consequence of the first statement.
  2. Every operation changes the cost by exactly 1. Thus, it is possible for us to stop when the cost is exactly +K (this will happen, since moving all elements from s_{i+1} to s_i will exceed the cost constraint).

Since in each move, an element moves to the previous substring and the number of substrings doesn’t increase, the recursion is guaranteed to end. Thus, the entire algorithm can be done recursively to get a partitioning no worse than the original, that matches our above observation.


Thus, we have reduced the problem to partitioning S into substrings \{s_1,s_2,\dots,s_m\} such that, for each 1\le i < m, the cost of s_i is K (or -K for the negative case) and the value of m is minimised.

We solve this greedily. While S is non-empty, we -

  1. take the longest prefix of S whose cost is \le K (or \ge -K for the negative case),
  2. add the substring to our optimal partitioning, and
  3. erase the substring from S.

(The proof of the greedy approach is left to the reader as an exercise.)

With suitable pre-processing, it is possible to solve the problem in O(N).

Implementation

Let F = \max\left(1,\left\lceil \frac{|\text{cost of S}|}{K} \right\rceil \right). It is evident that F is the number of substrings in the optimal partitioning.

Let c[i]\space (1\le i < F) be the largest value x\space (0\le x < N) such that the cost of prefix string S[0..x] is exactly K*i (or -K*i for the negative case). This can be computed using prefix sums in the following fashion:

  • Iterate through the string (from left to right) - let the current index be x.
  • Maintain a variable sum, equal to the cost of the prefix of S ending at x.
  • If the value of sum equals K*i for some integer i, set the value of c[i] to x.

The first substring starts at S[0]. The j^{th}\space(2\le j \le F) substring starts at S[c[j-1]+1].

TIME COMPLEXITY:

(Refer Implementation spoiler for terminologies).

Computing array c takes O(N) time. Generating the substrings takes O(F) time.
The final time complexity is thus:

O(N+F) \approx O(N)

SOLUTIONS:

Editorialist’s solution can be found here.


Experimental: For evaluation purposes, please rate the editorial (1 being poor and 5 excellent)

  • 1
  • 2
  • 3
  • 4
  • 5

0 voters

2 Likes