PROBLEM LINK:
Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4
Author: raysh07
Tester: sushil2006
Editorialist: iceknight1093
DIFFICULTY:
TBD
PREREQUISITES:
Binary search
PROBLEM:
You’re given a string S containing the characters 0, 1, ?.
You want to replace every ? with either 0 or 1 to obtain a binary string.
For the binary string S, let f(S) denote the minimum number of swaps (not necessarily adjacent) needed to make S good, meaning all its 1’s form a contiguous block.
Across all ways of obtaining a binary string out of S, find the minimum and maximum possible values of f(S).
EXPLANATION:
An obvious subproblem to start off with is: given a binary string S, how do we compute f(S)?
That turns out to be relatively simple.
Suppose there are K ones in the string. Our goal is to make all K of them appear in a single contiguous block, i.e. among positions i, i+1, \ldots, i+K-1 for some index i.
Observe that if we also fix this index i,
- Every 1 present in [i, i+K-1] already doesn’t need to be moved.
- Every 1 outside this range needs exactly one swap to bring it into the range.
So, with i fixed, the number of moves needed equals K - C_i, where C_i denotes the number of ones already present among indices i, \ldots, i+K-1.
f(S) is then obtained by minimizing this quantity, which is equivalent to maximizing C_i.
Thus, quite simply, f(S) can be obtained by just finding out which window of size K has the highest number of ones present in it already.
(Equivalently, the minimum number of zeros present in a window of size K.)
Now, let’s move on to the problem at hand.
First, we deal with minimizing f(S).
It turns out that a somewhat simple strategy works here:
- Delete all occurrences of ? from S and concatenate the remaining parts.
For example, the string 01?0??1?101? becomes 0101101
Let this reduced string be denoted R. - Compute f(R) as we saw above.
- This is the answer!
Proof
Let S' denote a binary string obtained by replacing the ? in S with 0/1.
First, let’s show that there exists some S' such that f(S') \le f(R).
Consider f(R) itself.
There is some substring [x, y] in R such that it’s optimal to move all the 1’s to this substring using swaps.
This substring in R will map back to some subsequence of S.
Let i denote the leftmost index of the subsequence and j denote the rightmost index.Now we obtain a string S' from S as follows:
- For each ? present in the range [i, j], replace it with 1.
- For each ? present outside the range [i, j], replace it with 0.
It’s easy to verify that f(R) \ge f(S') for this string, because [i, j] in particular has the same cost as f(R) (recall that the only new occurrences of 1 are those within [i, j], so the number of ones equals exactly the length of [x, y] plus the number of ? in [i, j], which comes out to be the length of [i, j].)
We have an upper bound on the answer, next we need to show that it is also a lower bound, i.e. f(S') \ge f(R) for every possible completed string S'.
Let R' be a binary string obtained by inserting a single character (either 0 or 1) into R at any position.
We’ll show f(R') \ge f(R).Recall that the “cost” of a substring was the number of zeros in it, and f(R) equals the minimum cost of a substring whose length equals the number of ones in R.
Suppose a 0 is inserted to form R'.
Then, if K denotes the number of ones in R, and we look at a substring of length K in R',
- If the substring doesn’t contain the newly inserted 0, it must’ve also existed in R, and its cost remains the same.
- If the substring does contain the newly inserted 0, then it corresponds to taking a substring of length K in R, removing either the first or the last character, and then including an extra 0.
Clearly, because a 0 was inserted, this means the cost of the substring did not decrease - it either increased by 1 or remained the same.So, every substring’s cost either remained the same or increased.
Thus, f(R') \ge f(R) in this case.Similarly, if a 1 is inserted, it can again be shown that f(R') \ge f(R).
The logic is basically the same, just the windows to be considered are longer by 1 now.Thus, in any case, f(R') \ge f(R).
This tells us that inserting a new character into R cannot reduce its answer.
However, any string S' can be formed by inserting several characters into R.
Thus, we have f(S') \ge f(R) for all possible strings S', as claimed.
Next, consider maximizing f(S).
Unfortunately, there’s no simple strategy that works here like the previous case.
Instead, we will try all possibilities.
Specifically, suppose we fix the value of K, the number of ones present in the final string.
Let’s also fix M to be our desired answer.
Then, we would like to check if there exists a way to assign values in such a way that every window of length K has at least M zeros in it - if we are able to do this, then the answer is certainly \ge M, otherwise it must be \lt M.
This check can be done greedily by processing windows in order from left to right, as follows.
- First, look at the window [1, K]. There are two possibilities: it can either already have \ge M zeros, or it might not.
- If it already has \ge M zeros, we’re happy and nothing needs to be done.
- If it has \lt M zeros, then we need to use some of the ? in the window to make it reach M zeros.
Clearly, it’s optimal to turn the rightmost occurrences of ? into 0’s first, since that will help out with future windows too.
So, we repeatedly choose the rightmost remaining ? in the window and turn it into 0, till either we reach M zeros (and can stop), or run out of ? (in which case it’s impossible.)
- Once this is done, we look at the window [2, K+1], where the exact same logic applies: if we have \lt M zeros, turn some of the rightmost occurrences of ? in the window into 0.
- Apply the same logic to windows [3, K+2], [4, K+3], \ldots, [N-K+1, N].
To find the rightmost available ? to turn into a 0, we can keep a stack of all available ? positions and repeatedly pop its rightmost element.
Even if all the windows can be satisfied, there is one final condition to take care of: we must ensure that the final string contains no more than N-K zeros after replacements; because we started under the assumption that there are K ones.
(Note that it’s fine if we end up with less than N-K zeros, since we can freely change any remaining ? to 0 to reach N-K if needed. Of course, if there aren’t enough occurrences of ? to do this then the current K is invalid and must not be considered in the first place.)
With this, we have a check in \mathcal{O}(N) for fixed values of K and M.
To finish, note that after fixing K, we only care about the highest valid value of M.
Further, if M is valid under our check, M-1 will also be valid.
This allows us to use binary search to find the highest valid M, rather than needing to try all of them.
So, for a fixed K, we check \mathcal{O}(\log N) values of M, each in \mathcal{O}(N) time.
This leads to \mathcal{O}(N\log N) per K, and so \mathcal{O}(N^2 \log N) overall, which is fine.
TIME COMPLEXITY:
\mathcal{O}(N^2 \log N) per testcase.
CODE:
Editorialist's code (C++)
// #include <bits/allocator.h>
// #pragma GCC optimize("O3,unroll-loops")
// #pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
#include "bits/stdc++.h"
using namespace std;
using ll = long long int;
mt19937_64 RNG(chrono::high_resolution_clock::now().time_since_epoch().count());
int main()
{
ios::sync_with_stdio(false); cin.tie(0);
int t; cin >> t;
while (t--) {
int n; cin >> n;
string s; cin >> s;
int ones = 0, empty = 0;
for (int i = 0; i < n; ++i) {
ones += s[i] == '1';
empty += s[i] == '?';
}
int mn = n, mx = 0;
{
string reduced = "";
for (auto c : s) {
if (c != '?') reduced += c;
}
int zeros = 0;
for (int i = 0; i < ones; ++i) {
zeros += reduced[i] == '0';
}
mn = min(mn, zeros);
for (int i = ones; i < n - empty; ++i) {
zeros += reduced[i] == '0';
zeros -= reduced[i-ones] == '0';
mn = min(mn, zeros);
}
}
for (int k = ones; k <= ones+empty; ++k) {
auto check = [&] (int x) {
// can every window of length k have at least x ones?
if (k == 0) {
if (x == 0) return true;
return false;
}
stack<int> st;
st.push(-1);
string cur = s;
int ct = 0, used = n - ones - empty;
for (int i = 0; i < n; ++i) {
if (cur[i] == '?') st.push(i);
ct += cur[i] == '0';
if (i >= k) ct -= cur[i-k] == '0';
if (i < k-1) continue;
while (ct < x) {
int lst = st.top();
if (lst <= i-k) break;
++ct;
cur[lst] = '0';
st.pop();
++used;
}
if (ct < x) return false;
}
return used <= n-k;
};
int lo = 0, hi = k;
while (lo < hi) {
int mid = (lo + hi + 1)/2;
if (check(mid)) lo = mid;
else hi = mid-1;
}
mx = max(mx, lo);
}
cout << mn << ' ' << mx << '\n';
}
}