BINHAPP - EDITORIAL

PROBLEM LINK:

Contest
Practice

Setters: karnikanay, ashishgup, jeevanjyot
Testers: gamegame
Editorialist: aryanag_adm

DIFFICULTY:

2735

PREREQUISITES:

Kadane’s Algorithm, Dynamic Programming, Segment Tree

PROBLEM:

For a binary string S, \alpha(S) is defined as the maximum number of strings we can partition this string into, such that each string has strictly more 1s than 0s.

f(S) is defined as max_{s \in C} \ \alpha(S), where C is the set of substrings of S.

We are given N strings S_1, S_2, \cdots, S_N

We have to answer Q queries. Each query is of the form L_i \ R_i. Over all permutations \sigma of \{L_i, L_i + 1, \cdots, R_i\}, for query i, we have to output the maximum value of f(S_{\sigma_1} + S_{\sigma_2} + \cdots + S_{\sigma_{R_i - L_i + 1}})

EXPLANATION:

First, given a string S, how would we find \alpha(S)?
It is simply max(0, 1_S - 0_S) where 0_S is the number of 0s in S and 1_S is the number of 1s. The proof of this statement is easy, and is left as an exercise.

Now, given a string S, how would we find f(S)?
By the previous claim, this is equivalent to finding the maximum value of 1_s - 0_s over all substrings s of S. This can be done in linear time by converting each 0 to a -1, and then using Kadane’s algorithm to find the maximum sum of a subarray.

Now, how do we answer a query?
Note that we are taking the maximum value of \alpha over all substrings of all permutations of S_L, S_{L+1}, \cdots, S_R. There are two cases:

  • The maximum value of \alpha is from a substring belonging completely to S_i, for some L \leq i \leq R. To answer this, we can precalculate f(S_i) for all 1 \leq i \leq N, and then find the maximum f(S_i) for L \leq i \leq R using a standard RMQ structure like segment tree.
  • The maximum value of \alpha is from a substring composed of a suffix of some S_i, the entirety of some S, and the prefix of some S_j. Solving this case is slightly more tricky.

First, assume that the suffix and prefix are empty. Then the answer is simply \sum_{i=L}^R \alpha(S_i). This we can easily solve via a standard RSQ structure, like segment tree again.

Now, we instead of considering the entirety of every string, we can select a string and take only some suffix of it. How do we select the right string and the right suffix? If we take a suffix S' of string S, instead of taking the entirety of S, the answer changes by \alpha(S') - \alpha(S) . Therefore, we should add the suffix which maximises \alpha(S') - \alpha(S). For every string S_i, we can find the optimal suffix in linear time - let it be S'_i. We can store B_i = \alpha(S'_i) - \alpha(S_i) at every index, and use some RMQ structure such as segment tree. Now, we can compose answers consisting of complete strings, and the suffix of any 1 string, by taking \sum_{i=L}^{R}\alpha(S_i) and adding max_{L \leq i \leq R} \ B_i.

Finally, we need to consider the case where we can add not only a suffix, but also a prefix. We can use the same logic we used for suffix: we find the optimal prefix S'_i and let C_i = \alpha(S'_i) - \alpha(S_i). Once again by using segment trees, we could form an RMQ structure over C. Finally, we could compose answers consisting of complete strings, the suffix of any 1 string, and the prefix of any 1 string, by taking \sum_{i=L}^{R}\alpha(S_i) + max_{L \leq i \leq R} \ B_i and adding max_{L \leq i \leq R} \ C_i. However, we come across an issue here: the prefix and suffix we take cannot both be from the same string. One way to fix this is by keeping track of not only the largest, but also the second largest B and C in each range in the segment tree, and then if the the largest ones in the range are from the same string, we can use either the second largest B or the second largest C.

This solves the entire problem.

In my solution, I have used a single segment tree for all of these different range queries.

TIME COMPLEXITY:

Time complexity is O((N + Q) \cdot log(N)).

SOLUTION:

Editorialist's Solution
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
//#include <sys/resource.h>
//#define int long long
#define double long double
#define initrand mt19937 mt_rand(time(0));
#define rand mt_rand()
#define MOD 1000000007
#define INF 1000000000
#define mid(l, u) ((l+u)/2)
#define rchild(i) (i*2 + 2)
#define lchild(i) (i*2 + 1)
#define mp(a, b) make_pair(a, b)
#define lz lazup(l, u, i);
#define ordered_set tree<pair<int, int>, null_type,less<pair<int, int>>, rb_tree_tag,tree_order_statistics_node_update>
using namespace std;
using namespace __gnu_pbds;
const int N = 1e5;
struct node{
    int sum = 0, maxp = -1, smaxp = -1, maxs = -1, smaxs = -1, maxsum = 0;
};
string s[N];
int a[N], b[N], c[N], d[N];//sum, pref, suff, max
node seg[4*N];
node merge(node x, node y){
    node z;
    z.sum = x.sum + y.sum;
    z.maxsum = max(x.maxsum, y.maxsum);
    priority_queue<pair<int, int>> temp;
    temp.push({b[x.maxp], x.maxp});
    temp.push({b[y.maxp], y.maxp});
    if(x.smaxp >= 0) temp.push({b[x.smaxp], x.smaxp});
    if(y.smaxp >= 0) temp.push({b[y.smaxp], y.smaxp});
    pair<int, int> lol = temp.top();
    z.maxp = lol.second;
    temp.pop();
    lol = temp.top();
    z.smaxp = lol.second;

    while(!temp.empty()) temp.pop();
    temp.push({c[x.maxs], x.maxs});
    temp.push({c[y.maxs], y.maxs});
    if(x.smaxp >= 0) temp.push({c[x.smaxs], x.smaxs});
    if(y.smaxp >= 0) temp.push({c[y.smaxs], y.smaxs});
    lol = temp.top();
    z.maxs = lol.second;
    temp.pop();
    lol = temp.top();
    z.smaxs = lol.second;
    return z;
}
int getAns(node x){
    int ans = 0;
    ans = max(ans, x.maxsum);
    int temp = x.sum;
    ans = max(ans, temp);
    if(x.maxp != x.maxs) ans = max(ans, temp + c[x.maxs] + b[x.maxp]);
    else{
        if(x.smaxp != -1) ans = max(ans, temp + b[x.smaxp] + c[x.maxs]);
        if(x.smaxs != -1) ans = max(ans, temp + c[x.smaxs] + b[x.maxp]);
    }
    return ans;
}
void build(int l, int u, int i){
    if(l==u){
        seg[i].sum = a[l];
        seg[i].maxp = l;
        seg[i].smaxp = -1;
        seg[i].maxs = l;
        seg[i].smaxs = -1;
        seg[i].maxsum = d[l];
        return;
    }
    build(l, mid(l, u), lchild(i));
    build(mid(l, u)+1, u, rchild(i));
    seg[i] = merge(seg[lchild(i)], seg[rchild(i)]);
}
node query(int l, int u, int i, int ll, int uu){
    if(l>=ll && u<=uu) return seg[i];
    if(ll>mid(l, u)) return query(mid(l, u)+1, u, rchild(i), ll, uu);
    if(uu<=mid(l, u)) return query(l, mid(l, u), lchild(i), ll, uu);
    return merge(query(l, mid(l, u), lchild(i), ll, uu), query(mid(l, u)+1, u, rchild(i), ll, uu));
}
signed main(){
    int n, q;
    cin>>n>>q;
    vector<int> v[n];
    for(int i = 0;i<n;i++){
        int m;
        cin>>m;
        cin>>s[i];
        a[i] = 0;
        for(int j = 0;j<m;j++){
            if(s[i][j] == '1') v[i].push_back(1);
            else v[i].push_back(-1);
            a[i] += v[i][j];
        }
        a[i] = max(a[i], 0);
        d[i] = c[i] = b[i] = 0;
        int currsum = 0, cp = 0, cs = 0;
        for(int j: v[i]){
            currsum += j;
            d[i] = max(d[i], currsum);
            currsum = max(currsum, 0);
            cp += j;
            b[i] = max(b[i], cp);
        }
        b[i] -= a[i];
        for(int xx = m - 1;xx>=0;xx--){
            cs += v[i][xx];
            c[i] = max(c[i], cs);
        }
        c[i] -= a[i];
    }
    build(0, n-1, 0);
    while(q--){
        int l, r;
        cin>>l>>r;
        cout<<getAns(query(0, n-1, 0, l-1, r-1))<<endl;
    }

}
6 Likes

Just for completion. To prove that \alpha(S) = max(0,1_{S} - 0_{S}), for a string S. Imagine replacing 0 with -1 and 1 with +1. Assume 1_{S} - 0_{S}=K>0. Then you need to partition the string into maximum subarrays such that sum of every subarray is greater than 0. Since the sum of the partitions must be equal to K and every partition can have sum at least equal to 1. An upper bound for the answer is K.
To construct optimal solution, note that the prefix sum of S can only change by +1 or -1. So, there must be a set of indices i_1 , i_2 , i_3i_{K}, such that p[i_1] = 1, p[i_2] = 2, p[i_3] = 3p[i_k] = k. So, optimal partition is: [1..i_1], [i_1+1...i_2], [i_2+1..i_3][i_{k-1}+1..i_k]. Since sum of subarray [i_k+1...n] must be 0. (n is the size of string). So, I can safely extend the partition [i_{k-1}+1..i_k] into [i_{k-1}+1..n] without changing the sum.

Here p denotes the prefix sum array of S.

4 Likes

Well I came up with all the observations but couldn’t implement RSQ. Can this be done using sparse table?

because alpha is min(0,alpha) so none of the values are negative, additionally we’re adjusting the prefix and suffix arrays to only keep the counts that would increase if we used them (pre or suf) instead of the complete string (i.e. pre[i] = max(pre[i] - tot[i],0);) and same for suff[i]

Yeah i just got it that’s why deleted :slight_smile:
Anyways thankss !!

What a brilliant question!
Kudos to the problem setters!.
Just loved how they broke such a complex question into sub-problems and solved it.
I tried thinking of getting an optimal combination of suffix + whole + prefix but was not able to get a clear approach.
Learned a lot from this one.

1 Like