EMPTYSTR - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: raysh07
Tester: sushil2006
Editorialist: iceknight1093

DIFFICULTY:

TBD

PREREQUISITES:

None

PROBLEM:

You have a string S containing only the characters A and B.
In one move, you can choose an alternating subsequence and delete it.
Find the minimum number of moves needed to make S empty.

EXPLANATION:

Observe that since an alternating subsequence is chosen, each move deletes either an equal number of A’s and B’s, or exactly one more A than B (or vice versa).
Let c_A denote the count of A’s in S, and c_B denote the number of B’s.

Each move changes c_A - c_B by either +1, 0, or -1.
When the string is empty, we’ll have c_A - c_B = 0.
So, we definitely need at least |c_A - c_B| moves to remove every character.

However, this isn’t a strict enough lower bound.
For example, a string like AAAAABBBBB has c_A = c_B = 5 (and so c_A - c_B = 0), and yet it’s not hard to see that we need 5 moves to empty it, since each move can only delete one A and one B.


We observed that the alternating subsequence changed c_A - c_B by either 1, 0, or -1.
This doesn’t just apply to the entire string: it in fact applies to every substring of S.
That is, for every substring, the difference between its count of A’s and B’s changes by at most 1 after each move.

So, let M be the maximum value of |c_A - c_B| across all substrings of S.
We definitely need at least M moves to even remove all the elements of this substring, giving us a better lower bound that the first one we found.

This lower bound is indeed tight - so the answer is exactly M.

Proof

There’s a simple greedy strategy that should come to mind: repeatedly take the longest alternating subsequence possible, over and over again, till S is empty.

As it turns out, this greedy strategy is optimal, and we’ll use it to prove why it uses no more than M steps.

First, we look at what exactly the greedy strategy does.

Let’s break S into maximal contiguous blocks of A’s and B’s. For example, S = \texttt{AABBBABAAA} breaks into \texttt{AA, BBB, A, B, AAA}.
The greedy strategy then simply chooses one element from each such block and deletes it.

Now, let’s look at the substring of S with maximum c_A - c_B value (we presume this substring is non-empty).
Note that such a substring:

  1. Must start with A (if it starts with B, we can drop the starting B and obtain a larger difference).
  2. Must end with A for the same reason.
  3. If the substring is [L, R], then positions L-1 and R+1 won’t contain A (if they did, we could expand the substring to include the A and increase the sum again).

In particular, these three points tell us that the substring is composed of exactly several of the contiguous blocks we initially had; and also the first and last blocks are A’s.
This means the number of A blocks will be exactly one more than the number of B blocks within this substring.

Since we’re picking one character from each block, we’ll pick one more A than B from this substring - meaning its c_A - c_B value will reduce by 1.

So, we’ve ensured that the maximum c_A - c_B value of a substring will reduce by 1 after the move (if it was positive initially).
A similar proof shows that the minimum c_A - c_B value of a substring will increase by 1 if it was negative.

So, both the maximum and minimum will reach 0 within M moves - proving our claim of optimality.


Now, we need to find M quickly enough.

To do that, consider an array a where a_i = 1 if S_i = \text{A} and a_i = -1 otherwise.
The value of c_A - c_B for a substring of S is now simply the sum of the corresponding subarray in a.

We want to find maximum value of |c_A - c_B| across all substrings of S - meaning we want to find the maximum absolute subarray sum across all subarrays of a.
This will be either the maximum subarray sum, or the negative of the minimum subarray sum.

Finding the maximum subarray sum in linear time is a well-known problem - use either prefix sums or Kadane’s algorithm.
The minimum subarray sum is similar, either repurpose the above algorithms to compute minimum or just multiply every element of a by -1 and find the maximum again which might require less code.

TIME COMPLEXITY:

\mathcal{O}(N) per testcase.

CODE:

Author's code (C++)
#include <bits/stdc++.h>
using namespace std;
#define int long long
#define INF (int)1e18

mt19937_64 RNG(chrono::steady_clock::now().time_since_epoch().count());

void Solve() 
{
    int n; cin >> n;
    string s; cin >> s;
    
    int mn = 0, mx = 0, curr = 0;
    for (auto x : s){
        if (x == 'A') curr--;
        else curr++;
        
        mx = max(mx, curr);
        mn = min(mn, curr);
    }

    cout << (mx - mn) << "\n";
}

int32_t main() 
{
    auto begin = std::chrono::high_resolution_clock::now();
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    int t = 1;
    // freopen("in",  "r", stdin);
    // freopen("out", "w", stdout);
    
    cin >> t;
    for(int i = 1; i <= t; i++) 
    {
        //cout << "Case #" << i << ": ";
        Solve();
    }
    auto end = std::chrono::high_resolution_clock::now();
    auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
    cerr << "Time measured: " << elapsed.count() * 1e-9 << " seconds.\n"; 
    return 0;
}
Tester's code (C++)
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

using namespace std;
using namespace __gnu_pbds;

template<typename T> using Tree = tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update>;
typedef long long int ll;
typedef long double ld;
typedef pair<int,int> pii;
typedef pair<ll,ll> pll;

#define fastio ios_base::sync_with_stdio(false); cin.tie(NULL)
#define pb push_back
#define endl '\n'
#define sz(a) (int)a.size()
#define setbits(x) __builtin_popcountll(x)
#define ff first
#define ss second
#define conts continue
#define ceil2(x,y) ((x+y-1)/(y))
#define all(a) a.begin(), a.end()
#define rall(a) a.rbegin(), a.rend()
#define yes cout << "YES" << endl
#define no cout << "NO" << endl

#define rep(i,n) for(int i = 0; i < n; ++i)
#define rep1(i,n) for(int i = 1; i <= n; ++i)
#define rev(i,s,e) for(int i = s; i >= e; --i)
#define trav(i,a) for(auto &i : a)

template<typename T>
void amin(T &a, T b) {
    a = min(a,b);
}

template<typename T>
void amax(T &a, T b) {
    a = max(a,b);
}

#ifdef LOCAL
#include "debug.h"
#else
#define debug(...) 42
#endif

/*



*/

const int MOD = 1e9 + 7;
const int N = 1e5 + 5;
const int inf1 = int(1e9) + 5;
const ll inf2 = ll(1e18) + 5;

void solve(int test_case){
    ll n; cin >> n;
    string s; cin >> s;
    ll x = 0, y = 0;

    trav(ch,s){
        if(ch == 'A'){
            if(y){
                x++;
                y--;
            }
            else{
                x++;
            }
        }
        else{
            if(x){
                y++;
                x--;
            }
            else{
                y++;
            }
        }
    }

    ll ans = x+y;
    cout << ans << endl;
}

int main()
{
    fastio;

    int t = 1;
    cin >> t;

    rep1(i, t) {
        solve(i);
    }

    return 0;
}
Editorialist's code (PyPy3)
for _ in range(int(input())):
    n = int(input())
    s = input()
    # ans = max absolute subarray sum
    
    ans = 0
    mnp = mxp = pre = 0
    for i in range(n):
        if s[i] == 'A': pre += 1
        else: pre -= 1
        
        ans = max(ans, pre - mnp)
        ans = max(ans, mxp - pre)
        mnp = min(mnp, pre)
        mxp = max(mxp, pre)
    print(ans)
1 Like

This topic was automatically closed after 2 hours. New replies are no longer allowed.

This topic was automatically opened after 2 minutes.

Given the constraints, O(nlogn) will also pass, and deriving from the greedy strategy we can implement it (although the intended solution is much simpler and faster to implement, but reaching the nlogn solution is easier) :

set<ll> a,b;
    f(i,n) 
    {
        if(s[i]=='A') a.insert(i);
        else b.insert(i);
    }

    ll ops=0;
    while((!a.empty()) && (!b.empty()))
    {
        if((*a.begin())<(*b.begin()))
        {
            int val=(*a.begin());
            bool fl=true;
            while(1)
            {
                if(fl==true)
                {
                    auto it = a.lower_bound(val);
                    if(it==a.end()) break;
                    else
                    {
                        val=*it;
                        a.erase(val);
                    }
                    fl=false;
                }
                else
                {
                    auto it = b.lower_bound(val);
                    if(it==b.end()) break;
                    else
                    {
                        val=*it;
                        b.erase(val);
                    }
                    fl=true;
                }
            }
            ops++;
        }
        else swap(a,b);
    }
    ops+=a.size();
    ops+=b.size();
2 Likes

Plain simulation also worked if we only maintain consecutive equal counts
https://www.codechef.com/viewsolution/1110610706
but yes this will be useless for followup.

Even after reading this editorial, its not becoming obvious. Is this line of thought too complex only for me?

I solved it as follows:

Breaking the string into contiguous blocks of same character. Then, I thought that we will choose 1 character from each block until the block with minimum number becomes 0. Once that happens, the block to the left and right of it merge into 1.

There are two ways to implement it. Using a segment tree with following operations:

  1. Find mimimum of entire range
  2. Subtract the minimum from entire range
  3. Merge the left and right of minimum into single block.

I don’t know how to implement this one but seems doable with some effort and lazy propagation in segment tree.

I implemented it using sets and inserted the initial array as {start index, contiguous block length} to maintain order of blocks. Then found the minimum using another set for this and deleted the minimum and found the prev and next of this minimum and merged them into one.

i was trying your solution since my logic was also same but my code was giving TLE so what i did was i copied your code and changed the a.lower_bound and b.lower_bound with normal lower_bound that we use with vector and your code also started to give TLE if you know the reason can you please tell me since i couldn’t find any solution to it