PROPER -Editorial

costheta_z · January 19, 2021, 12:58pm

PROBLEM SUMMARY:

Contest: Chef Dynasty
Link: Prolonged Perishing
Author and Editorialist: Nandini Kapoor
Tester: Prakhar Kochar (Python), Aman Dwivedi (C++)

Difficulty: Easy-Medium
Prerequisites: Priority Queue/Set implementation

PROBLEM STATEMENT

Chef has a string S of length N consisting of only lower case English alphabets (characters of the string are indexed from 0 to N−1). He does the following in a single operation:

Finds a longest substring of S such that all it’s characters are the same and deletes a single character from this substring (After deleting the i-th character from the string, the part of S to the left of this character i.e. from the beginning of the string to i−1-th character, and that to it’s right i.e. i+1-th character to the end of the string, are concatenated).
Deletes the longest substring of S starting from index 0, such that all it’s characters are the same.

Chef performs this operation multiple times, until the string becomes empty. Find the maximum number of times he can perform this operation before the string vanishes.

Note that if during the i-th operation, the string vanishes after performing only the first step of the operation, the total number of operations performed before the string vanishes is still considered to be i and not i−1.

EXPLANATION:

The choice of the sub-string for step 1 of the operation mentioned in the problem statement determines how long the string would last without vanishing. As we need to find the maximum number of times the operation can be performed on a given string, we are trying to make it last longer.

The 2 cases we would encounter where choice of the longest sub-string, for step 1 of the operation, would be crucial are-

Case 1 - There are more than 1 such sub-strings possessing the maximum length, and one of them starts from index 0. In this case, choosing the sub-string starting from smaller index (i.e. 0) is favorable for step 1 of the operation mentioned in the question. To demonstrate this, consider s=aabba

If you remove one character from the sub-string that starts from the greater index:

operation 1-
- Removing a single ‘b’ from sub-string “bb” s=aaba
- Deleting sub-string starting from 0 such that all it’s characters are same s=ba.
operation 2-
- Removing a single ‘b’ from sub-string “b” s=a
- Deleting sub-string starting from 0 such that all it’s characters are same s=Φ.

or

Removing a single ‘a’ from sub-string “a” s=b
Deleting sub-string starting from 0 such that all it’s characters are same s=Φ.

2 operations were performed before the string emptied.

On the other hand,
If you remove one character from the one that starts from the smaller index:

operation 1-
- Removing a single ‘a’ from sub-string “aa” s=abba
- Deleting sub-string starting from 0 such that all it’s characters are same s=bba.
operation 2-
- Removing a single ‘b’ from longest sub-string “bb” s=ba
- Deleting sub-string starting from 0 such that all it’s characters are same s=a.
operation 3-
- Removing a single ‘a’ from longest sub-string “a” s=Φ.

3 operations were performed before the string emptied.

To generalize this behavior, consider string s with length N having two longest sub-string possessing length x each. One of these sub-strings starts from index 0 with length x and the other from index i with length x.

On performing one operation, if we remove one element from the sub-string starting from index i and then remove the sub-string starting from index 0 consisting of all same characters, we will be left with a string of length N-1-x after completion of the operation.

Whereas if we remove one element from the sub-string starting from index 0 and then remove the sub-string starting from index 0 consisting of all same characters, we will be left with a string of length N-x after completion of the operation.

As N-x-1 < N-x, we end up having a longer string which will potentially endure more number of operations.

Case 2 - Longest sub-string is of length 1. In this case also choosing the sub-string starting from index 0 is favorable, because if you choose characters other than the last character of the string or the first character of the string, you are running the risk of forming a potential prefix made up of same characters, possessing length greater than 1, after this character’s deletion. Such a recreated sub-string migh later on result in deletion of several elements together under step 2 of the operation. To demonstrate that deleting first or last character prevents this, consider s=aba

If you remove one character from the sub-string that does not start from 0 or N-1:

operation 1-
- Removing ‘b’ from sub-string “b” s=aa
- Deleting sub-string starting from 0 such that all it’s characters are same s=Φ.

1 operation was performed before the string emptied.

On the other hand,
If you remove one character from the sub-string that starts from 0:

operation 1-
- Removing ‘a’ from longest sub-string “a” s=ba
- Deleting sub-string starting from 0 such that all it’s characters are same s=a.
operation 2-
- Removing ‘a’ from longest sub-string “a” s=Φ.

or
If you remove one character from the sub-string that starts from N-1:

operation 1-
- Removing ‘a’ from longest sub-string “a” s=ab
- Deleting sub-string starting from 0 such that all it’s characters are same s=b.
operation 2-
- Removing ‘b’ from longest sub-string “b” s=Φ.

2 operations were performed before the string emptied.

Thus we will proceed by choosing the longest sub-string closest to the beginning of the string for step 1 of the operation, so as to obtain maximum number of operations possible.

APPROACH:

To implement the above explanation, we need to arrange the lengths of all sub-strings having all their characters same in a descending order. We also alongside this arrangement need the indices at which sub-strings with their respective arranged lengths start. Since we will be choosing the longest sub-string closest to the beginning of the string, we need these sub-strings to appear before those with same length but greater starting indices.

In simpler words, we need a container to store pairs of (length, starting index) of sub-strings consisting of same characters throughout. And we need this container to arrange these in descending order of lengths, with the condition that when lengths of 2 sub-strings are same, the one with smaller index appears before the one with larger index.

A priority queue of pairs would suit this purpose. And to suffice the ordering of same length strings in ascending order of indices i.e. lower index should have higher priority, we would insert the pairs of (length, -index) rather than (length, index). This would ensure that if index1 > index2, index2 would be placed at higher priority than index1 despite being smaller, as the inserted values in the pair are negatives of the original values and -index1 < -index2.

Once we have this container ready, we traverse this queue, with the purpose of counting the maximum operations possible:

Let i be the front of the string and p be the index corresponding to the string represented by the topmost element in the priority queue.
- If i \geq N, no more operations can be performed. Terminate the process.
- If p<i, pop the element and repeat step 1, otherwise move to step 2.
Pop out the first element from the priority queue, let x be the sub-string it represents.
Replace the last element of x with ‘#’ representing that it is no longer a part of the string.
The length of x now is reduced by 1, if it is greater than 0 after this decrement, reinsert a pair of this new length and the same index into the queue.
Let j=i+1
- if j is a valid index of the string and s[i]=s[j], substitute s[j]=‘#’, increment j and repeat step 4
- otherwise update s[i]=‘#’, update the new front of the string as i=j and move to step 1.

TIME COMPLEXITY:

O(N.log(N).T)

As the sum of N over all test cases does not exceed 2.10^6 ( i.e. N.T \leq 2.10^6 ) ,
The worst case would need O(4.10^7) for N=10^5 theoretically.

TIME ANALYSIS:

Part 1

Traversing the string s of length N, to store the lengths of sub-strings made up of same characters and their corresponding starting indices, takes O(N.log(N)) time. This is because using a priority queue of pairs, or a set of pairs for storing (length, index) pairs would both provide us with O(log(N)) insertion.

Although this can be reduced to O(N) if a priority queue is used and container parameter is provided as N. This is possible because building a heap from all items at once utilizes Floyd’s algorithm with O(N) complexity.
Refer to - building a heap.

Part 2

While performing the operations mentioned in the problem statement-

Looking for the longest sub-string such that all it’s characters are same takes O(1) time as the containers provide constant time lookup of the largest element.
Updating the (length, index) pair after deletion of one element from this sub-string takes O(log(N)) time (for either type of containers).
Since this operation is performed until the string vanishes, it may be performed N times in the worst case.
Let x be the number of characters deleted because of performing step 2 of the operation given in the problem statement throughout the runtime of the code. x \leq N.
Since x elements are deleted, they needn’t be involved in step 1 once deleted, this makes the number of times step 1 is performed to reduce to N-x.
Thus the second part of the code takes O((N-x).log(N)+x) time, which on neglecting smaller order terms leaves us with O(Nlog(N)).

Thus the code altogether would have O(2.Nlog(N)) complexity, which again on neglecting constants gives us O(Nlog(N)) time.

SOLUTIONS:

Setter's Solution

#include<bits/stdc++.h>
using namespace std;

#define _z ios_base::sync_with_stdio(false); cin.tie(NULL);
#define endl "\n"

void solve() {
    string s; 
    cin>>s;
    priority_queue <pair<int, int>> p1;
    int n=s.size(), ans=0, count=1;
    for(int i=1;i<n;i++) {
        if(s[i]==s[i-1]) count++;
        else {
            int index=i-count;
            p1.push({count,-index});
            count=1;
        }
    }
 
    int index=n-count;
    p1.push({count,-index});
 
    index=0;
    bool ok=true;
 
    while(ok) {
        if(index==n) break;
        ans++;
        while((p1.top().second*(-1))<index) p1.pop();
        auto itr=p1.top();
        p1.pop();
 
        int i=itr.second*(-1);
        i+=itr.first-1;
        s[i]='#';
 
        if(itr.first!=1) p1.push({itr.first-1,itr.second});
        while(index<n && s[index]=='#') index++;
 
        if(index!=n){
            char ch=s[index];
            while(index!=n && s[index]==ch){
                s[index]='#';
                index++;
            }
        }
 
        while(index!=n && s[index]=='#') index++;
    }
 
    cout<<ans<<endl;
}
 
int main(){
  _z;
  int t=1;
  cin>>t;
  while(t--) solve();
}

Tester's Solution

import heapq
t=int(input())
for _ in range(0,t):
        str= list(input())
        ch=str[len(str)-1]
        if str[len(str)-1]==ch:
            str.pop()
        empty = ['-']*len(str)
        l = 1;index=0
        pq = []
        n = len(str)
        for i in range(1,len(str)):
                if(str[i] == str[i-1]):
                        l+=1
                else:
                        index = i-l
                        #pq.put((-1*l,index))
                        heapq.heappush(pq,(-1*l,index))
                        l=1
 
        index = n-l
        heapq.heappush(pq,(-1*l,index))
        index = 0
        ans=0
        while(1):
            if(index == n): break
            ans += 1
            pr = list(heapq.heappop(pq))
            while(pr[1]<index):
                pr = list(heapq.heappop(pq))
            i = pr[1]
            i = i+(-1*pr[0])-1
            str[i] = '-'
            if(-1*pr[0]!=1):
                heapq.heappush(pq,(pr[0]+1,pr[1]))
            while(index<n and str[index] == '-'): index+=1
            if(index != n):
                c = str[index]
                while(index!=n and str[index]==c):
                    str[index] = '-'
                    index += 1
            while(index!=n and str[index]=='-'): index+=1
        print(ans,end="\n")

prabhu4 · May 30, 2021, 9:42pm

Great Explanation !