BRKDRMS - Editorial

PROBLEM LINK:

Contest Division 1
Contest Division 2
Contest Division 3
Contest Division 4

Setter: S.Manuj Nanthan
Tester: Abhinav Sharma, Manan Grover
Editorialist: Lavish Gupta

DIFFICULTY:

Easy

PREREQUISITES:

Familiarity with comparing string lexicographically will be useful.

PROBLEM:

You are given a string S of length N, containing lowercase Latin letters. You are also given an integer K.

You would like to create a new string S' by following the following process:

  • First, partition S into exactly K non-empty subsequences S_1, S_2, \ldots, S_K. Note that every character of S must be present in exactly one of the S_i.
  • Then, create S' as the concatenation of these subsequences, i.e, S' = S_1 + S_2 + \ldots + S_K, where + denotes string concatenation.

Determine the lexicographically smallest S' that can be created.

EXPLANATION:

We can divide the solution in two parts, when K = 2 and when K > 2.

Creating lexicographically smallest string when K > 2

Let \alpha represents the lexicographically smallest character, and \beta represents the second lexicographically smallest character in the string.

We want to choose a subsequence such that the resulting string is lexicographically smallest. To do this, we will greedily choose all the occurrences of \alpha in our first subsequence. Let the last occurrence of this character be ind.

What to choose from the remaining string after choosing the all the lexicographically smallest characters?

We will start choosing all the occurrences of \beta that occurs after the index ind. If \beta also occurs before the index ind, then we are done with this subsequence. Otherwise, we have selected all the occurrences of \beta, and continue with the third lexicographically smallest character.

Once we have chosen all the characters for the first subsequence, we can remove these characters from the string, decrease K by 1, and continue our problem.

Creating lexicographically smallest string when K = 2

We want to choose a subsequence such that the resulting string is lexicographically smallest. To do this, we will first consider the lexicographically smallest character in the string. We will greedily choose all these characters in our first subsequence.

What to choose from the remaining string after choosing the all the lexicographically smallest characters?

Suppose till now, we have chosen the complete prefix of our string as part of first sub-sequence. In that case, we have a remaining smaller string and we can solve it by starting the logic from the beginning.

Otherwise, we have atleast one character that we have not taken in the sub-sequence. The first such character will be at the beginning of the second subsequence. Let us represent this character by \alpha. Now, we consider one of the character \beta of the remaining string.
If that character is lexicographically greater than \alpha, we will not take that character in the first subsequence greedily, so that we are not placing it before \alpha in the final string.
Now, consider the case when \beta is lexicographically smaller than \alpha. If \beta is the lexicographically smallest element in the remaining string, we will take \beta in the first subsequence. Otherwise, we will skip this character in order to take the lexicographically smaller character in the sub-sequence.

Note that the condition subsequences should be non-empty is redundant. If we can find a solution when empty sub-sequences are also allowed, we can convert that solution into non-empty sub-sequences by breaking down our previous sub-sequences.

TIME COMPLEXITY:

We need at most 26 non-empty subsequences to get our answer. So we can implement the above approach in O(N \cdot 26) for each test case.

SOLUTION:

Setter's Solution
#from itertools import *
#from math import *
#from bisect import *
#from collections import *
#from random import *               # ( : ): ( :) : (: ) : (  ): ) : )
#from decimal import *
#from heapq import *
#from itertools import *            # Things Change ....remember :)
import sys
input=sys.stdin.readline
def inp():
    return int(input())
def st():
    return input().rstrip('\n')
def lis():
    return list(map(int,input().split()))
def ma():
    return map(int,input().split())
t=inp()
while(t):
    t-=1
    n,k=ma()
    s=st()
    z=set(list(s))
    if(len(z)<=k):
        print(''.join(sorted(list(s))))
        continue
    if(k==1):
        print(s)
        continue
    freq=[0]*26
    for i in s:
        freq[ord(i)-97]+=1
    ind={chr(i+97):[] for i in range(26)}
    for i in range(n):
        ind[s[i]].append(i)
    res=''
    vis=[0]*n
    flag=0
    total=0
    for i in range(k-1):
        j=-1
        valid=0
        if(total==n):
            break
        while(j<n):
            while(valid<26 and freq[valid]==0):
                valid+=1
            if(valid>=26):
                break
            cur=chr(97+valid)
            if(ind[cur][-1] <j):
                break
            ex=ind[cur][-1]
            while(ind[cur] and ind[cur][-1]>j):
                total+=1
                x=ind[cur].pop()
                vis[x]=1
                res+=cur
                freq[valid]-=1
            j=ex
    left=''
    for i in range(1+j):
        if(vis[i]):
            continue
        left+=s[i]
    right=''
    for i in range(j+1,n):
        if(vis[i]):
            continue
        right+=s[i]
    cur_pos=0
    rpart=[[] for i in range(26)]
    for i in range(len(right)):
        z=ord(right[i])-97
        rpart[z].append(i)
    for i in range(26):
        rpart[i]=rpart[i][::-1]
    prev=-1
    if(left=='' and right==''):
        print(res)
        continue
    while(1):
        minchar=999
        cmpord=ord(left[cur_pos])-97
        #handle case where left[0] < right[j]
        for i in range(26):
            if(len(rpart[i])):
                if(i<=cmpord):
                    minchar=i
                    break
        if(minchar==999):       #case 2 : left[0] < smallest right chars
            res+=left
            res+=right[prev+1:]
            break
        elif(minchar!=cmpord):
            last=rpart[minchar][0]
            ok_char=chr(minchar+97)
            res+=ok_char*(len(rpart[minchar]))
            for i in range(prev+1,last+1):
                if(right[i]!=ok_char):
                    left+=right[i]
            for i in range(26):
                while(rpart[i] and rpart[i][-1]<=last):
                    rpart[i].pop()
            prev=last
        else:
            #case 3 : left[0]==smallest right char
            string1=res+left+right[prev+1:]
            ok_char=chr(minchar+97)
            last=rpart[minchar][0]
            for i in range(prev+1,last+1):
                if(right[i]!=ok_char):
                    left+=right[i]
            ok_char=chr(minchar+97)*len(rpart[minchar])
            string2=res+ok_char+left+right[last+1:]
            res=min(string1,string2)
            break
    print(res)
Tester-1's Solution
#include <bits/stdc++.h>
using namespace std;
int main(){
  ios_base::sync_with_stdio(false);cin.tie(NULL);cout.tie(NULL);
  #ifndef ONLINE_JUDGE
  freopen("input.txt", "r", stdin);
  freopen("output.txt", "w", stdout);
  #endif
  int t;
  cin>>t;
  while(t--){
    int n, k;
    cin>>n>>k;
    string s;
    cin>>s;
    string u = s;
    sort(u.begin(), u.end());
    if(k >= 26){
      cout<<u<<"\n";
    }else{
      int x = 0;
      for(int i = 0; i < k - 2; i++){
        for(int j = 0; j < n; j++){
          if(s[j] == u[x]){
            cout<<s[j];
            s[j] = '#';
            x++;
          }
        }
      }
      if(k > 1){
        int a[n] = {};
        char pr = 'z';
        for(int i = n - 1; i > -1; i--){
          if(s[i] != '#'){
            if(s[i] <= pr){
              a[i] = 1;
              pr = s[i];
            }
          }
        }
        char y = '#', z = '#';
        for(int i = 0; i < n; i++){
          if(s[i] != '#'){
            if(a[i] == 0){
              if(y == '#'){
                y = s[i];
              }else if(s[i] != y){
                z = s[i];
                break;
              }
            }
          }
        }
        if(z != '#' && z < y){
          y--;
        }
        if(y != '#'){
          for(int i = 0; i < n; i++){
            if(a[i]){
              if(s[i] <= y){
                cout<<s[i];
                s[i] = '#';
              }
            }
          }
        }
      }
      for(int i = 0; i < n; i++){
        if(s[i] != '#'){
          cout<<s[i];
        }
      }
      cout<<"\n";
    }
  }
  return 0;
}
Tester-2's Solution
#include <bits/stdc++.h>
using namespace std;
 
 
/*
------------------------Input Checker----------------------------------
*/
 
long long readInt(long long l,long long r,char endd){
    long long x=0;
    int cnt=0;
    int fi=-1;
    bool is_neg=false;
    while(true){
        char g=getchar();
        if(g=='-'){
            assert(fi==-1);
            is_neg=true;
            continue;
        }
        if('0'<=g && g<='9'){
            x*=10;
            x+=g-'0';
            if(cnt==0){
                fi=g-'0';
            }
            cnt++;
            assert(fi!=0 || cnt==1);
            assert(fi!=0 || is_neg==false);
 
            assert(!(cnt>19 || ( cnt==19 && fi>1) ));
        } else if(g==endd){
            if(is_neg){
                x= -x;
            }
 
            if(!(l <= x && x <= r))
            {
                cerr << l << ' ' << r << ' ' << x << '\n';
                assert(1 == 0);
            }
 
            return x;
        } else {
            assert(false);
        }
    }
}
string readString(int l,int r,char endd){
    string ret="";
    int cnt=0;
    while(true){
        char g=getchar();
        assert(g!=-1);
        if(g==endd){
            break;
        }
        cnt++;
        ret+=g;
    }
    assert(l<=cnt && cnt<=r);
    return ret;
}
long long readIntSp(long long l,long long r){
    return readInt(l,r,' ');
}
long long readIntLn(long long l,long long r){
    return readInt(l,r,'\n');
}
string readStringLn(int l,int r){
    return readString(l,r,'\n');
}
string readStringSp(int l,int r){
    return readString(l,r,' ');
}
 
 
/*
------------------------Main code starts here----------------------------------
*/
 
const int MAX_T = 1e5;
const int MAX_N = 1e5;
const int MAX_SUM_LEN = 1e5;
 
#define fast ios_base::sync_with_stdio(0); cin.tie(0); cout.tie(0)
#define ff first
#define ss second
#define mp make_pair
#define ll long long
#define rep(i,n) for(int i=0;i<n;i++)
#define rev(i,n) for(int i=n;i>=0;i--)
#define rep_a(i,a,n) for(int i=a;i<n;i++)
 
int sum_n = 0, sum_m = 0;
int max_n = 0, max_m = 0;
int yess = 0;
int nos = 0;
int total_ops = 0;


void solve()
{   
    int n = readIntSp(1,1e5);
    int k = readIntLn(1,1e5);
    
    string s = readStringLn(n,n);
    for(auto h:s) assert(h>='a' && h<='z');

    if(k==1){
        cout<<s<<'\n';
        return;
    }


    int done[n] = {0};
    vector<vector<int> > v(26,vector<int>());

    rep(i,n){
        v[(int)(s[i]-'a')].push_back(i);
    }

    string ans = "";
    int cnt=n;
    int l=0;
    int curr=0;

    while(k>1 && cnt){
        k--;

        curr=-1;

        while(l<26 && (v[l].empty() || v[l][0]>curr)){
            rep(i,v[l].size()){
                ans += (char)('a'+l);
                done[v[l][i]]=1;
            }
            cnt -= v[l].size();
            if(!v[l].empty()) curr = v[l].back();
            l++;
        }
        if(l<26){
            int tmp = v[l].back();
            while(v[l].back()>curr){
                ans += (char)('a'+l);
                done[v[l].back()]=1;
                v[l].pop_back();
                cnt--;
            }
            curr=max(curr, tmp);
        }
    }

    if(cnt){
        l=0;

        while(done[l]) l++;

        vector<int> vv;
        while(curr<n){
            if(done[curr]){
                curr++;
                continue;
            }
            if(s[curr]>s[l]){
                curr++;
                continue;
            }
            while(!vv.empty() && s[vv.back()]>s[curr]) vv.pop_back();
            vv.push_back(curr);
            curr++;
        }

        reverse(vv.begin(), vv.end());

        while(!vv.empty() && s[vv.back()]<s[l]){
            ans+=s[vv.back()];
            done[vv.back()]=1;
            vv.pop_back();
        }

        if(!vv.empty()){
            int l1=l;
            while(l<n && (s[l]==s[l1] || done[l])) l++;
            if(l==n || (l<n && s[l]>s[l1])){
                while(!vv.empty()){
                    ans+=s[vv.back()];
                    done[vv.back()]=1;
                    vv.pop_back();
                }
            }
        }

        rep(i,n){
            if(!done[i]) ans+=s[i];
        }


    }

    cout<<ans<<'\n';

}
 
signed main()
{

    #ifndef ONLINE_JUDGE
    freopen("input.txt", "r" , stdin);
    freopen("output.txt", "w" , stdout);
    #endif
    fast;
    
    int t = 1;
    
    t = readIntLn(1,1000);
    
    for(int i=1;i<=t;i++)
    {    
       solve();
    }
   
    //assert(getchar() == -1);
    assert(sum_n<=3e5);
 
    cerr<<"SUCCESS\n";
    cerr<<"Tests : " << t << '\n';
    cerr<<"Sum of lengths : " << sum_n <<'\n';
    cerr<<"Maximum length : " << max_n <<'\n';
    // cerr<<"Total operations : " << total_ops << '\n';
    //cerr<<"Answered yes : " << yess << '\n';
    //cerr<<"Answered no : " << nos << '\n';
}

Can someone please explain what is wrong in below code or provide testcases where the code fails

#include <bits/stdc++.h>
#define int long long int

void helper(std::string s, std::vector <std::pair<char,int>> &v, int &j, int k, std::string &x, std::string &y){
    char c = v[j].first;
    int idx = 0;
    for(auto i:s){
        if(i == c){
            break;
        }
        idx += 1;
    }
    x = s.substr(0,idx);
    y += c;
    j += 1;
    idx += 1;
    //std::cout << j << " " << idx << " " << y << " " << s << "\n";
    while(s[idx] != '\0'){
        if(s[idx] == v[j].first || (k-2 == 0 &&  !x.empty() && x[0] >= s[idx] && y.back() != v[j].first)){
            if(s[idx] == v[j].first)
                j += 1;
            y += s[idx];
        }
        else
            x += s[idx];
        idx += 1;
        //std::cout << j << " " << idx << " " << y << " " << x << "\n";
    }
}

void solve(){
    int n, k;
    std::cin >> n >> k;
    std::string s;
    std::cin >> s;
    std::vector <std::pair<char,int>> v;
    int j = 0;
    for(auto i:s){
        v.push_back({i, j});
        j += 1;
    }
    std::sort(v.begin(), v.end(), [](std::pair<char,int> a, std::pair<char,int> b){
        if(a.first == b.first)
            return a.second < b.second;
        return a.first < b.first;
    });
    //for(auto i:v){
        //std::cout << i.first << " " << i.second << "\n";
    //}
    j = 0;
    std::string x = "", y = "", ans = "";
    while(k-1 > 0 && j < n){
        //std::cout << s << " ";
        helper(s, v, j, k, x, y);
        //std::cout << x << " " << y << " " << j << "\n";
        ans += y;
        s = x;
        x = "";
        y = "";
        k -= 1;
    }
    //std::cout << s << " ";
    std::cout << ans+s << "\n";
}
     
signed main() {
    std::ios::sync_with_stdio(false);
    std::cin.tie(0);
    int t = 0;
    std::cin >> t;
    while(t--){
        solve();
    }
}
2
6 2
acbacc
6 2
acdacc

The answer should be:

aacbcc
aacccd
4 Likes

Thanks bro

@lavish_adm Your editorial is incomplete. What if \alpha = \beta in the case K = 2? It is impossible to decide whether to take \beta or not without looking at the rest of the symbols.

For example, consider s = "abcab" and s = "acbac". In the former case, the optimal partition is "aab" + "bc" (take second 'b'); but in the latter case, the optimal partition is "aa" + "cbc" (do not take second 'c').

Another example of s = "acdbac" with optimal partition "aac" + "cdb" shows that the argument

If \beta is the lexicographically smallest element in the remaining string, we will take \beta in the first subsequence. Otherwise, we will skip this character.

also does not work, as it is optimal to take 'c' which is not lex. min. (there is a 'b').

Please complete the editorial, as I genuinely don’t understand the specific details of your solution.

10 Likes

can you tell me what’s wrong with my code?
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int compare(const void *pa, const void *pb)
{
const char *p1 = pa;
const char *p2 = pb;
return *p1 - *p2;
}
int main()
{
int t;
scanf("%d", &t);
int n, k;
while (t > 0)
{
scanf("%d %d", &n, &k);
char string_sorted[n + 1];
scanf("%s", string_sorted);
char string_unsorted[n + 1];
strcpy(string_unsorted, string_sorted);

    qsort(string_sorted, n, sizeof(char), compare);

    int used_up[100000] = {0};

    int partition = 0;
    for (int i = 0; partition < k - 1; i++)
    {
        int end_index = 0;
        for (int j = 0; j < n; j++)
        {
            while (string_unsorted[j] == string_sorted[i] && used_up[j] == 0 && j < n && i < n)
            {
                printf("%c", string_unsorted[j]);
                used_up[j]++;
                end_index = j;
                j++;
                if(j<n)
                    i++;
            }
            //printf("%d",j);
            if(j==n)
                break;
            // if (j >= n)
            //     j = 0;
        }
        if (end_index < n - 1 && partition==k-2)
        {
            int count=0;
            for (int i = 0; i < n && end_index+1<n && count<1; i++)
            {
                while (string_unsorted[end_index + 1] <= string_unsorted[i] && used_up[i] == 0 && end_index+1<n && used_up[end_index+1]==0)
                {
                    printf("%c", string_unsorted[end_index+1]);
                    used_up[end_index+1]++;
                    // i++;
                    end_index++;
                    count=1;
                }
            }
        }

        partition++;
    }

    for (int i = 0; i < n; i++)
    {
        if (used_up[i] == 0)
            printf("%c", string_unsorted[i]);
    }
    printf("\n");

    t--;
}

return 0;

}

please help me with some testcases,
for these cases,
8
6 1
whizzy
14 2
aaacdecdebacde
4 3
dbca
1 4
z
11 3
jakeperalta
5 3
zebra
5 2
abcab
5 2
acbac

my output is:-
whizzy
aaaaccdecdebde
abcd
z
aaaeejkprlt
abrze
aabbc
aacbc

my solution link
https://www.codechef.com/viewsolution/59556376

What if α = β in the case K = 2?

Suppose that we have a string s such that:

s = β+r1+ch+r2

Lets assume that r1 and r2 are random strings such that exists a subsequence of βs that belongs to r2 and no character in r2 is smaller than β. Lets also assume that ch is the smallest character of string s. Without lost of generality, we can say that ch is unique.

Since ch is the smallest character in the string, our first choice for the first subsequence will be ch and then other possible options that belongs to r2.

Once we choose ch we don’t have more ch’s (since is unique) but we have βs in r2 which implies that α = β. Here we have two paths that can lead us to the optimal solution:

  1. Put all occurences of βs that are in r2 to our first subsequence and then put all remaining characters of s in the second subsequence. We will have a resulting string looking like ch+ββββ…+β+r1+r, where r is a subsequence of all characters of r2 that are not βs;

  2. Just put all remaining characters of s in the second subsequence. We will have a resulting string looking like ch+β+r1+r2.

We can easily generate both strings and return as answer the lexicografically smallest.

Proof of why we need to put all βs of r2 in the first subsequence in Path 1:

As showed above, if Path 1 lead us to the optimal answer the resulting string will look like ch+ββββ…+β+r1+r which implies that ββββ…+β+r1 is lexicografically smaller than β+r1 (that are our first choices in Path 2) which means that ββββ… is also lexicografically smaller than r1. Since in r2 doesn’t exist any character smaller than β, leaving one β out of first subsequence is losing the oportunity to make the resulting string smaller.

Hey, here is one of the cases where your code fails:

1
5 2
bbacs

The correct answer here would be by dividing the string into “bb” & “acs” and then rearranging it into

acsbb

But, your code outputs:

abbcs

Hey, your code has been scrambled while pasting. It won’t even compile. Please paste the whole code at once and comment it again so I can help you.

Hey, the reason your code is failing is because there are some logical errors in it.
For example in this case:

1
7 3
aapbaac

your code outputs:

aaaacbp

while the output should be:

aaaabcp

Your code consider adding ‘c’ in the first iteration itself, even when k was large enough that you ‘b’ could have been placed before ‘c’.