SPECIALSTR Editorial

PROBLEM LINK:

Setter: Ashish Gupta
Tester: Aryan Chaudhary
Editorialist: Rishabh Gupta

DIFFICULTY:

Easy-medium

PREREQUISITES:

Binary Lifting

PROBLEM:

Consider the following string of infinite length: abc...xyzabc...xyz...

Let string B be the prefix of this string of length M. For e.g. if M=6, B= abcdef.

Vedansh calls a string X of length K special if it satisfies the following conditions:

  • B is a substring of X. Formally, for some 1≤L≤R≤K , X_{L...R}=B

Vedansh has string A of length N (consisting of lowercase Latin letters only) and he wants to convert it into a special string. To do so he can perform the following operation:

  • Pick an i such that 1≤i≤|A| and delete A_i. The remaining parts of A are concatenated.

Since Vedansh is lazy, he wants to do this in minimum number of operations or determine if it can not be converted to a special string. Help Vedansh in doing so.

QUICK EXPLANATION:

What if we fix the position of the first character of B in X?

We will pick the remaining characters greedily by choosing the next character at its next occurrence. And thereby, the subsequence corresponding to B gets fixed. We can just delete the unwanted characters between the first and last character of B.

How to check optimally for all starting positions?

If we keep picking the next occurrence in X of the next character of B, we’ll have to make O(n) computations each time. To do this optimally we’ll have to store some information prior to this step.
We can store the position of later characters in multiple jumps.

Which technique and data structure to use to perform these operations optimally?

The binary lifting technique and the corresponding dp values provide this functionality to check positions of characters in jumps of 1,2,4 and so on. We can use this to find the position of character after M-1 positions and then pick the result that uses the minimum number of delete operations.

EXPLANATION:

The string B will appear in string X as a substring if there is a subsequence B in the string X. So corresponding to a particular subsequence B, we will have the delete the unwanted characters in between the first and last character to obtain B as a substring. We have to pick such a subsequence in which the difference between the position of the first and last character is minimum, thereby leading to the minimum deletions from X.

If we fix the position of the first character of B in X(in this case ‘a’), the next character should be chosen as near as possible to minimize the deleted characters. If we proceed in the above manner the whole subsequence gets fixed and hence the position of characters of B.
So we want to fix the first character at every position possible and check the difference b/w the first and last character of the resultant subsequence and pick the one with the least difference, hence the minimum deletions. If we carry out these operations individually and directly we would require O(n) time at each step giving time limit errors.

To carry out these operations optimally we have to make some calculations and store some info prior to this step. In the naive approach, we calculate the position of the next element every time and repeat this step. By invoking our knowledge of binary lifting, we try to calculate the position of the next to next element, the 4th next element, and so on. This can be easily calculated using the dynamic programming technique.

After storing these values we can calculate the (m-1)th next element from each starting point using the binary lifting technique and pick the subsequence that requires the minimum deletions.

TIME COMPLEXITY:

O(nlogn)

SOLUTION:

Setter's Solution
// Intended solution without input validation
// Use for editorial

#include <bits/stdc++.h>
using namespace std;

#define int long long
#define endl "\n"
#define sz(w) (int)(w.size())
using pii = pair<int, int>;

const long long INF = 1e18;

const int N = 1e6 + 5; 

void solve()
{
    int n; cin >> n;
    string s; cin >> s;
    s = '#' + s;
    int m; cin >> m;
    vector<int> occ(26, n + 1), par(n + 1);
    for(int i = n; i >= 1; i--)
    {
        int nxt = (s[i] - 'a' + 1) % 26;
        par[i] = occ[nxt];
        occ[s[i] - 'a'] = i; 
    }
    int lg = log2(m) + 1;
    vector<vector<int>> dp(lg + 1, vector<int>(n + 2, n + 1));
    for(int i = n; i >= 1; i--)
    {
        dp[0][i] = par[i];
        for(int j = 1; j <= lg; j++)
        {
            dp[j][i] = dp[j - 1][dp[j - 1][i]];
        }
    }
    int ans = INF;
    for(int i = 1; i <= n; i++)
    {
        if(s[i] != 'a')
            continue;
        int u = i, m_copy = m - 1;
        for(int j = lg; j >= 0; j--)
        {
            if((1 << j) <= m_copy)
                u = dp[j][u], m_copy -= 1 << j;
        }
        if(u != n + 1)
            ans = min(ans, u - i + 1 - m);
    }
    if(ans == INF)
        ans = -1;
    cout << ans << endl;
}

int32_t main()
{
    ios::sync_with_stdio(0); 
    cin.tie(0);
    int T; cin >> T;
    for(int tc = 1; tc <= T; tc++)
    {
        // cout << "Case #" << tc << ": ";
        solve();
    }
    return 0;
}
Editorialist''s Solution
#include<bits/stdc++.h>
using namespace std;
#define ll long long
#define dd double
#define endl "\n"
#define pb push_back
#define all(v) v.begin(),v.end()
#define mp make_pair
#define fi first
#define se second
#define vll vector<ll>
#define pll pair<ll,ll>
#define fo(i,n) for(int i=0;i<n;i++)
#define fo1(i,n) for(int i=1;i<=n;i++)
ll mod=1000000007;
ll n,k,t,m,q,flag=0;
ll power(ll a,ll b) {ll res=1;a%=mod; assert(b>=0); for(;b;b>>=1){if(b&1)res=res*a%mod;a=a*a%mod;}return res;}
// #include <ext/pb_ds/assoc_container.hpp> 
// #include <ext/pb_ds/tree_policy.hpp> 
// using namespace __gnu_pbds; 
// #define ordered_set tree<int, null_type,less<int>, rb_tree_tag,tree_order_statistics_node_update>
// ordered_set s ; s.order_of_key(a) -- no. of elements strictly less than a
// s.find_by_order(i) -- itertor to ith element (0 indexed)
ll min(ll a,ll b){if(a>b)return b;else return a;}
ll max(ll a,ll b){if(a>b)return a;else return b;}
ll gcd(ll a , ll b){ if(b > a) return gcd(b , a) ; if(b == 0) return a ; return gcd(b , a%b) ;}
int main() {
    ios_base::sync_with_stdio(0);
    cin.tie(0); cout.tie(0);
    #ifdef NOOBxCODER
    freopen("input.txt", "r", stdin);
    freopen("output.txt", "w", stdout);
    #else 
    #define NOOBxCODER 0
    #endif
    cin>>t;
    //t=1;
    while(t--){
        cin>>n;
        string s;
        cin>>s;
        cin>>m;

        vector<int>v[26];


        fo(i,n){
            v[s[i]-'a'].pb(i);
            //cout<<s[i]-'a'<<" "<<i<<endl;
        }

        int itr[26]={0};

        int nxt[n];
        fo(i,n){
            int j = (s[i]-'a' + 1 )%26; //cout<<j<<endl;
            while(1){
                if(v[j].size() == itr[j]){nxt[i]=-1; break;}
                if(v[j][itr[j]] <= i)itr[j]++;
                else{
                    nxt[i]= v[j][itr[j]];
                    break;
                }
            }
        }
        int dp[n][20];

        fo(i,n)dp[i][0]=nxt[i]; //cout<<nxt[i]<<" ";}cout<<endl;
        fo1(j,19){
            fo(i,n){
                if(dp[i][j-1]==-1)dp[i][j]=-1;
                else{
                    dp[i][j] = dp[dp[i][j-1]][j-1];
                }
                //if(j==1)cout<<dp[i][1]<<" ";
            }
        }
        //cout<<endl;


        int ans = 1e9;

        fo(i,n){
            if(s[i]!='a')continue;
            int st = i,k=m-1;
            fo(j,20){
                if(k%2==1)st = dp[st][j];
                if(st == -1) break;
                k/=2;
            }

            if(st!=-1)ans=min(ans, st-i+1 );
        }

        if(ans==1e9)cout<<-1<<endl;
        else cout<< ans- m<<endl;





        
    }
    cerr << "Time : " << 1000 * ((double)clock()) / (double)CLOCKS_PER_SEC << "ms\n";
    return 0;
}

14 Likes

Nice problem and Nice editorial :slight_smile:

1 Like

One of the best problems I have seen in a while!

4 Likes

I Loved it! Felt great after solving it!

1 Like

Completely agreed!!

1 Like

Is there a clean way to optimize this approach?
(It is getting TLE on the last 2 test cases).
My Submission.
Thanks.

1 Like

I tried Longest Common Subsequence DP approach but I got only 3 test cases working with that and rest either gave WA or TLE

Can anyone help me in identifying why I’m getting TLE although i used binary lifting?
@rishabhdevil
Code:

//here
ll is long long int
V is vector
this code is for just one case not for t test cases..
//

    ll n, m;
    cin >> n;
    string s;
    cin >> s >> m;
    V<ll> nxt(n+1);
    nxt[n] = n;
    ll l = ceil(log2(n));
    V<V<ll>> up(n+1,V<ll>(l+1)), idx(26);
    for(int i = 0; i < n; i++)
        idx[s[i]-'a'].pb(i);
    for(int i = 0; i < 26; i++) idx[i].pb(n);
    for(int i = 0; i < n; i++){
        ll j = s[i]-'a';
        j = (j+1)%26;
        ll ind = lower_bound(idx[j].begin(), idx[j].end(), i)-idx[j].begin();
        nxt[i] = idx[j][ind];
    }

    for(int j = 0; j <= l; j++){
        for(int i = 0; i < n; i++){
            up[i][j] = nxt[i];
            nxt[i] = nxt[nxt[i]];
        }
    }

    ll res = n+1;
    for(int i = 0; i < n; i++){
        ll tmp = m-1;
        ll u = i;
        for(int j = l; j >= 0; j--){
            if((1<<j) <= tmp){
                u = up[u][j];
                tmp -= (1<<j);
            }
            if(u == n)  break;
        }
        //cerr << u << " ";
        if(tmp > 0 || u == n)  continue;
        if((s[u]-'a') == (m-1+26)%26)
            res = min(res, u-i+1-m);
    }
    if(res == n+1)  res = -1;
    cout << res << endl;

THANK YOU!!!

Kudos to @ashishgup, Really nice problem!