STRKSORT - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3

Author: Evgeny
Tester: Rahul Dugar
Editorialist: Aman Dwivedi

DIFFICULTY:

Medium

PREREQUISITES:

Observation

PROBLEM:

You are given a string S of length N and an integer K. Let’s consider all strings that can be obtained by erasing K consecutive characters from the original string. Let f(i) be the string that is obtained by erasing the characters at positions i, i+1, \ldots, i + K - 1.

Consider a permutation P of the integers 1 through N - K + 1. You need to find the lexicographically smallest permutation P such that f(P_i) \leq f(P_{i+1}).

Since the length of this permutation can be too large, you should only compute its hash: \sum_{i=1}^{N-K+1} P_i \cdot i modulo 10^9 + 7.

EXPLANATION:

We are given a string S of length N and an integer K. Let’s consider all the strings that can be obtained by erasing K consecutive characters from the original string S as F_1, F_2, F_3, \dots F_M.

Claim 1: There are at most two characters possible for index j of any string F_i.

Proof

Let us find the possibility of j^{th} character for any string F_i. There are two cases possible:

Case 1: i>j

The string F_i is obtained when K consecutive characters are removed from the string S starting from index i. Since i>j hence the j^{th} character will be present before the index i and hence it won’t be affected due to the removal of K consecutive characters.

So, the j^{th} character of such string is the same as that of the j^{th} character of the original string S i.e

F[i][j]=S[j]

Case 2: i \le j

Here i \le j, it means that the j^{th} character of such strings will be affected due to the removal of K consecutive characters. Let us try to find the j^{th} character of such strings:

We can take i-1 characters from the beginning, and the remaining characters after deletion of K consecutive characters. Hence it will come from the index:

(i-1)+ K + (j-(i-1))
j+K

So, the j^{th} character of such string is equal to the (j+k)^{th} character of the original string S i.e

F[i][j]=S[j+K]

Hence, there are almost two characters possible for index j of any string F_i

From the above proof, we can also say that the j^{th} character of strings F_1 to F_j is S_{j+K} while for strings F_{j+1} to F_M is S_j.

Claim 2: If strings F_1 to F_i has same prefix of length i-1, then the strings will be identical.

Proof

As we had already proved that i^{th} character of strings F_1 to F_i will be same i.e S_{i+K}. For any index j \ge i, the character present at index j in any of the strings F_1 to F_i will be same i.e S_{j+K}. So all the characters from index [i,M] are same and the prefix of length i-1 for these strings are already same.

It means that all the characters of these strings are same, hence all such strings are identical.

Now from Claim 1 we know that at index j of any string F_i there are only two characters possible that are S_{j+K} (for strings F_1 to F_j) and S_j (for strings F_{j+1} to F_M). Hence we need to match only the j^{th} character with (j+K)^{th} character in original strings.

If the characters at both the index comes out to be same then we say that all the strings that can be obtained have the same character at their index j. Let us try to find such indices, j where the j^{th} character doesn’t matches with (j+K)^{th} character.

Suppose we have a string S and the indices where the characters are not same are x_1, x_2, \dots x_m. Let us try to see what happens at index x_1:

Index x_1:

  • This is the first index where the characters are not identical i.e. S[x_1] \neq S[x_1+K]. It means that the prefix of length (x_1-1) is same for strings F_1 to F_{x_{1}}. Hence from Claim 2 we say that strings F_1 to F_{x_{1}} are identical.

  • From Claim 1, we say that x_1^{th} character of strings F_1 to F_{x_{1}} is S_{x_1+K} while for strings F_{x_1+1} to F_M is S_{x_1}.

  • If S_{x_1+K} > S_{x_1} we say that the strings F_1 to F_{x_{1}} are lexicographically larger than all the remaining strings i.e. F_{x_1+1} to F_M. Hence we place all these strings greedily from the bottom of the permutation array where vacant space is found.

  • If S_{x_1+K} < S_{x_1} we say that the strings F_1 to F_{x_{1}} are lexicographically smaller than all the remaining strings i.e. F_{x_1+1} to F_M. Hence we place all these strings greedily from the bottom of the permutation array where vacant space is found.

Now all the strings F_1 to F_{x_{1}} are placed and the remaining strings has same prefix of length x_1. We can similarly proceed for the remaining indices and fill up our permutation table.

Once we have operated in all such indices, we compute the hash of the permutation table and output that hash value.

TIME COMPLEXITY:

O(N) per test case

SOLUTIONS:

Setter
#include<bits/stdc++.h>
 
using namespace std;
 
typedef long long ll;
int const maxn = 1e7 + 5;
int ans[maxn];
ll mod = 1e9 + 7;
 
main() {
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    cout.tie(0);
    int t;
    cin >> t;
    while (t--) {
        int n, k;
        cin >> n >> k;
        string s;
        cin >> s;
        s = "#" + s;
        int l = 1, r = n - k + 1;
        for (int i = 1; i <= n - k + 1;) {
            int pref = i;
            while (pref + k <= n && s[pref] == s[pref + k]) pref++;
            if (pref + k == n + 1 || s[pref + k] < s[pref]) {
                for (int j = i; j <= pref; ++j) ans[l++] = j;
            }
            else {
                for (int j = pref; j >= i; --j) ans[r--] = j;
            }
            i = pref + 1;
        }
        ll out = 0;
        for (int i = 1; i <= n - k + 1; ++i) {
            out = (out + (ll)i * (ll)ans[i]) % mod;
        }
        cout << out << '\n';
    }
    return 0;
}
 
Tester
#pragma GCC optimize("Ofast")
#include <bits/stdc++.h>
using namespace std;
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <ext/rope>
using namespace __gnu_pbds;
using namespace __gnu_cxx;
#ifndef rd
#define trace(...)
#define endl '\n'
#endif
#define pb push_back
#define fi first
#define se second
#define int long long
typedef long long ll;
typedef long double f80;
typedef pair<int,int> pii;
typedef pair<double,double> pdd;
//#define double long double
#define pll pair<ll,ll>
#define sz(x) ((long long)x.size())
#define fr(a,b,c) for(int a=b; a<=c; a++)
#define rep(a,b,c) for(int a=b; a<c; a++)
#define trav(a,x) for(auto &a:x)
#define all(con) con.begin(),con.end()
const int infi=0x3f3f3f3f;
const ll infl=0x3f3f3f3f3f3f3f3fLL;
//const int mod=998244353;
const int mod=1000000007;
typedef vector<int> vi;
typedef vector<ll> vl;
 
typedef tree<int, null_type, less<int>, rb_tree_tag, tree_order_statistics_node_update> oset;
auto clk=clock();
mt19937_64 rang(chrono::high_resolution_clock::now().time_since_epoch().count());
int rng(int lim) {
	uniform_int_distribution<int> uid(0,lim-1);
	return uid(rang);
}
int powm(int a, int b) {
	int res=1;
	while(b) {
		if(b&1)
			res=(res*a)%mod;
		a=(a*a)%mod;
		b>>=1;
	}
	return res;
}
 
 
long long readInt(long long l, long long r, char endd) {
	long long x=0;
	int cnt=0;
	int fi=-1;
	bool is_neg=false;
	while(true) {
		char g=getchar();
		if(g=='-') {
			assert(fi==-1);
			is_neg=true;
			continue;
		}
		if('0'<=g&&g<='9') {
			x*=10;
			x+=g-'0';
			if(cnt==0) {
				fi=g-'0';
			}
			cnt++;
			assert(fi!=0 || cnt==1);
			assert(fi!=0 || is_neg==false);
 
			assert(!(cnt>19 || ( cnt==19 && fi>1) ));
		} else if(g==endd) {
			if(is_neg) {
				x=-x;
			}
			assert(l<=x&&x<=r);
			return x;
		} else {
			assert(false);
		}
	}
}
string readString(int l, int r, char endd) {
	string ret="";
	int cnt=0;
	while(true) {
		char g=getchar();
		assert(g!=-1);
		if(g==endd) {
			break;
		}
		cnt++;
		ret+=g;
	}
	assert(l<=cnt&&cnt<=r);
	return ret;
}
long long readIntSp(long long l, long long r) {
	return readInt(l,r,' ');
}
long long readIntLn(long long l, long long r) {
	return readInt(l,r,'\n');
}
string readStringLn(int l, int r) {
	return readString(l,r,'\n');
}
string readStringSp(int l, int r) {
	return readString(l,r,' ');
}
 
int sum_n=0;
void solve() {
	int n,k;
	n=readIntSp(1,10000000),k=readIntLn(1,n);
	sum_n+=n;
	assert(sum_n<=10000000);
	string s=readStringLn(n,n);
	for(char i:s)
		assert('a'<=i&&i<='z');
	deque<int> pp;
	bool small=1;
	pp.push_back(sz(s)-k+1);
	vi latom;
	for(int i=sz(s)-1; i>=k; i--) {
		if(s[i]<s[i-k]) {
			while(sz(latom)) {
				pp.push_back(latom.back());
				latom.pop_back();
			}
			small=1;
			pp.push_front(i-k+1);
		} else if(s[i]>s[i-k]) {
			while(sz(latom)) {
				pp.push_back(latom.back());
				latom.pop_back();
			}
			small=0;
			latom.pb(i-k+1);
		} else {
			if(small==0)
				latom.pb(i-k+1);
			else
				pp.push_front(i-k+1);
		}
	}
	while(sz(latom)) {
		pp.push_back(latom.back());
		latom.pop_back();
	}
	int troll=0,p=1;
	for(int i:pp)
		troll=(troll+(i*p++))%mod;
	cout<<troll<<endl;
}
signed main() {
	ios_base::sync_with_stdio(0),cin.tie(0);
	srand(chrono::high_resolution_clock::now().time_since_epoch().count());
	cout<<fixed<<setprecision(10);
	int t=readIntLn(1,100000);
//	int t;
//	cin>>t;
	fr(i,1,t)
		solve();
	assert(getchar()==EOF);
#ifdef rd
	cerr<<endl<<endl<<endl<<"Time Elapsed: "<<((double)(clock()-clk))/CLOCKS_PER_SEC<<endl;
#endif
}
 
Editorialist
#include<bits/stdc++.h>
using namespace std;

#define int long long

const int mod=1e9+7;

void solve()
{
  int n,k;
  cin>>n>>k;

  string s;
  cin>>s;

  vector <pair<int,int>> bad_index;

  for(int i=0;i<n;i++)
  {
    if(i+k<n && s[i]!=s[i+k])
    {
      if(s[i+k]>s[i])
        bad_index.push_back({i,-1});
      else
        bad_index.push_back({i,1});
    }
  }

  bad_index.push_back({n-k,1});

  int ans[n-k+1];
  int l=0,r=n-k;

  int start=0;

  for(auto itr: bad_index)
  {
    if(itr.second==1)
    {
      for(int i=start;i<=itr.first;i++)
        ans[l++]=i+1;
    }
    else
    {
      for(int i=itr.first;i>=start;i--)
        ans[r--]=i+1;
    }

    start=itr.first+1;
  }

  int sum=0;

  for(int i=1;i<=n-k+1;i++)
    sum=(sum+(i*ans[i-1]))%mod;

  cout<<sum<<"\n";
}

int32_t main()
{
  ios_base::sync_with_stdio(0);
  cin.tie(0);

  int t;
  cin>>t;

  while(t--)
    solve();

return 0;
}

Can anyone please explain/elaborate the problem statement?

Such a nice question, seemed complex at first, but if you visualize how the strings needed to be compared and understand it and handle the exception case, so simple to code.
Liked it a lot :+1:

1 Like