 # PASHA - Editorial

Author: neerav_1
Testers: iceknight1093, rivalq
Editorialist: iceknight1093

TBD

# PREREQUISITES:

Prefix sums, maps

# PROBLEM:

You’re given a string A and an integer K.
A character c is called good with respect to a substring B of A if
2\cdot \text{freq}(B, c) - |B| = K

Find the sum of the count of good characters across all substrings.

# EXPLANATION:

Instead of fixing a substring and counting the number of good characters it has, let’s fix a character and count the number of substrings it’s good for.

To this end, let’s analyze what it means for character c to be good for substring A[i\ldots j].

Analysis

The length of this substring is j-i+1, so we want
2\cdot \text{freq}(A[i\ldots j], c) - (j-i+1) = K.

Let \text{pref}_c[i] be the number of occurrences of c in A[1\ldots i].
We have \text{freq}(A[i\ldots j], c) = \text{pref}_c[j] - \text{pref}_c[i-1], so the above equation then reduces to

2\cdot(\text{pref}_c[j] - \text{pref}_c[i-1]) - (j-i+1) = K

which with a bit of rearrangement becomes

(2\cdot \text{pref}_c[j] - j) - K = 2\cdot\text{pref}_c[i-1] - (i-1)

Notice that the left side depends only on j, while the right side depends only on i.
We’ll use this fact to compute the answer.

For a fixed character c, we want to compute the number of pairs (i, j) such that the equation

(2\cdot \text{pref}_c[j] - j) - K = 2\cdot\text{pref}_c[i-1] - (i-1)

holds. Also, we need to ensure that c actually occurs in A[i\ldots j].

This can be done using a map, as follows:

• Let \text{ct} be a map, initially empty.
• \text{ct}[x] will denote the number of indices i so far such that 2\cdot\text{pref}_c[i-1]-(i-1) = x
• Iterate j from 1 to N.
• For a fixed j, the number of valid i \leq j is simply \text{ct}[(2\cdot \text{pref}_c[j] - j) - K], because that’s exactly what our map holds. So, add that value to the overall answer.
• Now we need to update our map with index j, which corresponds to increasing \text{ct}[2\cdot \text{pref}_c[j] - j]. by 1.

This almost solves the problem: the only thing we haven’t done is account for the fact that c needs to occur in the range.
Notice that, in terms of \text{pref}_c, we essentially need to ensure that \text{pref}_c[i-1] \lt \text{pref}_c[j].

There are several ways to accomplish this.
One way is to delay updates to the map. That is,

• Keep a list of updates that need to be performed, say L. Initially, L = , corresponding to 2\cdot \text{pref}_c - 0 obtained from i = 0 (note that we’re treating the string as 1-indexed here, so index 0 is ‘before’ the string starts).
• At index j, if S_j = c, perform all updates that are pending in L, and then clear L. This is because S_j = c means \text{pref}_c[j] is higher than any value before it.
• Afterwards, append 2\cdot \text{pref}_c[j] - j to L as a pending update.

By delaying updates in this manner, we obtain exactly what we need: the number of substrings for which character c is good.
Notice that while we may perform many updates at once, their total number is still at most N, since we do at most one for each index.
For a fixed c, this takes \mathcal{O}(N) or \mathcal{O}(N\log N) time depending on the type of map used.

Doing this separately for each c from a to z gives us a solution in \mathcal{O}(26N\log N).

# TIME COMPLEXITY

\mathcal{O}(26N) or \mathcal{O}(26N\log N) per test case.

# CODE:

Setter's code (C++)
#include <bits/stdc++.h>
using namespace std;

long long int goodCharacters(int N, int K, string &s)
{
long long int ans = 0;
for(char i = 'a'; i <= 'z'; ++i)
{
unordered_map<int, int> mp;
if(s==i)
{
mp = 1;
}
int temp = 0;
int index = 0;
while(index < N)
{
int j = index;
int temp1 = temp;
while(j < N && (j==index || s[j]!=i))
{
if(s[j]==i)
{
temp1++;
}
else
{
temp1--;
}
if(mp.find(temp1-K)!=mp.end())
{
ans+=mp[temp1-K];
}
j++;
}
j = index;
while(j < N && (j==index || s[j]!=i))
{
if(s[j]==i)
{
temp++;
}
else
{
temp--;
}
mp[temp]++;
j++;
}
if(j==N)
{
break;
}
if(mp == 0)
mp = 1;
index = j;
}
}
return ans;
}

void solve()
{
int n;
cin>>n;
assert(1<=n&&n<=1e5);
int k;
cin>>k;
assert(-1e5<=k&&k<=1e5);
string s;
cin>>s;
for(int i = 0; i < n; ++i)
{
assert('a'<=s[i]&&s[i]<='z');
}
long long int ans = goodCharacters(n, k, s);
cout<<ans<<endl;
}

int main() {
int t;
cin>>t;
assert(1<=t && t<=100);
while(t--)
{
solve();
}
}

Editorialist's code (Python)
import string
for _ in range(int(input())):
n, k = map(int, input().split())
s = input()
ans = 0
for c in string.ascii_lowercase:
mp, pref = {}, 0
to_insert = 
for i in range(n):
if s[i] == c:
pref += 1
for x in to_insert:
if x in mp: mp[x] += 1
else: mp[x] = 1
to_insert.clear()

if 2*pref - i - k in mp: ans += mp[2*pref - i - k]
to_insert.append(2*pref - i)
print(ans)

1 Like

hello @iceknight1093 ,
first of all thanks for these super editorials.

can you please summarize the setter’s solution especially temp and temp1 var
i know he/she followed the same idea but still not upto the mark.

The idea is the exact same, really.

temp1 in the code is the value of 2\cdot\text{pref}_c[j]-j. You can see that when you move from j to j+1, this quantity either increases by 1 or decreases by 1, depending on whether S_{j+1} equals c or not.
So instead of computing prefix sums independently, the setter’s code computes this value directly.
temp is also the exact same quantity.

It’s just a slight difference in implementation: I chose to delay updates by putting them all into a vector and performing everything at the same time when necessary; while the setter instead uses a 2-pointer-like approach.
You can see that the first while loop computes the answers for a bunch of indices, and then the second while loop updates the values for the exact same set of indices into the map.

helpful as always @iceknight1093 why you are storing ‘1’ in the array used for storing update in beginning for each character ?

My code uses 0-indexing, so that corresponds to 2\cdot\text{pref}_c[-1] - (-1) = 0+1 = 1, which is needed to consider prefixes as subarrays.

If you use 1-indexing you’d store 0 instead.