P5149 - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: pols_agyi_pols
Tester: kingmessi
Editorialist: iceknight1093

DIFFICULTY:

TBD

PREREQUISITES:

Dynamic programming

PROBLEM:

For a binary string S, define f(S) as follows:

  • |S|-1 times, delete one character from S.
    Then, add to the score the number of pairs of adjacent characters that differ.

You’re given S, compute the sum of f(S[i\ldots j]) across all 1 \leq i \leq j \leq N, i.e, across all subtstrings of S.

EXPLANATION:

It is recommended to read the editorial of the easy version first.

Recall that the score of a string of length N, with k adjacent different characters initially, is

(N-1-k)\cdot k + \left(0 + 1 + 2 + \ldots + k-1 \right)

The key idea behind solving this version of the problem, is noticing how this value changes when we append a character to S.

Suppose S currently ends with a 0. Then,

  • If we append a 0 to S, N increases by 1 but k doesn’t change.
    So, the score increases by exactly k.
  • If we append a 1 to S, N and k both increase by 1.
    Let k_0 denote the old value of k, and k_1 denote the new one (k_1 = k_0 + 1).
    Similarly, let N_0 and N_1 denote the old and new values of N.
    We see that:
    • N_1 - 1 - k_1 = (N_0 + 1) - 1 - (k_0 + 1) = N_0 - 1 - k_0.
    • So, (N_1 - 1 - k_1)\cdot k_1 = (N_0 - 1 - k_0)\cdot (k_0 + 1) = (N_0-1-k_0)\cdot k_0 + (N_0-1-k_0).
    • (0+1+\ldots + k_1-1) = (0+1+\ldots+k_0-1) + k_0
    • So, the overall change in the sum is (N_0-1-k_0) + k_0 = N_0 - 1.

tl;dr when we append a new character,

  1. If the new character equals the previous last character, the score increases by k.
  2. Otherwise, the score increases by the initial length minus 1.

Let’s define a_i to be the sum of scores of all substrings ending at index i.
We’ll compute a_i from a_{i-1}, using the observation above.

There are two cases.

First, suppose S_i \neq S_{i-1}.
Since the last character is different from the previous one, for any substring, the increase in score is its length minus 1.
We have one substring of every length from 1 to i-1, so the overall increase in score across them all is

0+1+2+\ldots + i-2 = \frac{(i-2)\cdot (i-1)}{2}

So, we obtain a_i = a_{i-1} + \frac{(i-2)\cdot (i-1)}{2}

Next, suppose S_i = S_{i-1}.
The last character is the same as the previous one, so the increase in score is the value of k.
So, we need to know the sum of k-values of all substrings ending at index i-1.
Let’s store this too, say as the value b_i.
We then obtain a_i = a_{i-1} + b_{i-1}.

The array b can be updated by similarly considering two cases:

  1. If S_i \neq S_{i-1}, the k-value of every substring increases by 1.
    So, b_i = b_{i-1} + i-1.
  2. Otherwise, the k-values of every substring don’t change; so b_i = b_{i-1}.

Finally, since a_i represents the sum of scores of all substrings ending at index i, the final answer is the sum of the array a.

TIME COMPLEXITY:

\mathcal{O}(N) per testcase.

CODE:

Author's code (C++)
#include <bits/stdc++.h>
using namespace std;
#define ll long long

ll mod=1000000007;

int main() {
	ll tt=1;
	cin>>tt;
    while(tt--){
        ll n;
        cin>>n;
        string s;
        cin>>s;
        ll ans=0,last=0,diff=0,diff2=0,cnt;
        for(int i=1;i<n;i++){
            if(s[i]==s[i-1]){
                cnt=last+diff;cnt%=mod;
            }else{
                cnt=last+diff2;cnt%=mod;
                diff+=i;diff%=mod;
            }
            diff2+=i;diff2%=mod;
            ans+=cnt;ans%=mod;
            last=cnt;
        }
        cout<<ans<<"\n";
    }
    return 0;
}
Tester's code (C++)
//Har Har Mahadev
#include<bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp> // Common file
#include <ext/pb_ds/tree_policy.hpp>
#define ll long long
#define int long long
#define rep(i,a,b) for(int i=a;i<b;i++)
#define rrep(i,a,b) for(int i=a;i>=b;i--)
#define repin rep(i,0,n)
#define precise(i) cout<<fixed<<setprecision(i)
#define vi vector<int>
#define si set<int>
#define mii map<int,int>
#define take(a,n) for(int j=0;j<n;j++) cin>>a[j];
#define give(a,n) for(int j=0;j<n;j++) cout<<a[j]<<' ';
#define vpii vector<pair<int,int>>
#define db double
#define be(x) x.begin(),x.end()
#define pii pair<int,int>
#define pb push_back
#define pob pop_back
#define ff first
#define ss second
#define lb lower_bound
#define ub upper_bound
#define bpc(x) __builtin_popcountll(x) 
#define btz(x) __builtin_ctz(x)
using namespace std;

using namespace __gnu_pbds;

typedef tree<int, null_type, less<int>, rb_tree_tag,tree_order_statistics_node_update> ordered_set;
typedef tree<pair<int, int>, null_type,less<pair<int, int> >, rb_tree_tag,tree_order_statistics_node_update> ordered_multiset;

const long long INF=1e18;
const long long M=1e9+7;
const long long MM=998244353;
  
int power( int N, int M){
    int power = N, sum = 1;
    if(N == 0) sum = 0;
    while(M > 0){if((M & 1) == 1){sum *= power;}
    power = power * power;M = M >> 1;}
    return sum;
}

struct input_checker {
    string buffer;
    int pos;

    const string all = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    const string number = "0123456789";
    const string upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const string lower = "abcdefghijklmnopqrstuvwxyz";

    input_checker() {
        pos = 0;
        while (true) {
            int c = cin.get();
            if (c == -1) {
                break;
            }
            buffer.push_back((char) c);
        }
    }

    int nextDelimiter() {
        int now = pos;
        while (now < (int) buffer.size() && !isspace(buffer[now])) {
            now++;
        }
        return now;
    }

    string readOne() {
        assert(pos < (int) buffer.size());
        int nxt = nextDelimiter();
        string res;
        while (pos < nxt) {
            res += buffer[pos];
            pos++;
        }
        return res;
    }

    string readString(int minl, int maxl, const string &pattern = "") {
        assert(minl <= maxl);
        string res = readOne();
        assert(minl <= (int) res.size());
        assert((int) res.size() <= maxl);
        for (int i = 0; i < (int) res.size(); i++) {
            assert(pattern.empty() || pattern.find(res[i]) != string::npos);
        }
        return res;
    }

    int readInt(int minv, int maxv) {
        assert(minv <= maxv);
        int res = stoi(readOne());
        assert(minv <= res);
        assert(res <= maxv);
        return res;
    }

    long long readLong(long long minv, long long maxv) {
        assert(minv <= maxv);
        long long res = stoll(readOne());
        assert(minv <= res);
        assert(res <= maxv);
        return res;
    }

    auto readInts(int n, int minv, int maxv) {
        assert(n >= 0);
        vector<int> v(n);
        for (int i = 0; i < n; ++i) {
            v[i] = readInt(minv, maxv);
            if (i+1 < n) readSpace();
        }
        return v;
    }

    auto readLongs(int n, long long minv, long long maxv) {
        assert(n >= 0);
        vector<long long> v(n);
        for (int i = 0; i < n; ++i) {
            v[i] = readLong(minv, maxv);
            if (i+1 < n) readSpace();
        }
        return v;
    }

    void readSpace() {
        assert((int) buffer.size() > pos);
        assert(buffer[pos] == ' ');
        pos++;
    }

    void readEoln() {
        assert((int) buffer.size() > pos);
        assert(buffer[pos] == '\n');
        pos++;
    }

    void readEof() {
        assert((int) buffer.size() == pos);
    }
}inp;

int smn = 0;
 
void solve()
{
    int n;
    // cin >> n;
    n = inp.readInt(1,200'000);
    inp.readEoln();
    string s;
    // cin >> s;
    s = inp.readString(n, n, "01");
    inp.readEoln();
    int ans = 0;
    vi v;
    rep(i,1,n){
    	if(s[i] != s[i-1])v.pb(1);
    	else v.pb(0);
    }
    n--;
    vi pf = v,sf = v;
    rep(i,1,n)pf[i] += pf[i-1],pf[i] %= M;vi pf1 = pf;
    rep(i,1,n)pf1[i] += pf1[i-1],pf1[i] %= M;
    rrep(i,n-2,0)sf[i] += sf[i+1],sf[i] %= M;vi sf1 = sf;
    rrep(i,n-2,0)sf1[i] += sf1[i+1],sf1[i] %= M;
    int sm = 0;
    repin{
    	if(v[i] == 0){
    		int res = (pf1.back() - pf1[i] - (((n-1-i)*pf[i])%M))%M;
    		res *= (i+1);res %= M;
    		int res1 = (sf1[0] - sf1[i] - ((i*sf[i])%M))%M;
    		res1 *= (n-i);res1 %= M;
    		if(i == n-1)res = 0;
    		if(i == 0)res1 = 0;
    		ans += res;ans %= M;
    		ans += res1;ans %= M;
    	}
    	else{
    		int res = (sm*(n-i))%M;
    		ans += res;
    		ans %= M;
    		sm += i+1;
    		sm %= M;
    	}
    }
    ans += M;ans %= M;
    cout << ans << "\n";

}

signed main(){
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    cout.tie(0);
    #ifdef NCR
        init();
    #endif
    #ifdef SIEVE
        sieve();
    #endif
    int t;
    // cin >> t;
    t = inp.readInt(1,100'000);
    inp.readEoln();
    while(t--)
        solve();
    inp.readEof();
    assert(smn <= 200'000);
    return 0;
}
Editorialist's code (Python)
mod = 10**9 + 7
for _ in range(int(input())):
    n = int(input())
    s = input()
    ans = x = sm = 0
    for i in range(1, n):
        if s[i] == s[i-1]:
            x = (x + sm) % mod
        else:
            x = (x + i*(i-1)//2) % mod
            sm = (sm + i) % mod
        ans = (ans + x) % mod
    print(ans)

1 Like

Hi , I know this is my mistake but i have a small suggestion

Please for problem like easy version hard version
Keep the testcase either same or completely different

Scene Happen with me:

I solved the easy first and move to harder and coded that one also , i look the testcase and pretty same if anyone at my place it look exact same
so I run on it, but eventually, according to the easy version test case : the last output is 14, and the actual output of test case of the hard version is 15 .

i thought my code is buggy and not able to find that i am on easy version testcase :rage: :rage:
Also why easy version , hard version test case output is also so similar

Its a request for next time either full same or full different

Another way to solve it is to write k (N-1-k)+ \left(0 + 1 + 2 + \ldots + k-1 \right) = k(N-1-k) + \frac{k(k - 1)}{2} = k (N - 3/2 - k/2). And use it to find the contribution of each border (an index i such that S_{i-1} \ne S_i). You’ll need to maintain the sum of lengths of all substrings containing said border, and also the sum of k, which is easy to update similarly. A similar technique works for all polynomials of small degrees. Submission: 1084306946.

2 Likes

Here’s an alternate explanation :

Club all the identical characters into one group. Call the initial number of elements as n and the initial number of groups as g. For example, T = 000 11 0 1 00 has n = 9 and g = 5 groups.

If T* = T + 0. Then, f(T*) = f(T) + g - 1

If T* = T + 1, Then, f(T*) = f(T) + n - 1.

Let’s solve the easy version using this formula.

Code

For the hard version, let’s define 3 DP.

fsum[i] is the sum of all f for all subtrings ending at i.
lsum[i] is the sum of length of all substrings ending at i.
gsum[i] is the sum of g of all substrings ending at i.

How to compute lsum[i]? Note that a substring ending at i must have been created by extending the substring ending at i - 1. However, the length of all substrings ending at i - 1 will have to be increased by 1. Plus, you can start a new substsring of length 1 right here.

lsum[i] = 1 + lsum[i - 1] + i

How to compute gsum[i]? If the number of groups does not change after appending a new character, you can simply take contribution from gsum[i - 1] or start a new substring of length 1 right here.

gsum[i] = 1 + gsum[i - 1]

But if it does change, then all the g values ending at i - 1 needs to be increased by 1. How many are there? It’s equal to the number of old substrings.

gsum[i] = 1 + gsum[i - 1] + i

How to compute fsum[i]. We already have the formula for the transition of one single f earlier. Therefore, If a new group has been introduced.

fsum[i] = fsum[i - 1] + lsum[i - 1] - i

And if no new group has been created.

fsum[i] = fsum[i - 1] + gsum[i - 1] - i

Code

1 Like

Can you explain in detail?