MINOPS - Editorial

PROBLEM LINK:

Contest

Author: Ayush Ranjan
Tester: Raja Vardhan Reddy
Editorialist: Rajarshi Basu

DIFFICULTY:

Easy

PREREQUISITES:

Sorting, Implementation


PROBLEM:

We are given two strings S and R of length N each (1 \leq N \leq 10^6). Our goal is to make them equal by doing 0 or more operations of the following type:

  • Choose two integers a and b such that 1 \le a \le b \le N.
  • For each i such that a \le i \le b, replace the i-th character of S by the i-th character of R.

Suppose that you make S equal to R by performing this operation k times and the sum of b-a+1 is l. Then, the cost of this process is defined as k \cdot l.
Find the minimum cost to make S equal to R.


QUICK EXPLANATION:

Take minLen = number of unequal positions. Take k= number of separate “islands” of unequal items. Find the gaps between the indices where S[i] = R[i] and sort them in ascending order. Iterate over them, keep adding them to minLen and subtracting 1 from k, and update ans = min(ans,minLen*k).



DETAILED EXPLANATION:

I will elaborate the solution in terms of step-by-step observations.

PS: To clarify some terms, when I say “islands” below, you can think of them as intervals/ranges/substrings.

Observation 1

N can go up to 10^6, hence we are looking for a O(n) or O(nlogn) solution probably.

Observation 2

What is the minimum length of l needed? Well, at least the number of j s.t S[j] \neq R[j]. What is k in this case? The number of “islands” of unequal elements.
What do I mean? Consider the following:
abbabab
aaaaaaa
Here, the “islands” of unequal places are [2,3], [5,5], [7,7]. In other words, the number of islands is basically the minimum number of substrings needed. Note that when we do an operation as specified in the problem, we are effectively selecting a substring and converting all of its characters.

Observation 3

When we consider the minimum possible l, we are automatically considering the largest value of k. Again, if k were to be 1 (i:e, its minimum value), then l would have had to be the largest value. More precisely, then l = max_j - min_j +1 where max_j and min_j are the maximum and minimum values of j for which S[j] \neq R[j].
Hence, as we increase k, l decreases, and vice versa.

Observation 4

Let’s say initially k is maximum possible, and l is minimum possible. Now, let’s say we want to decrease k by one. That implies, we can no longer consider just the unequal positions, and will have to include some positions j where S[j] = R[j]. The best way to reduce the number of “islands” or separate substrings, is to effectively “bridge the gap” between two substrings. Here, “bridging the gap” essentially means considering the characters between the two islands as well, and then considering the whole substring as a single “island”. Another thing to note is that there are only k-1 gaps - it onty makes sense to “bridge the gap” between consecutive “islands” (or intervals).

For example, Let’s say our “islands” (or intervals) were currently [2,3],[5,5],[8,8]. Obviously, k = number of “islands” = 3. We want to reduce it to 2. Now, we can either “bridge the gap” between [2,3] and [5,5], in which case we will have to consider an extra of 1 cell, ie, \{4\}, or try to “bridge the gap” between [5,5] and [8,8], effectively considering 2 extra cells: \{6,7\}. Obviously, the first case is better since l only increases by 1. Then, our set of “islands” become [2,5], [8,8].

Observation 5

From the above observation, it seems obvious that in every step, we should bridge the smallest gap that is available between the “islands” (or intervals).

Observation 6

If we compute all possible values of the “gaps”, then we are done! Once we select a “gap” and “bridge it”, neither do any more “gaps” get created, nor do we create any new ones. Further, every time we include the smallest gap (let us say of size X) to join two consecutive “islands” (or interval) to form a bigger one, l increases by X exactly.

Observation 7

If we were to decrease k by y, we are effectively bridging y of the smallest gaps. In other words, we are adding to l the values of the smallest y gaps.

Observation 8

Do k-1 iterations. In each iteration, decrease k by one, and add the smallest remaining gap to l, and update ans = min(ans,l*k)

Implementation 1

We can figure out all the gaps in a single pass through the string. For the full process, see Editorialist’s code, its simple!


Time Complexity

We need to find the gaps by a linear pass through the string in O(n) time, and then sort the values, which will take another O(nlogn) time. Seems like Observation 1 was correct after all. :smile:

QUICK REMINDERS:

Don’t keep debug print statements while submitting :stuck_out_tongue:


SOLUTIONS:

Setter's Solution
#include<bits/stdc++.h>
using namespace std;
 
#define ll long long
 
int main(){
	ios_base::sync_with_stdio(false);cin.tie(0);cout.tie(0);
	int t;
	cin >> t;
	while(t--){
		string s, t;
		cin >> s >> t;
		int n = s.size();
		vector<int> lengths;
		int curlen = 0, fl = 0, matlen = 0;
		ll ans = 0;
		for(int i = 0; i < n; i++){
			if(s[i] != t[i]){
				if(curlen > 0 && fl > 0)
					lengths.push_back(curlen), ans += curlen;
				curlen = 0;
				fl = 1;
				ans++;
				continue;
			}
			++curlen;
		}
		n = ans;
		sort(lengths.begin(), lengths.end(), greater<int>());
		for(ll i = 0; i < lengths.size(); i++){
			n -= lengths[i];
			ans = min(ans, (i + 2) * n);
		}
		cout << ans << "\n";
	}
} 
Tester's Solution
//raja1999
 
//#pragma comment(linker, "/stack:200000000")
//#pragma GCC optimize("Ofast")
//#pragma GCC target("sse,sse2,sse3,ssse3,sse4,avx,avx2")
 
#include <bits/stdc++.h>
#include <vector>
#include <set>
#include <map>
#include <string> 
#include <cstdio>
#include <cstdlib>
#include <climits>
#include <utility>
#include <algorithm>
#include <cmath>
#include <queue>
#include <stack>
#include <iomanip> 
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp> 
//setbase - cout << setbase (16)a; cout << 100 << endl; Prints 64
//setfill -   cout << setfill ('x') << setw (5); cout << 77 <<endl;prints xxx77
//setprecision - cout << setprecision (14) << f << endl; Prints x.xxxx
//cout.precision(x)  cout<<fixed<<val;  // prints x digits after decimal in val
 
using namespace std;
using namespace __gnu_pbds;
#define f(i,a,b) for(i=a;i<b;i++)
#define rep(i,n) f(i,0,n)
#define fd(i,a,b) for(i=a;i>=b;i--)
#define pb push_back
#define mp make_pair
#define vi vector< int >
#define vl vector< ll >
#define ss second
#define ff first
#define ll long long
#define pii pair< int,int >
#define pll pair< ll,ll >
#define sz(a) a.size()
#define inf (1000*1000*1000+5)
#define all(a) a.begin(),a.end()
#define tri pair<int,pii>
#define vii vector<pii>
#define vll vector<pll>
#define viii vector<tri>
#define mod (1000*1000*1000+7)
#define pqueue priority_queue< int >
#define pdqueue priority_queue< int,vi ,greater< int > >
#define int ll
 
typedef tree<
int,
null_type,
less<int>,
rb_tree_tag,
tree_order_statistics_node_update>
ordered_set;
 
 
//std::ios::sync_with_stdio(false);
 
main(){
	std::ios::sync_with_stdio(false); cin.tie(NULL);
	int t;
	cin>>t;
	while(t--){
		int id=-1,i,c_eq,ans,k=0,l=0;
		vector<int> vec;
		string a,b;
		cin>>a>>b;
		assert(a.length()==b.length());
		rep(i,a.length()){
			if(a[i]==b[i]){
				continue;
			}
			else{
				id=i;
				break;
			}
		}
		if(id==-1){
			cout<<0<<endl;
		}
		else{
			c_eq=inf;
			f(i,id,a.length()){
				if(a[i]!=b[i]){
					if(c_eq!=0){
						vec.push_back(c_eq);
						c_eq=0;
						k++;
					}
					l++;
				}
				else{
					c_eq++;
				}
			}
			sort(all(vec));
			ans=k*l;
			//cout<<ans<<endl;
			for(i=0;i+1<vec.size();i++){
				k--;
				l+=vec[i];
				ans=min(ans,k*l);
			}
			cout<<ans<<endl;
		}
	}	
	return 0;
} 
	 
Editorialist's Solution
#include <iostream>
#include <vector>
#include <set>
#include <iomanip>
#include <algorithm>
#include <functional>
#include <stdio.h>
#include <cmath>
#include <queue>
#include <string>
#include <map>
#include <complex>
#include <stack>
#include <set>
 
#define FOR(i,n) for(int i=0;i<n;i++)
#define FORE(i,a,b) for(int i=a;i<=b;i++)
#define ll long long int
#define vi vector<int>
#define ii pair<int,int>
#define pb push_back
#define mp make_pair
#define ff first
#define ss second
#define pll pair<ll,ll>
#define cd complex<double> 
#define vv vector
 
using namespace std;
 
const int INF = 1e9;
const int MAXN = 1e3+5;
 
int main(){
	ios_base::sync_with_stdio(0);
	cin.tie(0);
	int t;
	cin >> t;
	while(t--){
		string a;
		string b;
		cin >> a >> b;
		int n = a.size();
 
		int fst = 0;
		vi all;
		ll mn = 0;
		int k = 0;
		bool badstuffGot = 0;
		FOR(i,n){
			if(a[i] == b[i]){
				if(badstuffGot)fst++;
			}else{
				if(!badstuffGot)k++;
				badstuffGot = 1;
				mn++;
				
				if(fst > 0){
					k++;
					all.pb(fst);
					fst = 0;
				}
			}
		}
		sort(all.begin(), all.end());
		//cout << all.size() << " " << k << endl;
		ll ans = k*mn;
		for(auto e : all){
			mn += e;
			k--;
			ans = min(ans,mn*k);
		}
		cout << ans << endl;
	}
 
	return 0;
} 

Please give me suggestions if anything is unclear so that I can improve. Thanks :slight_smile:

77 Likes

I applied the same logic and the code is working fine in gfg ide. But i got runtime error on codechef ide.

this is my code

1 Like

I have done a similar thing but still it gives WA. Please help!!

Make sure you don’t take the 0s from index 0 to first 1 and last 1 to index n-1

k= number of separate “islands” of unequal items. for S=“abc” and R=“def”, islands are { [1,1], [2,2], [3,3] } or [1,3]?

If the strings are S= “dabexbdf” and R=“dcdeybdy” (1-based indexing), this is same as your Observation-4 example, I didn’t get your “bridging gap” concept, here, islands are - [2,3],[5,5],[8,8]. What is the extra 1-cell you are describing? What role is {4} playing between [2,3] and [5,5] and similarly for {6,7}?

Codechef IDE Checks for multiple test cases on the background for the testing correctness of your code. Maybe for some test cases, your code might work but for more good test cases and corner cases, it might fail.
Hope this will answer your query.

line 23.
if vi.size()==0
consider s=> aaa
r=> aaa

1 Like

islands are maximal so it will be [1,3].

the numbers like {4} or {6,7} are basically the bridging elements, ie, those for which S[i] = R[i] but we still include them

I wanted to suggest is that Many times I have seen coders are commenting in to clarify some fuzzy statements posted by some 6* Coders , they simply ignoring and nor providing any details…What kind of mis-management is this???
You should be clarifying , since you have posted some confusing statement in the problem description…Please have a look on this issue!!

2 Likes

Can you provide some edge cases as my code is working fine for all the cases I can think of but still I am getting WA. Code

I have used greedy like solution what is wrong
https://www.codechef.com/viewsolution/32084914

also i tried many cases. Point out one particular case where it will fail.

Alternative approach - Set k = 1 and length as maximum. Keep on taking out the biggest gaps and incrementing k.

1 Like

If constraints would have been for n^2, can we think of a solution with multiple maximum sum subarray?

Nice editorial.

Beautiful Editorial.

1 Like

really nice editorial .

1 Like

What am I doing wrong?

#include <bits/stdc++.h>

using namespace std;

#define ll long long

int func()
{
string s1, s2;
cin >> s1 >> s2;
ll l = 0, k = 0, x = 0, n = s1.length();
ll i;
vector v;
for (i = 0; i < n; i++)
{

    if (s1[i] == s2[i])
        ++x;
    else if (x != 0)
    {   //cout<<" i"<<x<<"i ";
        v.push_back(x);
        x = 0;
    }

    if (s1[i] != s2[i])
        ++l;

    if ((s1[i] != s2[i]) && (s1[i - 1] == s2[i - 1]))
        ++k;
}
//cout<<l<<" "<<k<<"    ";

//cout<<v.size();
if (s1[0] == s2[0] && v.size()!= 0)
    v.erase(v.begin());

sort(v.begin(), v.end());

//cout << l * k << "   ";
//x=0;ll a=0,b=-1;
ll ans =l*k;
for (i = 0; i < v.size(); i++)
{   
    //cout<<(l+v[i])*(--k)<<"*  ";
    ans=min( ans , (l+v[i])*(--k) );
}

//cout<<a<<" "<<b<<"    ";

cout<< ans;

return 0;

}

int main()
{
ios_base::sync_with_stdio(false);
cin.tie(NULL);
ll t, i;
cin >> t;
for (i = 0; i < t; i++)
{
//cout<<“Case #”<<i+1<<": ";
func();
cout << “\n”;
}
}

leave it, I did a silly mistake, found it. :sweat_smile:

For example?

My code is giving wrong answer. Anybody debug this please…Thank you

#include<bits/stdc++.h>
typedef long long ll;
#define MOD 1000000007
#define pb push_back
using namespace std;


int main(){
	ios_base::sync_with_stdio(false);
	cin.tie(NULL);
	cout.tie(NULL);
	ll t;   cin>>t;
	while(t--){
	    string s,r; cin>>s>>r;
	    ll n=s.size();
	    ll minlen=0,k=0;
	    if(s[0]!=r[0]){
	        minlen++;
	        k++;
	    }
	    for(ll i=1;i<n;i++){
	        if(s[i]!=r[i]){
	            minlen++;
	            if(s[i-1]==r[i-1])
	                k++;
	        }
	    }
	    ll ans=minlen*k;
	    vector<ll>gap;
	    ll temp=0;
	    if(s[0]==r[0])
	        temp++;
	    for(ll i=1;i<n;i++){
	        if(s[i]==r[i])
	            temp++;
	       else if(temp>0){
	           gap.pb(temp);
	           temp=0;
	       }
	    }
	    if(temp>0)
	        gap.pb(temp);
	    sort(gap.begin(),gap.end());
	    for(auto it:gap){
	        minlen+=it;
	        k--;
	        ans=min(minlen*k,ans);
	    }
	    cout<<ans<<"\n";
	}
	return 0;
}```