PROBLEM LINK:
Author: Ayush Ranjan
Tester: Raja Vardhan Reddy
Editorialist: Rajarshi Basu
DIFFICULTY:
Easy
PREREQUISITES:
Sorting, Implementation
PROBLEM:
We are given two strings S and R of length N each (1 \leq N \leq 10^6). Our goal is to make them equal by doing 0 or more operations of the following type:
- Choose two integers a and b such that 1 \le a \le b \le N.
- For each i such that a \le i \le b, replace the i-th character of S by the i-th character of R.
Suppose that you make S equal to R by performing this operation k times and the sum of b-a+1 is l. Then, the cost of this process is defined as k \cdot l.
Find the minimum cost to make S equal to R.
QUICK EXPLANATION:
Take minLen = number of unequal positions. Take k= number of separate “islands” of unequal items. Find the gaps between the indices where S[i] = R[i] and sort them in ascending order. Iterate over them, keep adding them to minLen and subtracting 1 from k, and update ans = min(ans,minLen*k).
DETAILED EXPLANATION:
I will elaborate the solution in terms of step-by-step observations.
PS: To clarify some terms, when I say “islands” below, you can think of them as intervals/ranges/substrings.
Observation 1
N can go up to 10^6, hence we are looking for a O(n) or O(nlogn) solution probably.
Observation 2
What is the minimum length of l needed? Well, at least the number of j s.t S[j] \neq R[j]. What is k in this case? The number of “islands” of unequal elements.
What do I mean? Consider the following:
abbabab
aaaaaaa
Here, the “islands” of unequal places are [2,3], [5,5], [7,7]. In other words, the number of islands is basically the minimum number of substrings needed. Note that when we do an operation as specified in the problem, we are effectively selecting a substring and converting all of its characters.
Observation 3
When we consider the minimum possible l, we are automatically considering the largest value of k. Again, if k were to be 1 (i:e, its minimum value), then l would have had to be the largest value. More precisely, then l = max_j - min_j +1 where max_j and min_j are the maximum and minimum values of j for which S[j] \neq R[j].
Hence, as we increase k, l decreases, and vice versa.
Observation 4
Let’s say initially k is maximum possible, and l is minimum possible. Now, let’s say we want to decrease k by one. That implies, we can no longer consider just the unequal positions, and will have to include some positions j where S[j] = R[j]. The best way to reduce the number of “islands” or separate substrings, is to effectively “bridge the gap” between two substrings. Here, “bridging the gap” essentially means considering the characters between the two islands as well, and then considering the whole substring as a single “island”. Another thing to note is that there are only k-1 gaps - it onty makes sense to “bridge the gap” between consecutive “islands” (or intervals).
For example, Let’s say our “islands” (or intervals) were currently [2,3],[5,5],[8,8]. Obviously, k = number of “islands” = 3. We want to reduce it to 2. Now, we can either “bridge the gap” between [2,3] and [5,5], in which case we will have to consider an extra of 1 cell, ie, \{4\}, or try to “bridge the gap” between [5,5] and [8,8], effectively considering 2 extra cells: \{6,7\}. Obviously, the first case is better since l only increases by 1. Then, our set of “islands” become [2,5], [8,8].
Observation 5
From the above observation, it seems obvious that in every step, we should bridge the smallest gap that is available between the “islands” (or intervals).
Observation 6
If we compute all possible values of the “gaps”, then we are done! Once we select a “gap” and “bridge it”, neither do any more “gaps” get created, nor do we create any new ones. Further, every time we include the smallest gap (let us say of size X) to join two consecutive “islands” (or interval) to form a bigger one, l increases by X exactly.
Observation 7
If we were to decrease k by y, we are effectively bridging y of the smallest gaps. In other words, we are adding to l the values of the smallest y gaps.
Observation 8
Do k-1 iterations. In each iteration, decrease k by one, and add the smallest remaining gap to l, and update ans = min(ans,l*k)
Implementation 1
We can figure out all the gaps in a single pass through the string. For the full process, see Editorialist’s code, its simple!
Time Complexity
We need to find the gaps by a linear pass through the string in O(n) time, and then sort the values, which will take another O(nlogn) time. Seems like Observation 1 was correct after all.
QUICK REMINDERS:
Don’t keep debug print statements while submitting
SOLUTIONS:
Setter's Solution
#include<bits/stdc++.h>
using namespace std;
#define ll long long
int main(){
ios_base::sync_with_stdio(false);cin.tie(0);cout.tie(0);
int t;
cin >> t;
while(t--){
string s, t;
cin >> s >> t;
int n = s.size();
vector<int> lengths;
int curlen = 0, fl = 0, matlen = 0;
ll ans = 0;
for(int i = 0; i < n; i++){
if(s[i] != t[i]){
if(curlen > 0 && fl > 0)
lengths.push_back(curlen), ans += curlen;
curlen = 0;
fl = 1;
ans++;
continue;
}
++curlen;
}
n = ans;
sort(lengths.begin(), lengths.end(), greater<int>());
for(ll i = 0; i < lengths.size(); i++){
n -= lengths[i];
ans = min(ans, (i + 2) * n);
}
cout << ans << "\n";
}
}
Tester's Solution
//raja1999
//#pragma comment(linker, "/stack:200000000")
//#pragma GCC optimize("Ofast")
//#pragma GCC target("sse,sse2,sse3,ssse3,sse4,avx,avx2")
#include <bits/stdc++.h>
#include <vector>
#include <set>
#include <map>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <climits>
#include <utility>
#include <algorithm>
#include <cmath>
#include <queue>
#include <stack>
#include <iomanip>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
//setbase - cout << setbase (16)a; cout << 100 << endl; Prints 64
//setfill - cout << setfill ('x') << setw (5); cout << 77 <<endl;prints xxx77
//setprecision - cout << setprecision (14) << f << endl; Prints x.xxxx
//cout.precision(x) cout<<fixed<<val; // prints x digits after decimal in val
using namespace std;
using namespace __gnu_pbds;
#define f(i,a,b) for(i=a;i<b;i++)
#define rep(i,n) f(i,0,n)
#define fd(i,a,b) for(i=a;i>=b;i--)
#define pb push_back
#define mp make_pair
#define vi vector< int >
#define vl vector< ll >
#define ss second
#define ff first
#define ll long long
#define pii pair< int,int >
#define pll pair< ll,ll >
#define sz(a) a.size()
#define inf (1000*1000*1000+5)
#define all(a) a.begin(),a.end()
#define tri pair<int,pii>
#define vii vector<pii>
#define vll vector<pll>
#define viii vector<tri>
#define mod (1000*1000*1000+7)
#define pqueue priority_queue< int >
#define pdqueue priority_queue< int,vi ,greater< int > >
#define int ll
typedef tree<
int,
null_type,
less<int>,
rb_tree_tag,
tree_order_statistics_node_update>
ordered_set;
//std::ios::sync_with_stdio(false);
main(){
std::ios::sync_with_stdio(false); cin.tie(NULL);
int t;
cin>>t;
while(t--){
int id=-1,i,c_eq,ans,k=0,l=0;
vector<int> vec;
string a,b;
cin>>a>>b;
assert(a.length()==b.length());
rep(i,a.length()){
if(a[i]==b[i]){
continue;
}
else{
id=i;
break;
}
}
if(id==-1){
cout<<0<<endl;
}
else{
c_eq=inf;
f(i,id,a.length()){
if(a[i]!=b[i]){
if(c_eq!=0){
vec.push_back(c_eq);
c_eq=0;
k++;
}
l++;
}
else{
c_eq++;
}
}
sort(all(vec));
ans=k*l;
//cout<<ans<<endl;
for(i=0;i+1<vec.size();i++){
k--;
l+=vec[i];
ans=min(ans,k*l);
}
cout<<ans<<endl;
}
}
return 0;
}
Editorialist's Solution
#include <iostream>
#include <vector>
#include <set>
#include <iomanip>
#include <algorithm>
#include <functional>
#include <stdio.h>
#include <cmath>
#include <queue>
#include <string>
#include <map>
#include <complex>
#include <stack>
#include <set>
#define FOR(i,n) for(int i=0;i<n;i++)
#define FORE(i,a,b) for(int i=a;i<=b;i++)
#define ll long long int
#define vi vector<int>
#define ii pair<int,int>
#define pb push_back
#define mp make_pair
#define ff first
#define ss second
#define pll pair<ll,ll>
#define cd complex<double>
#define vv vector
using namespace std;
const int INF = 1e9;
const int MAXN = 1e3+5;
int main(){
ios_base::sync_with_stdio(0);
cin.tie(0);
int t;
cin >> t;
while(t--){
string a;
string b;
cin >> a >> b;
int n = a.size();
int fst = 0;
vi all;
ll mn = 0;
int k = 0;
bool badstuffGot = 0;
FOR(i,n){
if(a[i] == b[i]){
if(badstuffGot)fst++;
}else{
if(!badstuffGot)k++;
badstuffGot = 1;
mn++;
if(fst > 0){
k++;
all.pb(fst);
fst = 0;
}
}
}
sort(all.begin(), all.end());
//cout << all.size() << " " << k << endl;
ll ans = k*mn;
for(auto e : all){
mn += e;
k--;
ans = min(ans,mn*k);
}
cout << ans << endl;
}
return 0;
}
Please give me suggestions if anything is unclear so that I can improve. Thanks