P4_172 - Editorial

iceknight1093 · February 5, 2025, 4:30pm

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: maximus11
Tester: kingmessi
Editorialist: iceknight1093

DIFFICULTY:

Simple

PREREQUISITES:

None

PROBLEM:

You have two strings A and B. In one move you can delete A_i from A, for a cost of i.
Find the minimum cost to convert A into B.

EXPLANATION:

Our only operation is to delete characters from A, so in the end what we’ll be left with is a subsequence of A.
This means if B is not a subsequence of A, it’s impossible to obtain B; and otherwise it always is.

Checking whether B is a subsequence of A is a well-known problem, and can be done greedily in linear time: iterate through A left-to-right, and try to match the current character of A with the leftmost unmatched character in B.
If all characters of B have been matched, it appears as a subsequence; otherwise it does not.

Now, when B does appear as a subsequence of A, we need to find the minimum cost possible.
That involves two decisions:

We need to decide which subsequence of A will become B, i.e. which characters to delete and which to keep.
Once we’ve decided which characters to delete, we need to decide on the order in which they are deleted.

Let’s tackle the second issue first.
Suppose we want to delete the characters at indices i_1, i_2, \ldots, i_k, where i_1 \lt i_2 \lt \cdots \lt i_k.
Then, it’s not hard to see that for minimal cost, it’s best to delete them in ascending order.
This is because deleting i_1 first reduces the cost of deleting each other index by 1; whereas deleting anything else will not affect the cost of i_1 at all - so it’s optimal to delete i_1 first. Keep applying this argument to the rest of the sequence and we see that deleting in order i_1, i_2, \ldots, i_k is best.

Now, we need to figure out which subsequence of A that equals B, to keep.
Intuitively, deleting early characters of A is cheaper, so it makes sense to keep a ‘later’ occurrence of B in A.
This is in fact true: it’s optimal to keep the “last” occurrence of B in A, and delete everything else.

What does this mean?

Recall the algorithm to check whether B is a subsequence of A.
This algorithm finds the “first” occurrence of B in A, as in, each of the indices it chooses is minimal: there’s no way to move any of them further left and still find an occurrence of B as a subsequence.
This follows naturally from that fact that the choice is made greedily: the instant it’s possible to match a character, we do so.

So, note that simply running this algorithm in reverse will give us the “last” occurrence of B in A.
That is, iterate over A from right to left, and try to match the current character of A with the last unmatched character of B.
This will give us the optimal subsequence to keep.

Once the subsequence itself has been computed, as per the previous step, finding the actual cost is not that hard.
Let j_1, j_2, \ldots, j_m be the indices of the subsequence (so that all the other elements must be deleted).
Then, since we’re deleting from left to right,

Everything before j_1 has a deletion cost of 1.
Everything between j_1 and j_2 has a deletion cost of 2.
Everything between j_2 and j_3 has a deletion cost of 3.
\vdots
Everything after j_m has a deletion cost of m+1.

So, once the indices j_1, j_2, \ldots, j_m are known this final cost can be computed easily enough.

TIME COMPLEXITY:

\mathcal{O}(N) per testcase.

CODE:

Author's code (C++)

// __builtin_clz(x): the number of zeros at the beginning of the number 
// __builtin_ctz(x): the number of zeros at the end of the number
// __builtin_popcountll(x): the number of ones in the number
// __builtin_parity(x): the parity (even or odd) of the number of ones
// __builtin_ffs(int) finds the index of the first (most right) set bit
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace std;
using namespace __gnu_pbds;
#define ordered_set tree<int, null_type,less<int>, rb_tree_tag,tree_order_statistics_node_update>
#define ll long long
#define nl "\n"
#define pb push_back
#define eb emplace_back
#define Sort(a) sort(a.begin(),a.end())
#define vi vector<int>
#define vll vector<long long>
#define F first
#define S second
#define yes cout<<"YES\n";
#define no cout<<"NO\n";
int main()
{
ios_base::sync_with_stdio(false);
cin.tie(NULL);
    // freopen("input06.txt","r",stdin);
    // freopen("output06.txt","w",stdout);
int T;
cin>>T;
while(T--)
{    
    string a,b; cin>>a>>b;
    if(b.size() > a.size()){cout<<-1<<endl; continue;}
    int ind = b.size()-1;
    set<int>s;
    for(int i = a.size()-1;i>=0;i--)
    {
        if(a[i] == b[ind])
        {
            ind--;
            s.insert(i);
            if(ind == -1) break;
        }
    }
    if(ind != -1) {cout<<-1<<endl; continue;}
    ll ans = 0;
    ll ct = 1;
    for(int i = 0;i<a.size();i++)
    {
        if(s.find(i) == s.end())
        {
            ans += ct;
        }
        else
        {
            ct++;
        }
    }
    cout<<ans<<endl;
}
return 0;
}

Editorialist's code (PyPy3)

for _ in range(int(input())):
    a = input()
    b = input()
    
    n, m = len(a), len(b)
    ans, p = 0, m-1
    for i in reversed(range(n)):
        if p >= 0 and a[i] == b[p]:
            p -= 1
        else: ans += p + 2
    if p >= 0: print(-1)
    else: print(ans)

harsh_pranjal · February 5, 2025, 5:28pm

Need help !!
Getting runtime error in testcase 5. Any suggestions why?

#include <bits/stdc++.h>
using namespace std;

#define em emplace_back
#define ll long long int
#define vi vector<int>
#define all(a) a.begin(), a.end()
#define read(a)       \
    for (auto &i : a) \
    cin >> i
#define write(a)      \
    for (auto &i : a) \
    cout << i

string LCS(const string &str1, const string &str2)
{
    int m = str1.size();
    int n = str2.size();
    vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));

    for (int i = 1; i <= m; i++)
    {
        for (int j = 1; j <= n; j++)
        {
            if (str1[i - 1] == str2[j - 1])
                dp[i][j] = 1 + dp[i - 1][j - 1];
            else
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
        }
    }

    int i = m, j = n;
    string lcs = "";

    while (i > 0 && j > 0)
    {
        if (str1[i - 1] == str2[j - 1])
        {
            lcs += str1[i - 1];
            i--;
            j--;
        }
        else 
        if (dp[i - 1][j] > dp[i][j - 1])
        {
            i--;
        }
        else
        {
            j--;
        }
    }

    reverse(all(lcs));
    return lcs;
}

int solve(string &a, string &b)
{
    // check if possible to convert str1 to str2
    string temp = LCS(a, b);
    if (temp != b)
        return -1;

    // store the index which needed to be deleted
    vi del_arr;
    int n = b.size(), m = a.size();
    int l = n - 1, r = m - 1;

    while (l >= 0 && r >= 0)
    {
        if (a[r] != b[l])
        {
            del_arr.em(r);
            r--;
        }
        else
        {
            l--;
            r--;
        }
    }

    while (r >= 0)
        del_arr.em(r--);

    reverse(all(del_arr));
    int cost = 0, x = 0;

 // calculate cost
    for (auto i : del_arr)
    {
        cost += (i + 1 - x);
        x++;
    }

    return cost;
}

int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(0);

    int t;
    cin >> t;

    while (t--)
    {
        string a, b;
        cin >> a >> b;

        int ans = solve(a, b);
        cout << ans << endl;
    }

    return 0;
}

ksv_1288 · February 5, 2025, 5:56pm

change int to long datatype this will work fine.

urbanenesshy · February 5, 2025, 10:00pm

We don’t even need to store the indices, we can just mark those matched characters and then delete the earlier ones.

void solve() {
    // Solution for each test case goes in here
    string src, dest;
    cin >> src >> dest;

    int j = dest.size() - 1;

    // mark #
    for (int i = src.size() - 1; j >= 0 and i >= 0; i--) {
        if (dest[j] == src[i]) {
            src[i] = '#';
            j--;
        }
    }

    if (j != -1) {
        out(-1);
    }
    else {
        long long weight = 1, res = 0;
        for (char &c : src) {
            if (c != '#') res += weight;
            else weight++;
        }
        cout << res;
    }
}

beast_is_back · February 6, 2025, 4:20pm

nice one!