STRMRG - Editorial

melfice · January 14, 2018, 3:25am

PROBLEM LINK:

Practice
Contest

Author: Hasan Jaddouh
Tester: Alexey Zayakin
Editorialist: Oleksandr Kulkov

DIFFICULTY:

EASY

PREREQUISITES:

Dynamic programming

PROBLEM:

You’re given two strings A and B. You have to merge them into a string C (interleaving them while preserving the order of characters within each string) in such a way that the number of valid indices i with C_i \neq C_{i+1} is minimized.

QUICK EXPLANATION

Use dynamic programming with 2 \cdot n \cdot m states, indexed by the lengths of the merged prefixes and by the string from which the last letter was taken.

EXPLANATION:

This problem is straightforward if you use dynamic programming of the following form:

DP[pos_1][pos_2][lastchar] = \text{answer if you merged the prefix of } A \text{ of length } pos_1 \text{ and the prefix of } B \text{ of length } pos_2 \text{, and the last character was taken from string } lastchar

You can implement it in the following manner:

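#include <algorithm>
#include <array>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main() {
    int sz[2];                       // sz[0] = |A|, sz[1] = |B|
    cin >> sz[0] >> sz[1];
    string a[2];                     // a[0] = A, a[1] = B
    cin >> a[0] >> a[1];
    const int INF = 0x3F3F3F3F;      // "infinity" for states not reached yet
    // dp[i][j][k]: best answer after merging the first i characters of A and
    // the first j characters of B, with the last character taken from a[k].
    vector<vector<array<int, 2>>> dp(sz[0] + 1,
        vector<array<int, 2>>(sz[1] + 1, array<int, 2>{INF, INF}));
    dp[1][0][0] = dp[0][1][1] = 1;   // a single character forms one block
    int idx[2];
    for(idx[0] = 0; idx[0] <= sz[0]; idx[0]++) {
        for(idx[1] = 0; idx[1] <= sz[1]; idx[1]++) {
            for(int pz = 0; pz <= 1; pz++) {         // string that gave the previous character
                for(int nz = 0; nz <= 1; nz++) {     // string that gives the next character
                    if(idx[nz] < sz[nz] && idx[pz] > 0) {
                        int ndx[2] = {idx[0] + !nz, idx[1] + nz};
                        // pay 1 only if the new character differs from the previous one
                        dp[ndx[0]][ndx[1]][nz] = min(dp[ndx[0]][ndx[1]][nz],
                            dp[idx[0]][idx[1]][pz] + (a[nz][idx[nz]] != a[pz][idx[pz] - 1]));
                    }
                }
            }
        }
    }
    cout << min(dp[sz[0]][sz[1]][0], dp[sz[0]][sz[1]][1]) << endl;
    return 0;
}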

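Note that the complexity should be 2 \cdot 2 \cdot n \cdot m (2 \cdot n \cdot m states with two transitions each) and not 26 \cdot n \cdot m (which you get if the state stores the last character itself), since the latter will not fit into the time limit.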

In the code above we assume that we have already merged the prefixes of A and B of lengths idx[0] and idx[1]. The variable pz indicates whether the last character of the merged string came from A (pz=0) or from B (pz=1) before the new character is added, and nz indicates, in the same way, which string the new character is taken from. After adding it, the merged prefixes have lengths ndx[0] and ndx[1]. The value DP[ndx[0]][ndx[1]][nz] can be relaxed by DP[idx[0]][idx[1]][pz], plus one if the old last character and the new last character do not match. Because the base states are initialized to 1, the printed value is the number of blocks of equal characters in C, that is, the number of indices with C_i \neq C_{i+1} plus one.
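
Written as a recurrence, the transitions above look roughly as follows. This is a restatement of the code rather than a new method; the 1-indexed characters A_i, B_j, the helper symbol last_p and the Iverson bracket [\cdot] (1 if the condition holds, 0 otherwise) are notation introduced here for illustration only.

```latex
\begin{align*}
DP[1][0][0] &= DP[0][1][1] = 1 \\
last_0(i, j) &= A_i \quad (i > 0), \qquad last_1(i, j) = B_j \quad (j > 0) \\
DP[i+1][j][0] &= \min_{p \in \{0,1\}} \Bigl( DP[i][j][p] + \bigl[ A_{i+1} \neq last_p(i, j) \bigr] \Bigr) \\
DP[i][j+1][1] &= \min_{p \in \{0,1\}} \Bigl( DP[i][j][p] + \bigl[ B_{j+1} \neq last_p(i, j) \bigr] \Bigr) \\
\text{answer} &= \min \bigl( DP[n][m][0],\; DP[n][m][1] \bigr)
\end{align*}
```

Here p = 0 is only allowed when i > 0 and p = 1 only when j > 0, which matches the `idx[pz] > 0` check in the code.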

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.
Tester’s solution can be found here.

RELATED PROBLEMS:

prakhar17252 · January 15, 2018, 8:34pm

This question can also be answered using Longest common subsequence.
The answer will be f(s1) + f(s2) - lcs(s1,s2) where s1 and s2 are the two strings.

madhav_1999 · January 15, 2018, 9:37pm

@prakhar17252 You’ve got the wrong expression. We don’t have to subtract the lcs of the original strings. We have to subtract the lcs of the reduced substrings of the strings where each section is represented by only one instance of the repeating character.
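
A small worked example of the difference, assuming f(s) denotes the number of blocks of equal consecutive characters in s and writing A', B' for the compressed strings (both are notation chosen here, not taken from the problem statement). Each matched pair in a common subsequence of A' and B' lets one block of A and one block of B merge into a single block of C, which is why the LCS has to be taken on the compressed strings.

```latex
\begin{align*}
A &= B = \texttt{aa}, \qquad f(A) = f(B) = 1 \\
\text{uncompressed: } & f(A) + f(B) - lcs(A, B) = 1 + 1 - 2 = 0 \quad \text{(impossible, } C = \texttt{aaaa} \text{ has 1 block)} \\
A' &= B' = \texttt{a} \\
\text{compressed: } & f(A) + f(B) - lcs(A', B') = 1 + 1 - 1 = 1
\end{align*}
```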

piyush_ravi · January 15, 2018, 9:41pm

I could not find one submission in Python which got the 100 points.[1] All these days I spent my time trying to implement the sub-quadratic algorithm for the LCS.

References:

CodeChef: Practical coding for everyone

vijju123 · January 15, 2018, 9:44pm

I really think that this question should have had the clause-

“Also output the string C, which results after merging string A and String B such that F(C) is minimized. In case of multiple correct answers, print any.”

gomu · January 15, 2018, 10:25pm

I used 26 instead of 2*2 as short int to pass memory limits, along with some optimisations and got AC.
My solution. There should have been stronger test cases. The only downside was I did 67 submissions.

prashant231997 · January 16, 2018, 1:00am

@piyush_ravi I think this is because of the time constraint set for Python code.
I submitted code in Python 3.5 (CodeChef: Practical coding for everyone) which gives TLE on the long test cases, but when I submitted the same code in C++ I got AC (CodeChef: Practical coding for everyone).

bvsbrk · January 16, 2018, 9:35am

The answer is as simple as this. Just remove all the consecutive duplicates in both strings: for example, if the string is hello, make it helo. This can be done in O(n) time.

After this, find the lcs of the two modified strings. The answer is length(s1) + length(s2) - lcs, where s1 and s2 are the modified strings.

akshatsharma · January 16, 2018, 5:19pm

can somebody tell me how to do bottoms up DP to the solution indicated above?

iprakhar22 · January 16, 2018, 11:12pm

Can someone tell me how did they come-up with the LCS solution? Was there a logic behind this or what is the reason this works?
Thanks

mehta_55 · January 17, 2018, 2:26am

@bvsbrk can you explain how your logic works? why are we compressing strings(removing consecutive duplicates)? and how lcs will lead to the required answer?

worldunique · January 18, 2018, 5:58pm

this question should have been the 4th one not 5th one …
too much imbalance is not good
we get a loss of many medium type questions

p1p5_5 · January 20, 2018, 2:32pm

Hi, can anyone please tell me why only the test cases of subtask #2 pass and not the others? I tried for 4 days in a row to identify my mistake, but in vain, and at one point I got really frustrated. It would be really helpful if someone could point out the mistake or a test case on which it fails. The link to my submission is: CodeChef: Practical coding for everyone

Just for explanation, I have used this logic: f(s1)+f(s2)-f(lcs(s1,s2)). So, I calculate the LCS string of the 2 strings and apply the given function “f(C)”. Please help.

harrypotter0 · January 21, 2018, 12:28pm

Anybody got 100 pts in python ?? Please share the link or give it a try .

abhi55 · January 21, 2018, 7:17pm

@p1p5_5 I looked at your solution. The LCS part is correct, but before computing the LCS you should first compress each string in O(n) so that no two consecutive characters are equal. This also speeds up the DP (LCS) part in most cases, except for inputs such as abcdefghi…wxyz that have no consecutive equal characters.
Then f(a)+f(b)-lcs(a,b) will give the correct answer.
You can check my code (link). If any problem persists, you can ask.

abhi55 · January 21, 2018, 7:39pm

@harrypotter0 you can do this in any programming language with the steps below:

step 1: Process both string A and string B so that no two consecutive characters are equal. For
example, abbaaccde becomes abacde after processing. This takes O(n) time, where n is the length of the string.

step 2: Compute the LCS of the processed strings using dynamic programming in O(nm) time, where n is the length of processed string A and m is the length of processed string B. The value computed is the length of the longest common subsequence.

step 3: The answer is length of processed string A + length of processed string B - LCS.
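
The three steps above can be sketched in C++ roughly as follows. This is a minimal illustration, not the code from any submission in this thread; the function names compress, lcsLength and minBlocks are made up for the example, and input/output handling is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Step 1: collapse every run of equal characters into a single character.
// (Hypothetical helper name, chosen for this sketch.)
std::string compress(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (out.empty() || out.back() != c) out.push_back(c);
    }
    return out;
}

// Step 2: length of the longest common subsequence in O(|a| * |b|) time,
// keeping only two rows of the table to save memory.
int lcsLength(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            cur[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1] + 1
                                            : std::max(prev[j], cur[j - 1]);
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Step 3: compressed length of A + compressed length of B - LCS.
int minBlocks(const std::string& a, const std::string& b) {
    const std::string ca = compress(a), cb = compress(b);
    return static_cast<int>(ca.size() + cb.size()) - lcsLength(ca, cb);
}
```

For example, minBlocks("aa", "aa") gives 1 and minBlocks("ab", "ba") gives 3. Keeping two rows instead of the full table is a design choice made here because only the LCS length is needed, not the subsequence itself.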

sapfire · January 22, 2018, 3:01am

@vijju123 I really think that adding the clause to output the resulting string wouldn’t have made things much tougher…

Compress the input strings by removing consecutive duplicate letters.
Find the lcs(a, b) and from the expression mentioned compute the minimum value.
Keep track of the next character in lcs(a, b) and erase characters from the input string until the head characters are both equal to the next character in lcs(a, b).

I haven’t coded it up yet but I’m sure that it or a very minor variation of it might work.
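
One possible variation of this idea, sketched in C++ under some assumptions: instead of erasing characters one by one, it works on runs (character, count), builds the LCS table over the run characters, and then walks the table forward, emitting a matched pair of runs as one block. The names Run, toRuns, countBlocks and mergeMinBlocks are invented for this sketch, and it has not been checked against the judge.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A run of `second` copies of the character `first` (hypothetical type alias).
using Run = std::pair<char, int>;

// Split a string into its maximal runs of equal characters.
std::vector<Run> toRuns(const std::string& s) {
    std::vector<Run> runs;
    for (char c : s) {
        if (!runs.empty() && runs.back().first == c) ++runs.back().second;
        else runs.push_back({c, 1});
    }
    return runs;
}

// Number of blocks of consecutive equal characters in s.
int countBlocks(const std::string& s) {
    int blocks = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        if (i == 0 || s[i] != s[i - 1]) ++blocks;
    return blocks;
}

// Build one merged string whose block count equals
// (runs in a) + (runs in b) - LCS of the run characters.
std::string mergeMinBlocks(const std::string& a, const std::string& b) {
    const std::vector<Run> ra = toRuns(a), rb = toRuns(b);
    const std::size_t n = ra.size(), m = rb.size();
    // lcs[i][j]: LCS length of the run characters of ra[i..] and rb[j..].
    // The suffix form lets us reconstruct by walking forward.
    std::vector<std::vector<int>> lcs(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = n; i-- > 0;)
        for (std::size_t j = m; j-- > 0;)
            lcs[i][j] = (ra[i].first == rb[j].first)
                            ? lcs[i + 1][j + 1] + 1
                            : std::max(lcs[i + 1][j], lcs[i][j + 1]);
    std::string c;
    std::size_t i = 0, j = 0;
    while (i < n || j < m) {
        if (i < n && j < m && ra[i].first == rb[j].first) {
            // Matched runs: emit both together so they form a single block.
            c.append(ra[i].second + rb[j].second, ra[i].first);
            ++i;
            ++j;
        } else if (j == m || (i < n && lcs[i + 1][j] >= lcs[i][j + 1])) {
            c.append(ra[i].second, ra[i].first);  // take the next run of a
            ++i;
        } else {
            c.append(rb[j].second, rb[j].first);  // take the next run of b
            ++j;
        }
    }
    return c;
}
```

For instance, mergeMinBlocks("aab", "ab") produces "aaabb", which has 2 blocks. The full (n+1) x (m+1) table is kept here because the walk needs it; that costs more memory than computing the length alone.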

viditganpi · January 28, 2018, 1:06pm

can anyone explain what is wrong in my solution CodeChef: Practical coding for everyone

iambatman93 · February 1, 2018, 1:55am

I have written a DP implementation along the same lines as the editorial, and I can’t think of a case where it fails, but it is failing on submission.

Here is my solution

dp[i][j][k] represents, I have taken i characters from string1 and j characters from string2 and the last character is from stringk

dwij28 · January 15, 2018, 9:29pm

Yup, the large number of submissions for a 5th problem could only mean one thing that there is pretty standard implementation of it available out there.