You are not logged in. Please login at www.codechef.com to post your questions!

×

CHEFKLCS - Editorial

PROBLEM LINK:

Practice
Contest

Author: Misha Chorniy
Tester: Jingbo Shang
Editorialist: Pushkar Mishra

DIFFICULTY:

Medium

PREREQUISITES:

DP, Longest Common Subsequence Algorithm

PROBLEM:

Given two strings of equal length, we have to report the $k^{th}$ lexicographical longest common subsequence. In other words, let the strings be $S_1$ and $S_2$. Let L be the length of the longest common subsequence. Let P be the set of all the subsequences which are common to $S_1$ and $S_2$ and are of the length L. We have to report the $k^{th}$ element in this set P when it is sorted lexicographically.

EXPLANATION:

Note: For the sake of simplicity, throughtout the editorial, we assume that $S[1]$ is the first character of the string, i.e., all strings are 1-indexed. The Longest Common Subsequence (LCS) is a very common problem. The traditional approach uses DP. Let $L[i][j]$ be the length of the LCS when considering the last $i$ characters of $S_1$ and $j$ characters of $S_2$. We can formulate the recurrence for $L[i][j]$:

  • If $S_1[i] = S_2[j]$ then $L[i][j] = L[i+1][j+1]+1$.
  • If $S_1[i] \neq S_2[j]$ then $L[i][j] = max(L[i+1][j], L[i][j+1])$.

Now, let us find a way to count distinct longest common subsequences. Let $D[i][j]$ be the number of LCS of length $L[i][j]$ when considering the last $i$ characters of $S_1$ and $j$ characters of $S_2$, i.e., it gives the number of distinct LCS for substrings $S_1[i..n]$ and $S_2[j..n]$. Now, we formulate the recurrence for this:

  • If $S_1[i] = S_2[j]$ then $D[i][j] = D[i+1][j+1]$.
  • If $S_1[i] \neq S_2[j]$ and $L[i][j] = L[i+1][j]$, then $D[i][j] = D[i+1][j]$.
  • If $S_1[i] \neq S_2[j]$ and $L[i][j] = L[i][j+1]$, then $D[i][j] = D[i][j+1]$.
  • If $S_1[i] \neq S_2[j]$ and $L[i+1][j] = L[i][j+1]$, then $D[i][j] = D[i][j+1]+D[i+1][j] - D[i+1][j+1]$.

Now, before we move on to formalizing the algorithm, we have another observation to make:
Let their be two strings $S_1 = abcddpq$ and $S_2 = axbedff$. Consider the LCS in the two strings up till the index 2 (starting index = 1), i.e., $ab$. Now, let us consider how to extend the LCS when taking letters after index 2. We see that we can append a $d$ to make the LCS $abd$. But, $S_1$ has 2 $d$. Which one should be used to extend? In this example, it doesn't matter. But what if $S_2$ was $axbedfd$. We see that it is always favorable to match the earliest appearences of a character in the two strings whenever that is possible. This way, the following appearences of that character can be used to extend the LCS.

Now, we have all the essential elements for the formal algorithm. $L[1][1]$ gives the length of the LCS for the input strings. Set $L[0][0] = L[1][1]+1$. $D[1][1]$ gives the number of distinct LCS of length $L[1][1]$. Set $D[0][0] = 0$. Let us keep two more arrays, pointer_S1[n][26] and pointer_S2[n][26]. Both of them serve the same purpose, but one is for string $S_1$ and the other for string $S_2$. The purpose is that pointer_S1[i][j] stores the index $p$ such that $p > i$ and $S_1[p]$ = $j^{th}$ lower case letter. In other words, pointer_S1[i][0] stores the index $p$ where $p$ is the smallest index greater than $i$ such that $S_1[p] =$ 'a'. If there is no occurrence of the particular letter after index $i$, then $-1$ is stored.

Let us see how we can get the required string from the above structures. If $k > D[1][1]$, we can simply output $-1$ since there is no answer. Otherwise, we start at $L[0][0]$. We want to move to such an $i$, $j$ such that the following conditions are met:

  • $L[i][j] = L[0][0] - 1$
  • $S_1[i] = S_2[j]$
  • The number of strings with letters smaller than $S_1[i]$ in this position is less than the $k$
  • The number of strings with letters smaller than AND EQUAL TO $S_1[i]$ in this position is $\geq k$.

If all these conditions are met, we will append the letter $S_1[i]$ ($= S_2[j]$) to our answer.

With the help of the pointer_S1 and pointer_S2 arrays, we can iterate over all possible $i$, $j$ to find the letter we require.

We repeat this till we reach an $i$, $j$ such that $L[i][j] = 1$. A pseudocode of this procedure is provided below:

function get_string()
{
    String ans = "";
    i, j = 0;
    L[0][0] = L[1][1] + 1;

    while(L[i][j] > 1)
    {
        count, new_count = 0;
        for( character c = 'a' to 'z')
        {
            //Getting the next_i and next_j based on pointer arrays.
            next_i = pointer_S1[i][c];
            next_j = pointer_S2[j][c];

            if(next_i or next_j == -1)
                skip to next letter;

            if(L[next_i][next_j] != L[i][j] - 1)
                letter c doesn't lie on the desired LCS. Thus skip to next letter.

            //Producing new_count based on count and the number of distinct LCS between
            //substrings S1[next_i..n] and S2[next_j..n].
            new_count = count + D[next_i][next_j];

            //checking whether last condition is fulfilled or not
            if(count < k and k <= new_count)
            {
                //we have found the letter to extend the desired LCS.
                k = k-count;
                i = next_i;
                j = next_j;
                ans = ans + c;

                //breaking the loop since letter has been found
                break;
            }

            count = new_count;
        }
    }

    return ans;
}

Other implementation details can be seen from the editorialist's solution.

COMPLEXITY:

$\mathcal{O}(n^2)$

SAMPLE SOLUTIONS:

Author
Tester
Editorialist

This question is marked "community wiki".

asked 18 Dec '15, 12:59

pushkarmishra's gravatar image

4★pushkarmishra
1.3k156581
accept rate: 4%

edited 09 Feb '16, 13:54

admin's gravatar image

0★admin ♦♦
19.8k350498541

1

If S1[i]=S2[j] then L[i][j]=L[i+1][j+1]+1. Correct it

(23 Dec '15, 04:22) varun14★

the editorial is marked "community wiki". this means that if you spot errors, you can correct them directly. please feel free to do so :)

(27 Dec '15, 18:33) pushkarmishra4★

link

answered 22 Dec '15, 18:49

mgch's gravatar image

6★mgch
3651435
accept rate: 20%

None of the given sample solutions are visible. Every link shows "Access Denied"!

link

answered 22 Dec '15, 09:18

linonymous's gravatar image

2★linonymous
276
accept rate: 0%

edited 23 Dec '15, 19:36

Here is my code.. https://www.codechef.com/viewsolution/8997375 I got Wa. please help me where is my wrong. give some critical test case. thanks in advance

link

answered 22 Dec '15, 08:52

patience's gravatar image

3★patience
1
accept rate: 0%

Since, the author, tester and editorialist's codes are not available, can someone give link to someone's code, who has followed the approach stated above? Thanks in advance!

link

answered 22 Dec '15, 10:44

lohit_97's gravatar image

4★lohit_97
3428
accept rate: 4%

@mgch Thanks a lot! :)

link

answered 28 Dec '15, 10:12

lohit_97's gravatar image

4★lohit_97
3428
accept rate: 4%

toggle preview
Preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported
  • mathemetical formulas in Latex between $ symbol

Question tags:

×15,678
×2,592
×2,169
×105
×55
×20

question asked: 18 Dec '15, 12:59

question was seen: 3,490 times

last updated: 09 Feb '16, 13:54