×

# CHEFKLCS - Editorial

Author: Misha Chorniy
Tester: Jingbo Shang
Editorialist: Pushkar Mishra

Medium

# PREREQUISITES:

DP, Longest Common Subsequence Algorithm

# PROBLEM:

Given two strings of equal length, we have to report the $k^{th}$ lexicographical longest common subsequence. In other words, let the strings be $S_1$ and $S_2$. Let L be the length of the longest common subsequence. Let P be the set of all the subsequences which are common to $S_1$ and $S_2$ and are of the length L. We have to report the $k^{th}$ element in this set P when it is sorted lexicographically.

# EXPLANATION:

Note: For the sake of simplicity, throughtout the editorial, we assume that $S[1]$ is the first character of the string, i.e., all strings are 1-indexed. The Longest Common Subsequence (LCS) is a very common problem. The traditional approach uses DP. Let $L[i][j]$ be the length of the LCS when considering the last $i$ characters of $S_1$ and $j$ characters of $S_2$. We can formulate the recurrence for $L[i][j]$:

• If $S_1[i] = S_2[j]$ then $L[i][j] = L[i+1][j+1]+1$.
• If $S_1[i] \neq S_2[j]$ then $L[i][j] = max(L[i+1][j], L[i][j+1])$.

Now, let us find a way to count distinct longest common subsequences. Let $D[i][j]$ be the number of LCS of length $L[i][j]$ when considering the last $i$ characters of $S_1$ and $j$ characters of $S_2$, i.e., it gives the number of distinct LCS for substrings $S_1[i..n]$ and $S_2[j..n]$. Now, we formulate the recurrence for this:

• If $S_1[i] = S_2[j]$ then $D[i][j] = D[i+1][j+1]$.
• If $S_1[i] \neq S_2[j]$ and $L[i][j] = L[i+1][j]$, then $D[i][j] = D[i+1][j]$.
• If $S_1[i] \neq S_2[j]$ and $L[i][j] = L[i][j+1]$, then $D[i][j] = D[i][j+1]$.
• If $S_1[i] \neq S_2[j]$ and $L[i+1][j] = L[i][j+1]$, then $D[i][j] = D[i][j+1]+D[i+1][j] - D[i+1][j+1]$.

Now, before we move on to formalizing the algorithm, we have another observation to make:
Let their be two strings $S_1 = abcddpq$ and $S_2 = axbedff$. Consider the LCS in the two strings up till the index 2 (starting index = 1), i.e., $ab$. Now, let us consider how to extend the LCS when taking letters after index 2. We see that we can append a $d$ to make the LCS $abd$. But, $S_1$ has 2 $d$. Which one should be used to extend? In this example, it doesn't matter. But what if $S_2$ was $axbedfd$. We see that it is always favorable to match the earliest appearences of a character in the two strings whenever that is possible. This way, the following appearences of that character can be used to extend the LCS.

Now, we have all the essential elements for the formal algorithm. $L[1][1]$ gives the length of the LCS for the input strings. Set $L[0][0] = L[1][1]+1$. $D[1][1]$ gives the number of distinct LCS of length $L[1][1]$. Set $D[0][0] = 0$. Let us keep two more arrays, pointer_S1[n][26] and pointer_S2[n][26]. Both of them serve the same purpose, but one is for string $S_1$ and the other for string $S_2$. The purpose is that pointer_S1[i][j] stores the index $p$ such that $p > i$ and $S_1[p]$ = $j^{th}$ lower case letter. In other words, pointer_S1[i][0] stores the index $p$ where $p$ is the smallest index greater than $i$ such that $S_1[p] =$ 'a'. If there is no occurrence of the particular letter after index $i$, then $-1$ is stored.

Let us see how we can get the required string from the above structures. If $k > D[1][1]$, we can simply output $-1$ since there is no answer. Otherwise, we start at $L[0][0]$. We want to move to such an $i$, $j$ such that the following conditions are met:

• $L[i][j] = L[0][0] - 1$
• $S_1[i] = S_2[j]$
• The number of strings with letters smaller than $S_1[i]$ in this position is less than the $k$
• The number of strings with letters smaller than AND EQUAL TO $S_1[i]$ in this position is $\geq k$.

If all these conditions are met, we will append the letter $S_1[i]$ ($= S_2[j]$) to our answer.

With the help of the pointer_S1 and pointer_S2 arrays, we can iterate over all possible $i$, $j$ to find the letter we require.

We repeat this till we reach an $i$, $j$ such that $L[i][j] = 1$. A pseudocode of this procedure is provided below:

function get_string()
{
String ans = "";
i, j = 0;
L[0][0] = L[1][1] + 1;

while(L[i][j] > 1)
{
count, new_count = 0;
for( character c = 'a' to 'z')
{
//Getting the next_i and next_j based on pointer arrays.
next_i = pointer_S1[i][c];
next_j = pointer_S2[j][c];

if(next_i or next_j == -1)

if(L[next_i][next_j] != L[i][j] - 1)
letter c doesn't lie on the desired LCS. Thus skip to next letter.

//Producing new_count based on count and the number of distinct LCS between
//substrings S1[next_i..n] and S2[next_j..n].
new_count = count + D[next_i][next_j];

//checking whether last condition is fulfilled or not
if(count < k and k <= new_count)
{
//we have found the letter to extend the desired LCS.
k = k-count;
i = next_i;
j = next_j;
ans = ans + c;

//breaking the loop since letter has been found
break;
}

count = new_count;
}
}

return ans;
}


Other implementation details can be seen from the editorialist's solution.

# COMPLEXITY:

$\mathcal{O}(n^2)$

# SAMPLE SOLUTIONS:

This question is marked "community wiki".

1.3k156581
accept rate: 4%

19.8k350498541

1

If S1[i]=S2[j] then L[i][j]=L[i+1][j+1]+1. Correct it

(23 Dec '15, 04:22) 4★

the editorial is marked "community wiki". this means that if you spot errors, you can correct them directly. please feel free to do so :)

(27 Dec '15, 18:33)

 3 @lohit_97 http://pastie.org/10647199 - my code http://pastie.org/10647207 - tester http://pastie.org/10647204 - editorialist answered 22 Dec '15, 18:49 6★mgch 365●14●35 accept rate: 20%
 1 None of the given sample solutions are visible. Every link shows "Access Denied"! answered 22 Dec '15, 09:18 27●6 accept rate: 0%
 0 Here is my code.. https://www.codechef.com/viewsolution/8997375 I got Wa. please help me where is my wrong. give some critical test case. thanks in advance answered 22 Dec '15, 08:52 3★patience 1 accept rate: 0%
 0 Since, the author, tester and editorialist's codes are not available, can someone give link to someone's code, who has followed the approach stated above? Thanks in advance! answered 22 Dec '15, 10:44 4★lohit_97 342●8 accept rate: 4%
 0 @mgch Thanks a lot! :) answered 28 Dec '15, 10:12 4★lohit_97 342●8 accept rate: 4%
 toggle preview community wiki:
Preview

By Email:

Markdown Basics

• *italic* or _italic_
• **bold** or __bold__
• image?![alt text](/path/img.jpg "title")
• numbered list: 1. Foo 2. Bar
• to add a line break simply add two spaces to where you would like the new line to be.
• basic HTML tags are also supported
• mathemetical formulas in Latex between \$ symbol

Question tags:

×15,678
×2,592
×2,169
×105
×55
×20

question asked: 18 Dec '15, 12:59

question was seen: 3,490 times

last updated: 09 Feb '16, 13:54