ASTRING - Editorial

PROBLEM LINK

Practice
Contest

Contributors

Author: Amit Pandey
Tester & Editorialist: Prateek Gupta

DIFFICULTY:

Easy

PREREQUISITES:

Sets, Implementation

PROBLEM:

Find the lexicographically smallest subsequence of length $K$ of a given string of length $N$.

EXPLANATION


Solution for Subtask 1:
The smallest subtask can be solved by dynamic programming. At every position of the string $S$ you have a choice: either include the character at that position in the final subsequence of $K$ characters, or skip it. Define the DP state $dp(i,\ j)$ as the lexicographically smallest subsequence of length $j$ that can be formed from the first $i$ characters of $S$. Then the following equations hold, where $+$ denotes concatenation, $s_i$ is the $i^{th}$ character of $S$ (1-indexed) and $INF$ marks a state that cannot be reached. $$dp(i,\ 0)\ =\ null\quad \forall\quad i\ \in\ [0,\ N]$$ $$dp(0,\ j)\ =\ INF\quad \forall\quad j\ \in\ [1,\ K]$$ $$dp(i,\ j)\ =\ min(\ dp(i\ -\ 1,\ j)\ ,\ dp(i\ -\ 1,\ j\ -\ 1)\ +\ s_i\ )$$


Let's have a look at the pseudo code for the bottom up solution of the above approach.

    for ( i = 0 to N ) dp[i][0] = ""
    for ( j = 1 to K ) dp[0][j] = "INF"

    for ( i = 1 to N )  {
        for ( j = 1 to K ) {
            dp[i][j] = "INF"
            // option 1: skip s[i - 1]
            if ( dp[i - 1][j] != "INF" ) dp[i][j] = dp[i - 1][j]
            // option 2: take s[i - 1] as the last character
            if ( dp[i - 1][j - 1] != "INF" )  {
                if ( dp[i][j] == "INF" ) dp[i][j] = dp[i - 1][j - 1] + s[i - 1]
                else dp[i][j] = min(dp[i][j], dp[i - 1][j - 1] + s[i - 1])
            }
        }
    }

    print(dp[N][K])


The same thing can be implemented as a recursive solution with memoization. You might like to read this blog post to know more about Dynamic Programming. The DP has $\mathcal{O}(N^{2})$ states, but because every state stores a string, the space complexity is around $\mathcal{O}(N^{3})$ (copying and comparing these strings also costs time proportional to their length). Due to the high space complexity, the same solution cannot be used for subtask 2: it would exceed the memory limit and lead to a Runtime Error.
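
The bottom-up pseudo code above can be turned into a small C++ function. This is only a sketch: the function name `smallestSubsequenceDP` is made up for illustration, a separate boolean table stands in for the `"INF"` marker, and it assumes $0 \le K \le N$.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: lexicographically smallest subsequence of length k,
// computed with the dp(i, j) recurrence. Assumes 0 <= k <= s.size().
std::string smallestSubsequenceDP(const std::string& s, int k) {
    const int n = static_cast<int>(s.size());
    // dp[i][j]: best subsequence of length j from the first i characters.
    std::vector<std::vector<std::string>> dp(
        n + 1, std::vector<std::string>(k + 1));
    // reachable[i][j] is false exactly where the pseudo code stores "INF".
    std::vector<std::vector<bool>> reachable(
        n + 1, std::vector<bool>(k + 1, false));
    for (int i = 0; i <= n; ++i) reachable[i][0] = true;  // empty subsequence

    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= k; ++j) {
            if (reachable[i - 1][j]) {  // skip s[i - 1]
                dp[i][j] = dp[i - 1][j];
                reachable[i][j] = true;
            }
            if (reachable[i - 1][j - 1]) {  // take s[i - 1] as last character
                std::string candidate = dp[i - 1][j - 1] + s[i - 1];
                if (!reachable[i][j] || candidate < dp[i][j]) {
                    dp[i][j] = candidate;
                }
                reachable[i][j] = true;
            }
        }
    }
    return dp[n][k];
}
```

For example, for the string `bacb` and $K = 2$ the candidates are `ba`, `bc`, `bb`, `ac`, `ab` and `cb`, so the function returns `ab`.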


Solution for Subtask 2:
The approach for this subtask is based on the fact that the final subsequence must have length exactly $K$. With 0-based indexing, the first character of the lexicographically smallest subsequence has to come from the substring $s[0,\ n\ -\ k]$. Why? If the first character were picked to the right of position $n - k$, fewer than $k - 1$ characters would remain after it, so a subsequence of length $K$ could not be completed. Similarly, the $i^{th}$ character (counting from $0$) must come from the substring $s[prev\ +\ 1,\ n\ -\ k\ +\ i]$, where $prev$ is the position chosen for the $(i - 1)^{th}$ character of the subsequence. Hence, we should always choose the smallest character in each such substring. If the smallest character occurs more than once, we should choose its leftmost occurrence, since that leaves a larger substring to choose from for the remaining characters of the subsequence. Let us look at the pseudo code for this greedy approach.

    subseq = ""
    prev_pos = -1
    for ( i from 0 to k - 1 ) {
        for ( j from prev_pos + 1 to n - k + i ) {
            if ( j == prev_pos + 1 ) smallest_char = s[j], leftmost_pos = j
            else if ( smallest_char > s[j] ) smallest_char = s[j], leftmost_pos = j
        }
        prev_pos = leftmost_pos
        subseq.append(smallest_char)
    }

    print(subseq)

Since, for each of the $K$ characters of the subsequence, the inner loop may traverse almost the whole string to find the smallest character, the worst-case time complexity of this pseudo code is $\mathcal{O}(N*K)$. This is still not fast enough to pass all the input files.
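
A direct C++ rendering of this greedy scan might look like the sketch below. The function name `smallestSubsequenceGreedy` is hypothetical, and the code assumes $1 \le K \le N$.

```cpp
#include <string>

// Hypothetical helper: O(N * K) greedy. For the i-th output character it
// scans s[prev + 1 .. n - k + i] and keeps the leftmost smallest character.
std::string smallestSubsequenceGreedy(const std::string& s, int k) {
    const int n = static_cast<int>(s.size());
    std::string subseq;
    int prev = -1;  // index chosen for the previous output character
    for (int i = 0; i < k; ++i) {
        int best = prev + 1;
        for (int j = prev + 2; j <= n - k + i; ++j) {
            // Strict comparison keeps the leftmost occurrence on ties.
            if (s[j] < s[best]) best = j;
        }
        subseq += s[best];
        prev = best;
    }
    return subseq;
}
```

On `bacb` with $K = 2$, the first scan covers indices $0..2$ and picks `a` at index $1$; the second scan covers indices $2..3$ and picks `b` at index $3$, giving `ab`.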


Solution for Subtask 3:
The logic remains the same as in subtask $2$, but the implementation can be optimized to reduce the overall time complexity. For this, we can maintain a data structure that returns the smallest character in a range and, if that character occurs several times, the position of its leftmost occurrence in that range. The data structure that seems easiest to implement here is a $set$ of pairs. Pairs of (character, index) are inserted into the set when they enter the current range and removed when they leave it. The first element of the set then gives us the required character and its index for the current position of the subsequence. Let $pos[x]$ denote the index where the $x^{th}$ character of the $K$-length subsequence was found.

To calculate $pos[i]$, the range to be considered is $[pos[i\ -\ 1]\ +\ 1,\ n\ -\ k\ +\ i]$. Hence, the pairs with indices in the range $[pos[i\ -\ 2]\ +\ 1,\ pos[i\ -\ 1]]$ are removed and the new pair $(s_{n\ -\ k\ +\ i},\ n\ -\ k\ +\ i)$ is inserted into the set. This way each pair is inserted exactly once and removed at most once. Let us now have a look at the pseudo code for the same.

    // Set.top() denotes the smallest (character, index) pair in the set
    subseq = ""
    for ( i = 0 to n - k ) Set.insert(make_pair(s[i], i))

    subseq.append(Set.top().character)

    a = 0, b = Set.top().index
    for ( i = n - k + 1 to n - 1 ) {
        while ( a <= b )  Set.erase(make_pair(s[a], a)), a++
        Set.insert(make_pair(s[i], i))
        b = Set.top().index
        subseq.append(Set.top().character)
    }

    print(subseq)

For more details on the implementation of any subtasks, have a look at the tester's solution.
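
As an illustration of the set-based idea for subtask 3, here is a minimal C++ sketch using `std::set<std::pair<char, int>>`. It is not the tester's code: the function name `smallestSubsequenceSet` is hypothetical, and the sketch assumes $1 \le K \le N$. Pairs are ordered by character first and index second, so `*window.begin()` is the leftmost smallest character in the current range.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical helper: O(N log N) version of the greedy, keeping the
// current candidate range in an ordered set of (character, index) pairs.
std::string smallestSubsequenceSet(const std::string& s, int k) {
    const int n = static_cast<int>(s.size());
    std::set<std::pair<char, int>> window;
    std::string subseq;
    int lo = 0;  // smallest index that is still inside the set

    // Indices 0 .. n - k - 1; index n - k + i is added in round i.
    for (int i = 0; i < n - k; ++i) window.insert({s[i], i});

    for (int i = 0; i < k; ++i) {
        window.insert({s[n - k + i], n - k + i});
        const std::pair<char, int> best = *window.begin();
        subseq += best.first;
        // Everything up to and including the chosen index leaves the range.
        while (lo <= best.second) {
            window.erase({s[lo], lo});
            ++lo;
        }
    }
    return subseq;
}
```

Every index is inserted once and erased at most once, and each set operation costs $\mathcal{O}(\log N)$.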

COMPLEXITY

The overall time complexity needed to pass all the input files is $\mathcal{O}(N \log N)$.

SOLUTIONS

Setter
Tester's solution to Subtask 1
Tester's solution to Subtask 2
Tester's solution to Subtask 3

asked 24 May '16, 17:52

5★prateekg603

edited 28 May '16, 22:38

0★admin ♦♦

https://www.codechef.com/viewsolution/10206440

Here is what I did. Just searched for the best possible element of the substring from left to right.

answered 29 May '16, 02:00

4★debjitdj

complexity?

(29 May '16, 18:08) geek_geek4★

It is actually $O( K*(N-K+1) )$ which is approximately $O(N)$

(30 May '16, 04:57) debjitdj4★

no its not O(N)

Consider case when : K = N / 2 It goes O(N * N)

(30 May '16, 19:05) code_hard1236★

Sorry, that's a typo...I meant to say $O(N^2)$

(30 May '16, 20:26) debjitdj4★

You have to update $prev$ correctly. I have edited (only one small change) your code, check it here-

https://www.codechef.com/viewsolution/10252513

(31 May '16, 17:11) debjitdj4★

We can also use a segment tree to find the position and smallest character in the string in the required range

Code in C++ : https://www.codechef.com/viewsolution/10207142

answered 28 May '16, 22:37

4★geek_geek

Yes, but that would be too much for this problem. :P

(28 May '16, 22:38) amitpandeykgp4★

https://www.codechef.com/viewsolution/10208562

This solution of mine passed. I would have used a segment tree had it failed.

(28 May '16, 22:40) anupam_datta4★

Even the O(N*K) algorithm with pruning worked for all test cases. Code in C : https://www.codechef.com/viewsolution/10209119

answered 28 May '16, 23:18

4★shubham_43

(30 May '16, 20:35) c1_66★

We can do it using a stack, with overall complexity O(n). My solution: https://www.codechef.com/viewsolution/10214056

answered 29 May '16, 00:51

2★gannu_raj

edited 29 May '16, 00:55

I did it by greedy implementation with worst case complexity O(N*26).

Here is the link to my solution: https://www.codechef.com/viewsolution/10207247
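
One possible way to reach an $O(N*26)$ bound is sketched below; it is not necessarily what the linked solution does. It assumes the string contains only lowercase letters `a` to `z` and that $1 \le K \le N$; the function name `smallestSubsequenceByLetter` is made up for illustration. For every output position it tries the letters in increasing order and takes the first one that still occurs inside the allowed range.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper. Assumption: s contains only 'a'..'z'.
std::string smallestSubsequenceByLetter(const std::string& s, int k) {
    const int n = static_cast<int>(s.size());
    // positions[c]: increasing list of indices where letter c occurs.
    std::array<std::vector<int>, 26> positions;
    for (int i = 0; i < n; ++i) positions[s[i] - 'a'].push_back(i);

    std::array<std::size_t, 26> next{};  // per-letter cursor, starts at 0
    std::string subseq;
    int prev = -1;  // index chosen for the previous output character
    for (int i = 0; i < k; ++i) {
        for (int c = 0; c < 26; ++c) {
            const std::vector<int>& occ = positions[c];
            // Skip occurrences that are not to the right of prev.
            while (next[c] < occ.size() && occ[next[c]] <= prev) ++next[c];
            // Take letter c if its next occurrence is inside the range.
            if (next[c] < occ.size() && occ[next[c]] <= n - k + i) {
                prev = occ[next[c]];
                subseq += static_cast<char>('a' + c);
                break;
            }
        }
    }
    return subseq;
}
```

Each cursor only moves forward, so the total work is bounded by $N + 26*K$ steps.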

answered 28 May '16, 22:44

3★sk1sk09_3134

I also first implemented using segment tree but got TLE. Then I finally got AC using sparse table with O(NlogN) space and O(1) per query. Click here to see the solution
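
For readers unfamiliar with the technique, a sparse table for "index of the leftmost minimum" queries can be sketched as follows. This is an independent illustration, not the linked solution; the function name `smallestSubsequenceSparse` is hypothetical and the code assumes $1 \le K \le N$.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: greedy with O(1) range-minimum queries answered
// by a sparse table built in O(N log N) time and space.
std::string smallestSubsequenceSparse(const std::string& s, int k) {
    const int n = static_cast<int>(s.size());
    if (n == 0 || k == 0) return "";

    // lg[len] = floor(log2(len))
    std::vector<int> lg(n + 1, 0);
    for (int len = 2; len <= n; ++len) lg[len] = lg[len / 2] + 1;

    // Prefer the left candidate unless the right one is strictly smaller,
    // which keeps the leftmost occurrence of the minimum.
    auto pick = [&s](int left, int right) {
        return s[right] < s[left] ? right : left;
    };

    // table[p][i]: index of the leftmost minimum of s[i .. i + 2^p - 1].
    std::vector<std::vector<int>> table(lg[n] + 1, std::vector<int>(n));
    for (int i = 0; i < n; ++i) table[0][i] = i;
    for (int p = 1; p <= lg[n]; ++p) {
        for (int i = 0; i + (1 << p) <= n; ++i) {
            table[p][i] =
                pick(table[p - 1][i], table[p - 1][i + (1 << (p - 1))]);
        }
    }

    std::string subseq;
    int prev = -1;
    for (int i = 0; i < k; ++i) {
        const int l = prev + 1, r = n - k + i;
        const int p = lg[r - l + 1];
        // Two overlapping blocks of length 2^p cover [l, r].
        prev = pick(table[p][l], table[p][r - (1 << p) + 1]);
        subseq += s[prev];
    }
    return subseq;
}
```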

answered 28 May '16, 22:47

5★rachitjain

edited 28 May '16, 22:49

Can anyone explain what's wrong with my code or any tricky test case? my code

answered 28 May '16, 23:29

6★anmol137dh08

"[...] The time complexity for the above approach is O(N^2) but the space complexity is around O(N^3). [...]"

I do not understand this statement, how can the space complexity ever be bigger than the time complexity, how would you generate O(n) memory in O(1) time ?

answered 29 May '16, 01:32

5★peluche

Since, you are storing a string at each dp[i][j], basically you are storing another array of characters in a 2D array. Hence, the space complexity is O(N^3).

(29 May '16, 09:23) prateekg6035★

I too solved it using segment trees. First idea that came to my mind was using RMQ. Here's my solution https://www.codechef.com/viewsolution/10243834

answered 30 May '16, 23:45

3★stellar97

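    #include <stdio.h>
    #include <string.h>

    int main() {
        char a[100005], b[100005];
        int t, i, m;
        scanf("%d", &t);
        for (m = 0; m < t; m++) {
            int k, len;
            scanf("%s", a);
            scanf("%d", &k);
            len = strlen(a);
            int p = len - k;  /* number of characters that may still be dropped */
            int j = 0;        /* current size of the stack stored in b */
            for (i = 0; i < len; i++) {
                /* pop larger characters while deletions remain */
                while (j != 0 && b[j-1] > a[i] && p != 0) { j--; p--; }
                b[j] = a[i];
                j++;
            }
            b[k] = '\0';
            printf("%s\n", b);
        }
        return 0;
    }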

answered 01 Feb '18, 11:04

2★sreeram14

I just sorted the string and took the substring from 0 to k, but it shows wrong answer. Can someone please highlight the mistake in my code?

int main() {

int t;
cin>>t;
while(t--)
{
    string s;
    cin>>s;
    int k;
    cin>>k;
    sort(s.begin(),s.end());
    string s1=s.substr(0,k);
    cout<<s1<<endl;

}

}

answered 03 Sep '18, 15:43

2★adityamittal25
