PROBLEM LINK:Author: Ankit Srivastava and Uttam Kanodia DIFFICULTY:EasyMedium PREREQUISITES:Basic programming PROBLEM:Given a string containing '(' and ')' only. Find the longest such subsequence of the given string that is nonregular. Amongst all such distinct answers, output the lexicographically $K^{th}$ amongst them. QUICK EXPLANATION:If the given string in not balanced, then the string itself is the largest nonregular(unbalanced) pattern. On the contrary, if the given string is regular(balanced), then all subsequnces having length $(N1)$ will be nonregular and there is a pattern in their lexicographic ordering. EXPLANATION:The problem can be broadly classified into two cases: Case I: Given parenthesis string is not balanced, which can be checked using stack data structure described here. In this case, the largest unbalanced subsequence will be then the given string itself. So if $K=1$, then the answer will be the given string, $1$ otherwise. Case II: Given parenthesis string is the balanced one. Consider that $L$ is the length of the given string, then there will $L$ subsequences of length $(L1)$, and each of them will be unbalanced. This is because, in any subsequence, number of opening and closing parenthesis will not be the same. How to find the number of distinct unbalanced sub sequences? Consider a string $S$ = "(())", what will be number of distinct unbalanced subsequences? it can be easily claimed that if we delete $S[0]$ or $S[1]$, we will get same string i.e. "())". So, it can be seen easily that if we delete any character from a contiguous chunk of the characters, we will get same unbalanced subsequence. It can be formally written as follows: Consider the any contiguous chunk of same characters in the given string, Suppose $S[i] = S[i+1] = S[i+2] =\ ....\ = S[j] =\ '('$ or $')'$. If we delete any character between $i^{th}$ and $j^{th}$ positions (both inclusive), it will produce the same subsequence, because deleting any of them will replace $(ji)+1$ number of same characters by $(ji)$ number of same characters. So number of distinct subsequences of length $(L1)$ will be number of different contiguous chunks of same characters. For example, string $(())((()))$ will have $4$ distinct subsequence of length $9$, and each of them will be unbalanced. How to do lexicographic ordering of those subsequences? Suppose that the string which we generate by deleting a character from the $i^{th}$ chunk (0indexed) is $L_i$. As our given string is balanced, it will consist of one or more opening parentheses and one or more closing parentheses alternatively (starting with opening). So, to produce $L_{2n}$ we need to delete an opening parenthesis, and to produce $L_{2n+1}$, we need to delete a closing parenthesis. Now, let me claim that the lexicographic ordering: $$L_{2i+1} < L_{2j+1}, \hspace{2 mm} L_{2i} > L_{2j} \hspace{2 mm} for \hspace{2 mm} (i < j) $$ $$L_{2i+1} < L_{2j} \hspace{2 mm} \text{for all} \hspace{2 mm}(i,j) $$ So for string $S = ''(())()((()))(())"$, the lexicographic order will be $L_1 < L_3 < L_5 < L_7 < L_6 < L_4 < L_2 < L_0$. To produce lexicographically $3^{rd}$ string we need to produce $L_5$ by deleting any of the bold characters : "(())()((()))(())". Proof: Consider the proof $L_{2i+1} < L_{2j+1},\hspace{2 mm} \forall \hspace{2 mm} (i < j)$. Others can be proved in the same manner. Consider that the $i^{th}$ chunk has $A_i$ number of characters. Now consider the $(A_1 + A_2 + .... A_{2i+1})^{th}$ character in $L_{2i+1}$ and $L_{2j+1}$, it will be ')' in $L_{2j+1}$, as $i < j$, and it will remain the same as in the original string, but it will be '(' in $L_{2i+1}$ and all previous characters will be the same in both of the strings. As opening parenthesis is lexicographically smaller than closing parenthesis, hence $L_{2i+1} < L_{2j+1}$. Solution:Setter's solution can be found here
This question is marked "community wiki".
asked 19 Jun '15, 16:38

@ar56 Given sequence is S="(())()((()))(())", You know that '(' < ')' . So let '(' = 1 ans ')' = 2. Now S = 11 22 1 2 111 222 11 22 L0 = By deleting any character from 0th pair i.e 11 . L1 = By deleting any character from 1st pair i.e 22 . and so on. Therefore L1 will be : 11 2 1 2 111 222 11 22 L3 = 11 22 1 111 222 11 22 L5 = 11 22 1 2 111 22 11 22 L7 = 11 22 1 2 111 222 11 2 and so on . As it become clear that L1<L3<L5<L7 . . . lexicographically largest will be L0 = 1 22 1 2 111 222 11 22 . Yet I want to know @amitpandeykgp how did you come with the scenario . Amazing question . answered 22 Jun '15, 12:17
1
I am not the setter of the problem. :P I'll ask Ankit to reply.
(22 Jun '15, 13:06)
1
Hi, I am glad you found the problem interesting. However, the story behind this problem is not so interesting :P. I proposed a version of this for cakewalk where you had to find only the lexicographically smallest. However, it seemed a little harder than a 1st problem should be yet, easier than 2nd problem :P So, I just modified it to ask for kth subsequence and tried to solve it myself. Needless to say, I was equally amazed upon finding such a beautiful pattern :D
(23 Jun '15, 00:03)

I'm getting NZEC but I can't understand why. I've verified my answers for all possible input strings up to length 12, and thousands of random input strings up to max length. Here is my code, it uses the same method as the official solution. public static String ANKPAREN(char[] s, int k, String str) on line 87 is the method that does the actual work. EDIT: to be clear I don't care if my solution is correct or not, I'm just looking for a way that it can break (NZEC). EDIT 2: I found the problem: when I read the input I have to say: String s = scan.next(); instead of String s = scan.nextLine(); I always use nextLine() in CodeChef competitions, and both cases work on the sample input. This means my answer was correct but failed due to bad input. This is very disappointing. :( answered 22 Jun '15, 00:27

Can any one help me understand why this code (gist link) is giving NZEC? Edit: A slightly modified solution (gist link) to consider bad input gave me AC. answered 22 Jun '15, 00:12
Could you tell me your approach(in short)?
(22 Jun '15, 00:14)
First, I am checking if string is regular or not. If it is regular, Kth subsequence can be obtained by removing ')' from Kth group of ')' (I mean continuous sequence of ')') from left to right. If there are not enough groups of ')' (i.e K is greater) then Kth subsequence can be obtained by removing '(' from (K  no.of ')' groups)th group of '(' from right to left.
(22 Jun '15, 00:21)
1
My solution failed due to bad input, maybe you have the same problem: http://discuss.codechef.com/questions/71904/ankpareneditorial/72019
(22 Jun '15, 01:06)

@code_master01 & @code_note & @amitpandeykgp.: The editorial is great. Thanks for that. But let me ask you one small thing..as you have said there are 2 key ideas 1. one of chunk 2.other of L2i+1<l2j+1,l2i>L2jfor(i<j) L2i+1<L2jfor all(i,j). How did you arrive at this? Can you please tell me the intuitive idea? What ideas did you have before setting this beautiful question/writing the editorial..hope that will be helpful to beginners like me. answered 22 Jun '15, 07:27

What causes NZEC in python I submitted a code in python , i tested in Codechef IDE which gave appropriate output when i submitted the same for the problem it gave me an NZEC When i just wrote the same algo code in C++ 4.92 it gave me AC Here is the link answered 22 Jun '15, 13:19

Can someone help me understand why this solution http://ideone.com/4PRXeG gave me WA. It works perfectly for the example given above and for any other test cases i could think off using the same logic as described above. answered 22 Jun '15, 13:38

Can anybody please help me with my code that why it's giving TLE . Here's a brief about what i have done. CHECK 1 : if the input string is odd sized then if k>1 print 1 , else print string. CHECK 2 : if the input string is irregular then if k>1 print 1 else print string. CHECK 3 : i have converted the whole string from '(' and ')' to characters 'a' and 'b'. Then using the string concatenation operation and with the help of two temporary strings generated all n1 strings and hashed them into an map. then finally checked if k>map.size() print 1 else find kth position in map and print that string after converting it back. since i have used map . complexity would hardly go nlogn . and for 10^5 and at most 10 test cases . it would go 2 * (10^7) which goes strictly in 1 second. so why tle .? here's my code . answered 22 Jun '15, 14:19
That's because adding strings is not the same thing as adding integers! In line 59, you have a string final with 0 length. Then you reach line 63, where it has length n  1. Notice that string concatenation takes time: O(number_of_characters_added). So, you have are doing n  1 operations inside a loop that runs atleast n times. Thus, you have quadratic complexity and get TLE.
(26 Jun '15, 16:25)

Can anyone tell me for which test case this solution http://www.codechef.com/viewsolution/7267574 is not working .? answered 22 Jun '15, 14:27

can anyone pls tell why this solution is wrong..pls http://www.codechef.com/viewsolution/7264909 answered 22 Jun '15, 18:23

nice editorial! :)
The links to the solutions do not work [ Access Denied ]. :(