Kudos to problem setter for this nice problem, It teaches me use of suffix array, hashing + binary search + suffix automaton, Quite a lot of stuff =D
I used suffix arrays to solve this question and my solution sustained all retests successfully. The algorithm is:
Construct the suffix array sorted lexicographically.
Construct the lcp array from the suffix array’s consecutive entries.
Mark the entries in lcp array whose suffixes start from different strings.
Find the max length of possible lcs (longest common substring) from the lcp array (equal to the max value of marked entries in lcp array).
Traverse over lcp array again and for all marked entries equal to max value, construct the lcs from the suffix array entries. Insert this lcs into a set. This is a set of possible lcs.
Now traverse over the second string, generate all substrings of length equal to max possible lcs length ( This can be done in O(n) ), search them in the set of possible lcs and break as soon as you find a match. The first match is your answer. This is the lcs which occurs first in the second string.
Link to the solution: CodeChef: Practical coding for everyone
If number of possible lcs are ‘k’, then the complexity of my solution is O(nlgk).
Basically any possible mistakes are avoided because all possible lcs are generated and the one which occurs first in second string is printed always.
Learnt a lot from this question, thanks to @kuruma
Hello @all,
It’s a joy that I see so many talented programmers enjoying all my problems Especially people who are definitely more skilled than I am!!
I have also started to delve a bit into Suffix Arrays with this problem and I hope that with time and practice I can start applying these concepts on other contests where I will be just another contestant
Thanks to everyone for all your enthusiasm and kind words, it means a lot to me,
Bruno Oliveira
@author
I appreciate other approaches mentioned in this editorial but can you provide some tricky cases where the usual Suffix Array solution would fail. I have been continuously getting WA with that approach.
Thanks.
I did something a bit different. What I did was create the generalized suffix tree of s1 and s2, and look for a node with greatest depth that had a leaf from s1 and a leaf from s2 (breaking ties minimizing the index of the node, which is when it was first added). This runs in linear time overall (linear time for creating the suffix tree, then linear time for a DFS). I believe this is the “standard” “textbook” way of doing it with suffix trees, since I remember talking about this in my computational biology class.
Can you give a little details of
int q = st [ p ] . next [ c ] ;
if ( st [ p ] . len + 1 == st [ q ] . len )
st [ cur ] . link = q ;
can any one provide some tricky test case
I used the suffix array, lcp approach. And then searched for the max lcp between consecutive elements in suffix array such that they come from different strings.
I got WA if anyone can help me out where I went wrong, I will be grateful.
Here is my effort
http://www.codechef.com/viewplaintext/3611533
I tried it using Rabin-Karp hash but am not able to get my solution accepted.
So can anybody explain tester’s solution?
What are vis[]
, head[]
, nextcell[]
… etc. used for ??
What do mark
and cnt
denote ??
How do insert()
and check()
work ??
It is very difficult to decipher such a code without a single comment.
How can I get all the test cases for this problem? In general does codechef publish test cases for the challenge problems?
I got AC after considering these two testcases
abcadlfghijkl
abcdlfabcde
Answer is 3 abc and not 3 dlf
similarly ,
abcmdefghijkl
abcdefabcdl
Answer is 3 abc and not 3 def
sa_extend in suffix automaton is diificult to understand.
Please provide a c++ code link for binary search and hashing solution …
I go through this sa_extend() function but i am getting difficulties to understand it.
can anyone please elaborate it in more specific way.
Thanks
can any one please explain me what is happening in sa_extend?
It would be a great help.
any one please explain extendSA() function
Is Suffix Automaton explained in any book/resource in more formal way rather than code ?
I am confused because we don’t have fixed string to create an automaton.
Another solution using Suffix Array can be found here : http://www.roman10.net/2012/03/16/suffix-array-part-3-longest-common-substring-lcs/
How to find LCS given Suffix Automaton? If the first character of the pattern doesn’t exist in the text, how would the state change?
Btw, my solution with Suffix Arrays gets TLE CodeChef: Practical coding for everyone.