SNC - Editorial


AUTHOR: luvk1412


Rabin Karp

You are given a string S and and from that you need to delete every earliest occurrence of any of the given N string B_1, B_2 .. B_N and then print the final string.

This can be done creating the new string character by character and removing the last part of the string if it matches with any of the B_i

Let the answer string be R. Now we scan the string S character by character and keep on pushing it to R, Now if we find any of the suffix of R matching to a string B_i. Note that this removal can be done in O(1) by keeping track of the index of the end of the string and just shifting the end backwards by the length of the suffix which we want to delete.

Now the question is how do we find if any suffix matches on of the B_i. We obviously cannot do it for all the suffixes as it will time out.

We are given that sum of all the |B_i| is less than 10^5. Now due to this constraint only 447 distinct lengths of string are possible for eg: if we take B_1 of length 1, B_2 of length 2, B_3 of length 3 and so on you will notice that sum of lengths of 447 string will exceed 10^5. So you want be able to keep more sitinct lengths and this just for all the smallest lengths possible if increase the lengths distinct count of lengths will decrease only.

So now instead of checking for each suffix if we can somehow check each of the distinct lengths has any matching B_i then the task is done. What we can do is apply Rabin Karp here. We will store hashes of all the B_i separately for each distinct length. Now while constructing the R we will keep a track of hashes for all the distinct lengths of B_i in R. Then after adding each character to R we will just see for all the distinct lengths whether that hash exist int the hashes constructed for B array. The overall complexity will be O(N* 447)

Author’s Solution: Solution