×

SECPASS - Editorial

Author: Shivam Gor
Tester: Michael Nematollahi
Editorialist: Hussain Kara Fallah

Easy

NONE

PROBLEM:

Given a string $S$ with length up to $10^5$. You have to find a prefix of $S$ that occurs the maximum number of times as a substring in $S$ (among all prefixes). If there's a tie, pick the longest prefix.

EXPLANATION:

Clearly, the prefix consisting of the first letter alone occurs the maximum number of times among all prefixes. Because whatever is the answer prefix, the first letter is always a prefix of it. However, it's not necessarily the longest one.

To find the longest one, let's maintain a list of all appearances of our string's first letter. Let's say the prefix consisting of the first 2 letters has maximum occurrences as well, what condition should hold?

Answer: Every appearance of the string's first letter must be followed immediately by the string's second letter. If one of these appearances was followed by a different letter, then clearly the prefix consisting of 2 letters occurs fewer times. So what are the condition for lengths > 2? (think about it).

If our answer's length is $K$ then all appearances of the first letter must be followed by exactly the same $K-1$ letters. So after we maintain the list of appearances indices, we keep moving them forward simultaneously one letter at each step (scanning substrings which present a prefix) until some mismatch happens or the last index reaches the end of the string.

This solution runs in $O(|S|)$

We can prove that according to the fact, that our segments built by moving our pointers cannot ever intersect. If 2 of them intersected, it means that at least one of our pointers points to some index with corresponding letter equal to the string's first letter. BUT when we started a substring from the last appearance of our string's first letter, there's no other appearance in front of it, so it's guaranteed that there will be a mismatch (or it reached the end of the string).

AUTHOR'S AND TESTER'S SOLUTIONS:

TESTER's solution

This question is marked "community wiki".

1181234
accept rate: 0%

19.8k350498541

1
(18 Feb, 03:16)

 3 I guess, Z-function was quite good for solving this problem https://www.codechef.com/viewsolution/23124045 answered 18 Feb, 04:06 34●1 accept rate: 0%
 0 Am I the only one who observed the problem poorly and overkilled it by using binary search and string hashing ? Here is the solution link : https://www.codechef.com/viewsolution/23134615 answered 18 Feb, 02:52 4★cormen_1 1●1 accept rate: 0% I did the same. (18 Feb, 03:12) sharepo5★ Sad, I got TLE for Hashing+Binary Search :( (18 Feb, 16:20)
 0 I solved it using binary search over length of prefix with KMP algorithm for pattern searching... O(|s|*log(|s|)). Solution Link answered 18 Feb, 07:44 1.6k●2●11 accept rate: 24% shouldn't it be O(|s| * log(|s|)) ? (18 Feb, 10:05) Yes it should be... Typo... Thanks.. (18 Feb, 10:35) O(|S|*log(|S|)) * T must be tight for the time limit. (18 Feb, 11:25) shoom6★ True @shoom... (18 Feb, 13:29)
 0 I used KMP search algorithm and binary search and hence the overall complexity was somewhat O((N+N+M)logN) or rather O((N+M)logN), multiplied the number of test cases. Can anyone say why is got itself TLEd? Or do I have no clue of the complexity? https://www.codechef.com/viewsolution/23133640 answered 18 Feb, 07:54 1 accept rate: 0% Maybe... 1) unordered_map 2) use s.substr instead of pushing every element... (18 Feb, 08:12) We have exactly same solution XD (18 Feb, 08:16) I replaced my function of KMP algorithm for pattern searching with yours and it got accepted! https://www.codechef.com/viewsolution/23141256 Boy I want to get my hands dirty with your resources XD. (18 Feb, 08:41) XDD how !! I thought we both copied it from gfg... (18 Feb, 08:49) I have size of LPS=M and you have NN (18 Feb, 08:52)
 0 can someone explain me what is "Z-function" ? And how it's related to this problem answered 18 Feb, 09:15 3★nipul1 1●1 accept rate: 0% Try to search it on google (18 Feb, 10:36)
 0 "We can prove that according to the fact, that our segments built by moving our pointers cannot ever intersect. If 2 of them intersected, it means that at least one of our pointers points to some index with corresponding letter equal to the string's first letter" Can someone please explain this part a bit more? (with an example maybe) answered 18 Feb, 11:48 2 accept rate: 0% Basically there is no point to consider prefixes where the first letter appears more than once. Such prefix occurs fewer times that a single-letter prefix, so it can't be a candidate for the answer. For example the prefix 'abca' appears fewer times than the prefix 'a'. (19 Feb, 10:06) shoom6★
 0 Can someone please help me find fault in my solution, I think I did the same thing as the editorial. Still got a tle. https://www.codechef.com/viewsolution/23132990 I am returning to CP after a while so might be because my coding hands are rusty. answered 18 Feb, 11:52 163●1●6 accept rate: 10% I think, I had to just break the loop in case of a mismatch, rather just removing those occurances (18 Feb, 12:01)
 0 Test cases are very week: O(n^2): submission, O(n): submission Worst case for O(n^2) solution is when all chars are same. :P answered 18 Feb, 16:52 115●4 accept rate: 14%
 0 I have studied 'Z'-Algorithm from various sources . But to best of my knowledge Z function is O(m+n) algorithm so how can we solve the problem in less than O(n^2) time . Can someone elaborate Correct logic using Z Algorithm answered 18 Feb, 17:09 3★nipul1 1●1 accept rate: 0% In Z algo you create a Z array where each index stores the largest substring starting at i and is also a prefix Now cosider the string "abcabcabc" Here the z array looks like this : 0 0 0 6 0 0 3 0 0 In the above array 6 denotes the substring "abcabc" which is also the prefix and 3 denotes the substring "abc" which is also the prefix Now if you look carefully, in the substring you "abcabc" you are getting one time occurence of the prefixes : "a", "ab", "abc","abca","abcab","abcabc". Similarly in the case of "abc". (20 Feb, 16:45) So you can keep an array where each index i denotes the occurence of the prefix of length i. So for each non-zero element(say z[i]) in the z-array you update all the elements between the range 1 to z[i] of the frequency array index by adding 1 to them. This will give you an array containing the frequency of all the prefixes. Then you check for the element with largest frequency and for a tie, check for the largest element (20 Feb, 16:46) This is my solution in pypy3 : https://www.codechef.com/viewsolution/23144894 (20 Feb, 16:49)
 0 By using KMP, you can do it in O(|S|) by counting the number of times each prefix occurs in the string. Here is the link to my code - https://www.codechef.com/viewsolution/23145983 Here is the link to the explanation - https://cp-algorithms.com/string/prefix-function.html. answered 18 Feb, 18:09 4★mindjolt 22●3 accept rate: 0%
 0 if first two characters of a string are same, will the answer be first character always??? answered 18 Feb, 19:35 1 accept rate: 0% Seems true wow. (18 Feb, 20:57)
 0 Can someone provide c++ code for the logic in the editorial? answered 18 Feb, 20:28 2★mohitmm 5●2 accept rate: 0% Here it is.. https://www.codechef.com/viewsolution/23190005 (23 Feb, 18:29)
 0 Help needed I have solved this problem using LSP like in KMP matching algorithm but not able to pass all the test case. Below is the link of my code. Please help me. https://ideone.com/v5qMfM answered 19 Feb, 00:35 1 accept rate: 0%
 0 We can prove that according to the fact, that our segments built by moving our pointers cannot ever intersect. If 2 of them intersecting, it means that at least one of our pointers point to some index with corresponding letter equal to the string's first letter. BUT when we started a substring from the last appearance of our string's first letter, there's no other appearance in front of it, so it's guaranteed that there will be a mismatch (or it reached the end of the string). I didn't understand this. Can somebody please explain. answered 19 Feb, 02:13 0●1 accept rate: 0%
 0 I understand the solution now, well also, what is the use of k here? and what is exactly k? isnt ans is k? answered 19 Feb, 03:15 1●1 accept rate: 0% let our string be abcdeabcf. On first scan, our k is a as our answer is currently a. Then on 2nd scan, k becomes 2 as answer becomes 2 and later on k becomes 3 when answer is 3. So K is length of our answer at any instance. (23 Feb, 18:28)
 0 This was a good problem for Z algorithm and KMP answered 04 Mar, 19:06 3★lmao_ 56●2 accept rate: 14%
 toggle preview community wiki:
Preview

By Email:

Markdown Basics

• *italic* or _italic_
• **bold** or __bold__
• image?![alt text](/path/img.jpg "title")
• numbered list: 1. Foo 2. Bar
• to add a line break simply add two spaces to where you would like the new line to be.
• basic HTML tags are also supported
• mathemetical formulas in Latex between \$ symbol

Question tags:

×15,852
×3,820
×43
×14