PROBLEM LINK:Setter Arjun Arul DIFFICULTY:MEDIUM PREREQUISITES:KMP Algorithm, Looping, Logical thinking. (Similar pattern finding algorithm may also work. The editorial is based on KMP Algorithm for pattern finding.) PROBLEM:Given a pattern and text, you need to find how many times the pattern, or a power of the pattern exists in the text. QUICK EXPLANATION:We notice that "power" of the sequence makes it unfeasible to do KMP Search on original array. So we modify it to store the character/number and how many times it occurs consecutively. We can now do KMP search in this array. We first search for where the pattern exists, irrespective of whether its a "power" of it or not. Then we do KMP search for, if some "power" (of any sequence) exists, irrespective of what sequence it is, or if its the given sequence or not. Then, we find where both of these (i.e. pattern and power) intersect, i.e. things/indices/positions which are common in both. The number of common instances gives us the answer. EXPLANATION:This is a very specific question in my opinion. Its quite easy actually, if you know what KMP Algorithm, or any other pattern finding algorithm works. If you dont, then it might be a bit hard for you. Nonetheless, I will try to make it as easy as possible. Just make sure to head over and learn KMP Algorithm first!! The link is in prerequisite. The editorial is divided into 3 parts discussing 3 steps each helping to find the answer, and one final conclusion on how to finally calculate answers. 1. Making the array/text KMPSearchable Reading the question, and the KMP algorithm, one thing is clear We cannot directly apply KMP Search in original array. Usual KMP Search works for power $1$ only. The Problem When power (say $k$) is more than $1$, same consecutive characters occur $k$ times. It will hinder our construction of lps array, and will give a big difficulty in comparing text and pattern to determine if pattern or its power exists or not. Solution Its actually simple. We "reduce" the array to power 1. What we do, is, that for ever character $ch$, we store the character and how many times it occurs consecutively after that. Eg
& etc. We can easily do so by creating 2 arrays, or our own custom data type. Now we can do KMP Search on this array using normal KMP Algorithm!! Note that we must similarly modify the pattern as well!! 2. KMP Search of Pattern, irrespective of power This step is nothing but usual KMP Search. We will check at what places the pattern exists, IRRESPECTIVE of the fact if its a valid power or not. Eg
So not much for this section. Its the usual KMP Search, we will just search for pattern, and store the starting index of the pattern. 3. KMP Search for Powers Now we bring in powers. This section needs to address some questions. First, how do we find power? We already stored how many times a character occurs in text, along with the character, so this will not be problem. The problem will be, finding what power is it. Is it to the power 1? to the power 50? How do we accurately check that? This is actually more complex than one thinks and has an edge case in it. Hence, its divided into 2 parts. a. Sequence has more than 4 elements We have one problem here. Say our sequence was In above example, our modified array will look like {(1 3},{2 2},{3 2},{4 5}). If we start from second $1$, and end at second $4$, we will find the pattern. What I want to say is, we cannot rely on starting and ending element to determine the power. All characters of starting and ending element may not be included in pattern! Eg Here 4 occurs 5 times but only two fours are included in pattern. However, note that middle elements are always included in pattern wholly. This will help us finding the power. What we will now do, is that, we will compare ratio of occurrence of second and third element in text to second and third element in sequence. As an example, in above example, we will check ratio of element $2$ and element $3$ in sequence and in text. If its equal, then it represents a valid power. However, checking ratio will involve floating point and decimals. So lets tinker the expression a little as Let Text(i)= Current Frequency element i occurs in text, and Pat(i)=Current Frequency element i occurs in pattern. The proposed formula is $Text(i+1)/Text(i)==Pattern(i+1)/Pattern(i)$ A simple cross multiplication and we get $\implies Text(i+1)*Pattern(i)==Pattern(i+1)*Text(i)$. Thus, we get rid of any decimals. In a gist, to see if the sequence in text is actually a power of given sequence, we check that ratio of $Text(i+1)/Text(i)==Pattern(i+1)/Pattern(i)$ for all $i$ such that $i$ is not the starting and ending element of the pattern. b. When sequence has only 3 elements This is a bit tricky. We dont have 2 middle elements to find if the sequence in text is a valid power or not. So, now what? Turns out, we can easily deal with this case as well. All we need to make sure that we have "sufficient" starting and ending elements, which can combine with frequency of middle element to give a power. Eg if original sequence is "1 2 2 3" and original text is "1 1 1 2 2 2 2 3 3" , we see that middle element 2 occurs 4 times in text, and 2 times in sequence. This hints a power of 2. Now we see if we have sufficient 1s and 3s , as such to satisfy power 2 of the sequence. In this case, we clearly have that. So if we have only 3 elements in the sequence, the test in part a. wont hold good, we just check for pattern and use ifelse for checking if we have sufficient starting and ending elements or not. You can look at eidtorialist's solution for details. 4. Concluding the final answer Now, we have positions where we see a valid pattern exists. Then, we also have positions where a valid power exists. We need to find number of positions where a valid pattern and a valid power exists which is nothing but number of common elements in then. So, we take the elements common in then, do checks like
And thats how we are done with another question :) SOLUTION:$TimeComplexity=O(N)$ CHEF VIJJU'S CORNER:
This question is marked "community wiki".
asked 13 Feb '18, 14:42
