Given a string in “compressed notation”(see problem), find the number of substrings that are palindrome.
Explanation:
The problem was a straightforward application of Manacher’s Algorithm. This Algo was originally proposed to find the longest palindrome in a string. However, for each “center” c, it finds the length of longest palindrome “centered” at c. Therefore, we can use it to find the number of palindromes “centered” at c as well. Handling the fact that strings are given in compressed notation is relatively straightforward.
Many elegant descriptions of Manacher’s Algorithm can be found on internet.
In very short, Manacher’s Algorithm can be written as follows.
1 | int p[N+1], mx = 0, id = 0;
2 | // length of longest palindrome centred at i is 2 * p[i]-1.
3 | for (i = 1; i <= N; i++) {
4 | p[i] = mx > i ? min(p[2 * id-i], mx-i) : 1;
5 | while (s[i + p[i]] == s[i - p[i]]) p[i]++;
6 | if (i + p[i] > mx) {
7 | mx = i + p[i];
8 | id = i;
9 | }
10| }
Let the compressed string be (c1, k1), (c2, k2), … (cN, kN).
Assuming reader understands Manacher’s Algorithm, here is how to modify it for this problem:
Palindromes that contain only a single character repeated several times can be counted as:
Σ ki * (ki +1) / 2
Palindromes that span over more than 1 contiguous segment of compressed string can be computed by also maintaining an array q, which stores the length of the longest “decompressed” palindrome centred at ith segment.
We need not put interleaving '#'es because center of every palindromic substring is the center of some “compressed” segment(unless it fully lies inside a segment).
In line no 5, compare the characters as well as the lengths of the segments.
If the segments i-p[i] and i+p[i] do not have same lengths, but have the same character, then q[i] would need to adjusted by adding 2*min(ki-p[i], ki+p[i]).
I used hashing + binary search in order to find the longest palindrome centered at each of the K groups of identical characters. This gave me an O(K*log(K)) time complexity instead of O(K) (as is the case with Manacher’s algorithm), but I personally prefer to use hashing whenever possible, because it’s a much more general technique (applicable to a wider range of problems) and it’s very easy to implement.
I wrote a very naive solution, which as I now see should most probably get TLE. But for some reason it gets WA. It works fine on any tests I can think of. Can anybody help find the reason for this WA? http://www.codechef.com/viewsolution/2397315
I would like to clarify about somethings. I know that we have to calculate the number of palindromic substrings centered at each letter. Here is what I did but it still gives me wrong answer.
Assume A[i] is a pair or char and long long after adding (#, 0) between the letters.
1- For each A[i] init P[i] = 0. (This means am not using the symmetry property)
2- init V[i] = A[i].Y * (A[i].Y + 1) / 2;
3- while I can still expand this palindrome I wrote this code:
while(A[i - P[i] - 1].X == A[i + P[i] + 1].X) {
4- When manacher’s algorithm is over I add all the values of V[i] to get the final result. What is wrong with this way? Why is it giving me a wrong answer. If you’d like to have a look at the full code here it is: CodeChef: Practical coding for everyone
I tried my code with thousands of different inputs and it always gave the exact same answer as successful submissions, but I get WA. I wonder if someone can point me to any test case where my code fails. CodeChef: Practical coding for everyone
As per my understanding, we can solve this problem by writing a c program trying to find longest palindrome string centered at a character. Let the center character is ‘x’, then we will move in both direction of x and check if this sub string or not. IF we find that it is not a palindrome string then we can stop further scan and jump to next adjacent character in the given string. I donno how to write correct program for this … but this algorithm looks good to me.
Two ‘characters’ of compressed string are equal if the corresponding (char, int) pairs are equal. No need to explicitly decompress the string, just do calculations smartly.
In your solution, you calculate powers of 1e9+7 upto 200000 in an unsigned long long and also do the hash calculations in ULL. How is it that the ‘overflowed’ values don’t give an error and the algorithm still works ?
Oh. Thanks! Also, do you know why we need to a multiply the RHS value by a certain power(specifically 4*mid-2) while calculating the difference in the forward and reverse hash ?
Got my error. my hash function wasn’t good enough. using @mugurelionut’s hash func, got AC @sanchit_h Haven’t seen that part of his code thoroughly. But I can guess that it is due to his nature of hash function. See my code if it makes any better for you.
@sanchit_h: My hash function considers that we have a sequence of 2N values: character1, count1, character2, count2, …, characterN, countN. That’s why I needed powers of P up to 2N basically (and not only up to N). Then, when computing the direct hash value for a substring (in my solution from i-mid+1 to i+mid-1) we need the “prefix” hash up to i+mid-1 from which we need to subtract the contribution of the “prefix” hash up to i-mid. But this contribution appears multiplied by P^(2 * length), where length=2*mid-1 (thus, 2 * length = 4 * mid - 2). The hash for the reverse string is similar.