PP - Editorial

cook63
editorial
medium-hard
palindrome
pp
rolling-hash
strings
trie

#1

PROBLEM LINK:

Practice
Contest

Author: Jingbo Shang
Tester: Xiaoxu Guo
Editorialist: Pushkar Mishra

DIFFICULTY:

Medium-Hard

PREREQUISITES:

Strings, Tries, Rolling Hash, Palindromes

PROBLEM:

Given N words, S[1..N], output the number of ordered pairs (i,\,j) such that S* concatenated with S[j] gives a palindrome.

EXPLANATION:

Let there be two words x and y. Let us look at the cases that arise when they are concatenated (i.e., x+y) from the point of view of palindromes:

  • Case 1: length(x) < length(y):
    In this case, let us assume that y' is that suffix of y which is of the same length as x. If x+y is to be a palindrome then two conditions must be true - x = y' and y-y' is a palindrome. For example, aba and ccaba will form a palindrome upon concatenation but aba and cdaba won’t.

  • Case 2: length(x) > length(y):
    In this case, let us assume that x' is that prefix of x which is of the same length as y. If x+y is to be a palindrome then two conditions must be true - y = x' and x-x' is a palindrome.

  • Case 3: length(x) = length(y):
    In this case, if x+y is to be a palindrome then x = y.

These three cases lead us to formulate an algorithm. We need a data structure which can match prefixes of strings quickly. A trie is exactly for this purpose. We will take the words one by one, and put them in the trie. Before inserting word i, we will first find the number of words already in the trie that it forms a palindrome with upon being concatenated (word i being the second part in concatenation), and then insert it.

For finding the required numbers, we need to store two variables per node of the trie: uptill and below. The variable uptill stores the number of words that have the exact same letters as the ones from the root of the trie to this node. The variable below stores the number of those words w that have a prefix w' which contains the exact same letters as the ones from the root of the trie to this node and w-w' is a palindrome.

Keeping these operations in mind, we can define our insert(w) and getanswer(w) functions. The insert(w) function inserts the word w into the trie and getanswer(w) calculates the number of words already in the trie with which w would form a palindrome upon being concatenated.

We first define how getanswer(w) works. It first reverses w. Let’s call the new string revw. Then it starts at the root of the trie and goes down as per the letters of revw. When it is at a node which is at depth i from the root, it has already processed the first i letters of revw, i.e., the prefix of length i. Let us call the processed prefix revw'. If revw-revw' is a palindrome, then the word w can form palindrome upon being contenated with the words ending at this particular node. Hence, if revw-revw' is a palindrome, we add the value stored in uptill variable of this node to the answer (this counts all possibilities under case 1). Once we reach the node which is the last letter of revw, we add the values stored in the below and uptill variables of this node to our answer. This counts possibilities under case 2 and case 3.

The insert(w) works in a similar way. For this, we don’t have to reverse the word. We start at the root and go down the edges as per the letters in w. When we are at a node at depth i from the root, we have already processed the prefix w' of length i. At this node, we add 1 to the value stored in below variable if w-w' is a palidrome. When we reach the node which is last letter of w, we add 1 to the uptill variable of the node.

How can we efficiently check whether a prefix of a word is a palidrome or not? We can use ‘rolling hash’ to calculate for a given word w of length N an array mark in \mathcal{O}(N) where mark[i] = 1 if the prefix w[0..i] is a palidrome and 0 otherwise. Thus we can preprocess each string before inserting it or calculating the number of strings it forms palindromes with upon concatenation. Please see the author’s/editorialist’s solution for an example of rolling hash. You can read more about it from these links:




http://www.infoarena.ro/blog/rolling-hash

COMPLEXITY:

\mathcal{O}(N)

SAMPLE SOLUTIONS:

Author
Tester
Editorialist


CodeChef Traditions (Unoffical)
#2

Same as http://poj.org/problem?id=3376


#3

The problem was pretty much overkill at implementation.I coded about 160 lines with 2 hashes,2 vector of partial sums for each hash for seeing if a block is palindrome.I was lucky that maps got in TL,otherwise I had to use hand-made hash tables.(because unordered_map doesn’t work for pairs unfortunately…)


#4

My editorial: https://aleigorithms.wordpress.com/2015/10/18/palindrome-pairs/


#5

Apparently test data contains multiple space separated strings in one line, while problem statement says that all strings are in separate lines. It’s not a problem when C++’ scanf is used as in sample solutions, but there are other methods and languages. Also participants pointed out that something wrong with test data during contest in comments to no avail.

Not cool.


#6

For length(X)=length(Y) shouldn’t it be x=rev(y)?


#7

Can anybody clarify my doubt here :
On the test case :
1
2
a
ab

answer should be 1 .i.e. ab|a

but as it is stated in get_answer() , first reverse the string and then go down the trie according to reversed string , but here when i reverse ab .i.e. it becomes ba i just cant go down even 1 level and hence the answer comes out to be zero .

How this thing is to be done , might be i am missing something !!


#8

Let’s say test case is: 1 2 a a. What is correct result?

I think that’s 2 - a(1)|a(2) and a(2)|a(1). Is it right?


#9

I didn’t need tries at all, STL map<>s with hashes are sufficient. If I just take all the strings into an array S_1 and all reversed strings into an array S_2, then I just need to find all pairs of strings S_1* and S_2[j] that have identical prefixes of length L=max(len(S_1*),len(S_2[j])) and the remaining part of each string is a palindrome (it will be empty for at least one string), then all I need to do is subtract the cases i=j. After finding all hashes and reverse hashes of any string, I can find all possible L for such a string. Then, we have two cases: len(S_1*) \ge/< len(S_2[j]). For \ge, all that’s necessary is counting pairs (hash_1*[L],L) = (hash_2[j][len(S[j])],len(S[j])) in a map<>. For <, it’s similar.


#10

@neverauskas it is 1 as you have to select pairs so only one pair is there .


#11

Such a shame, I thought of the same solution, could not implement it in time :frowning:


#12

Wow that’s really a sad mistake by the author considering that he has solved more than 500 problems on POJ and after seeing your comment I googled “concatenate two strings palindrome POJ” and the first link that came up is the link that you have mentioned.


#13

A pair<int,int> can be represented as a long long. If you have more than ints, use a suitable large modulo.