PROBLEM LINK:
Author: Ke Bi
Tester: Kevin Atienza
Translators: Vasya Antoniuk (Russian), Team VNOI (Vietnamese) and Hu Zecong (Mandarin)
Editorialist: Kevin Atienza
Contest Admin: Praveen Dhinwa
DIFFICULTY:
Medium-Hard
PREREQUISITES:
Suffix array, longest common prefix, string hashing, range minimum query, segment tree
PROBLEM:
A tandem is a nonempty string repeated thrice, i.e., a string of the form AAA for a nonempty string A. A tandem is interesting if the character following each instance of A are not all the same, otherwise it is boring. Given a string s of length N, how many interesting and boring tandems does s have?
QUICK EXPLANATION:
We enumerate interesting and boring tandems per length of A. Let’s say |A| = L. Thus, the length of the tandem is 3L. Fix this L.
Consider the positions 0, L, 2L, 3L, \ldots. A tandem of length 3L will cover exactly three such positions. So we count boring and interesting tandems per three positions i, j, k with (i,j,k) = (i,i+L,i+2L).
- Let LCP(i,j,k) be the length of the longest common prefix of s[i\ldots N-1], s[j\ldots N-1] and s[k\ldots N-1].
- Let LCS(i,j,k) be the length of the longest common suffix of s[0\ldots i], s[0\ldots j] and s[0\ldots k].
Let V = \min(LCP(i,j,k),L) + \min(LCS(i,j,k),L) - 1. Then:
- If V < L, then there are no tandems.
- If V \ge L and LCP(i,j,k) > L, then there are V-L+1 boring tandems and 0 interesting tandems.
- If V \ge L and LCP(i,j,k) \le L, then there are V-L boring tandems and 1 interesting tandem.
The answers are the sums of all these contributions across all $L$s and across all $(i,j,k)$s.
EXPLANATION:
We will use the notation s_{i, j} to denote the substring from s_i to s_j. Thus, |s_{i,j}| = j-i+1.
Faster than O(N^3)
The obvious O(N^3) brute-force algorithm here doesn’t pass, because N is very large. (And actually, even O(N^2) shouldn’t pass.) So let’s try to improve on O(N^3). Here we’ll try to show a solution that will lead to the intended one.
Let’s enumerate the tandems according to the length of A. Suppose |A| = L, so the tandem AAA has length 3L. For now, let’s fix this L and then count all boring and interesting tandems of length 3L. Ideally our solution will be fast enough so we can do this for each L \in [1,N/3].
Consider the characters s_0, s_L, s_{2L}, s_{3L}, \ldots. There are approximately N/L such positions. But since these are spaced L characters apart from each other, it follows that every tandem of length 3L covers exactly three of these. Therefore, let’s fix three adjacent characters, say s_i, s_j, s_k where (i,j) and (j,k) are spaced L characters away from each other, then count all boring and interesting tandems covering these three.
The tandem must start at some position in [i-L+1,i]. Otherwise, if it starts at some position > i it will not be able to cover s_i, and if it starts at some position \le i-L, it will not be able to cover s_k. But it can’t just start at any such position! If we want the string s_{a, a+3L-1} to be a tandem, the following must hold (by definition):
But due to the way we selected the position a, we find that a \le i \le a+L-1, a+L \le j \le a+2L-1 and a+2L \le k \le a+3L-1. It means that we can decompose the equality above into two parts:
We can restate these two equalities in another way. The first one is the same as:
The strings s_{0, i}, s_{0, j} and s_{0, k} must have a common suffix of length i-a+1.
The second one is the same as:
The strings s_{i, N-1}, s_{j, N-1} and s_{k, N-1} must have a common prefix of length a+L-i.
This now gives us a way to compute the number of tandems covering s_i, s_j and s_k. Let’s first define two things:
- LCP(i,j,k) is the length of the longest common prefix of s_{i, N-1}, s_{j, N-1} and s_{k, N-1}.
- LCS(i,j,k) is the length of the longest common suffix of s_{0, i}, s_{0, j} and s_{0, k},. (Not to be confused with the longest common subsequence.)
There is a tandem starting at a if i-a+1 \le LCS(i,j,k), a+L-i \le LCP(i,j,k), and a\in [i-L+1,i]. Rearranging these inequalities gives:
which means the number of valid $a$s is
or 0 if this number is negative. We can actually simplify that number to the following:
But which among these are boring, and which are interesting? Notice that if s_{a,a+3L-1} and s_{a+1,a+3L} are both tandems, then s_{a,a+3L-1} is automatically boring, because the following characters are the same. It means that only the last tandem, s_{i,i+3L-1} can be interesting. To check if it’s interesting, simply check if s_{i+L} = s_{j+L} = s_{k+L}. If this is true, then the string is boring, otherwise it is interesting!
Computing \min(LCP(i,j,k),L) and \min(LCS(i,j,k),L) naïvely takes O(L) time each. Since we will do this approximately N/L times for each length L, the running time is therefore O(N/L\cdot L) = O(N) per L. So if we do this for each L\in [1,N/3], the running time is O(N^2). This is much better than O(N^3)!
Faster solution
To improve that solution, we need a faster way to compute LCP(i,j,k) and LCS(i,j,k). We’ll describe various solutions here.
Solution 1: Hashing + Binary Search
If we can compare any two substrings for equality in O(1) time, then we can improve computing LCP and LCS quickly using binary search. One way to do substring comparisons quickly is with hashing!
Specifically, we will use polynomial hashing. We fix a base B and a modulus M. Let’s now define the hash function H(S). For the string s = s_0s_1s_2\ldots s_{N-1}, its hash will be the following value:
where v(c) is another function that maps a character to a number.
We will use this hash to compare substrings in O(1):
- Preprocess the string so that H(s) can be computed in O(1) time for any substring s.
- To compare two substrings s and t, simply compare their hashes, H(s) and H(t).
If two strings are equal, then their hashes must be the same. Unfortunately, if they are not equal, the hashes might still be the same. For this reason, this hashing algorithm fails sometimes. But if you choose B, M and v(c) well enough, then you’ll probably still get accepted
So how do we preprocess the string? The key advantage of polynomial hashing is the following properties: (assuming we have precomputed powers of B modulo M)
- Concatenation. Given H(s) and H(t), H(st) can be computed in O(1) time as H(st) = (H(s) + H(t)B^{|s|})\bmod M.
- Splitting. Given H(st) and H(t), H(s) can be computed in O(1) time as H(s) = (H(st) - H(t)B^{|s|})\bmod M.
These properties are what we can use to preprocess. We simply compute the hashes of all suffixes of the string. This can be done in O(N) by using the concatenation property above. Then, to compute the hash of a substring s, find the appropriate suffixes st and t, then compute H(s) from H(st) and H(t) using the splitting property!
So what’s the running time of this algorithm? Notice that LCP and LCS are computed with binary search, which takes O(\log N) time each, so doing that N/L times gives O(N/L \log N). Summing from 1 to L gives
owing to the fact that the Harmonic series grows as O(\log N).
Solution 2: Suffix array + segment tree
Computing longest common prefixes is a standard step when you already have the suffix array. After constructing the suffix array, constructing the LCP array takes just O(N). Afterwards, to find the longest common prefix of two suffixes, s_{i,N-1} and s_{j,N-1}:
- Find the positions of i and j in the suffix array. Let’s say they are i' and j'. Without loss of generality, assume i' \le j'. (We can swap i and j otherwise.)
- The LCP is now \min(LCP[i'+1],LCP[i'+2],\ldots,LCP[j']).
Modifying this for three suffixes is straightforward. Thus, we have reduced LCP to a range minimum query, which is pretty standard Similarly, you can construct the "LCS array" by constructing a suffix array for the reverse of s, so LCS can be reduced to a range minimum query as well.
How do we compute range minimum queries? The most standard way is to use segment trees. This way, each range minimum query takes O(\log N) time to compute. Thus, the running time is again O(N \log^2 N) like above. (Note that the suffix array can be constructed in O(N \log^2 N) time.) But if you implement a segment tree normally, you might get TLE, even though the time complexity is correct! If this happens to you, here’s an improvement: Notice that we actually only need \min(LCP(i,j,k),L). Thus, we can add a third argument to our range_min
query call, which denotes the current upper bound. range_min
now looks like this:
def range_min(i, j, L):
if this.min >= L: // special pruning.
return L // if this.min >= L, it's impossible to improve upon 'L'.
if i <= this.i and this.j <= j:
return this.min
if this.j < i or j < this.i:
return L
return left.range_min(i, j, right.range_min(i, j, L))
Notice that the partial results are getting passed everywhere, so that the pruning done by this.min >= L
is maximized. With this, I was able to get the segment tree solution accepted!
Solution 3: Suffix array + sparse table
Another way to compute range minimum queries is by computing a table of minimums. Let M*[k] be the range minimum in the range [i,i+2^k-1]. We can build the table M*[k] using the following property:
Thus, this construction takes O(N \log N) time. Afterwards, each range query runs in O(1) time, because the minimum in the range [i,j] is \min(M*[k], M[j-2^k+1][k]), where 2^k is the largest power of two \le j-i+1. (Now, computing k takes O(\log N), but you can just precompute it in O(N) time for each length j-i+1.)
Notice that we moved the \log N factor from query time to construction time; whereas segment trees take O(N) time to construct and has O(\log N) query time, this solution takes O(N \log N) time to construct and has O(1) query time. But this change is significant, because the number of queries that we have is actually:
which means that the overall running time (after constructing the suffix array) is
which is an improvement over O(N \log^2 N)! (Note that there are O(N\log N), and even O(N) algorithms to construct the suffix array.)
Time Complexity:
O(N \log^2 N)