PROBLEM LINK:
DIFFICULTY:
Medium - Hard
PREREQUISITES:
DP, Finite Automata/KMP, Bitmasking
PROBLEM:
Given positive integers N, K and M, count the binary strings S of length N for which more than M indices i with 1 ≤ i < N satisfy LCP(S[1, i], S[i + 1, N]) ≥ K.
QUICK EXPLANATION :-
The problem is equivalent to counting the strings in which S[1,K] occurs more than M times in S[K+1,N]. We fix the first K bits with a K-bit mask and solve the problem separately for each mask. Each subproblem can be solved with a simple DP whose state holds the length built so far, the number of occurrences so far, and the length of the current partial match. Each step appends either 0 or 1. The complexity is O(2^K * K * N * M).
DETAILED SOLUTION:-
For any string, let us first find the number of indices i such that LCP(S[1, i], S[i + 1, N]) ≥ K and 1 ≤ i < N. If LCP(S[1, i], S[i + 1, N]) ≥ K, then S[i+1,i+K] = S[1,K]. Both pieces must also have length at least K, so K ≤ i ≤ N-K. Hence, for a fixed K, the number of such indices equals the number of occurrences of S[1,K] in S[K+1,N]. Now, for given N, M, K, we need to count the binary strings in which this number of occurrences is >M.
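The equivalence above can be checked on small inputs with a brute-force sketch. The function names `lcp`, `countByDefinition` and `countByOccurrences` are made up for this illustration and are not part of the original solution. Strings are 0-indexed in the code, while the text uses 1-indexed positions.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Length of the longest common prefix of a and b.
int lcp(const std::string& a, const std::string& b) {
    int n = static_cast<int>(std::min(a.size(), b.size()));
    int k = 0;
    while (k < n && a[k] == b[k]) ++k;
    return k;
}

// Counts indices i (1 <= i < N) with LCP(S[1,i], S[i+1,N]) >= K,
// directly from the problem statement.
int countByDefinition(const std::string& s, int K) {
    assert(K >= 1);
    int N = static_cast<int>(s.size());
    int cnt = 0;
    for (int i = 1; i < N; ++i) {
        if (lcp(s.substr(0, i), s.substr(i)) >= K) ++cnt;
    }
    return cnt;
}

// Counts occurrences of S[1,K] that lie entirely inside S[K+1,N].
// 'start' is the 0-based position where the occurrence begins.
int countByOccurrences(const std::string& s, int K) {
    assert(K >= 1);
    int N = static_cast<int>(s.size());
    int cnt = 0;
    for (int start = K; start + K <= N; ++start) {
        if (s.compare(start, K, s, 0, K) == 0) ++cnt;
    }
    return cnt;
}
```

For example, with S = "0101010" and K = 2, both functions count the two indices i = 2 and i = 4.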
Since K<=10, we can solve the problem for each possible S[1,K] (2^K different strings).
Let us denote S[1,K] by a bitmask ‘mask’, solve the subproblem separately for each mask and sum up the results. For a fixed mask, we use DP to count the strings of length N-K in which the mask occurs >M times.
Let dp[len][matchCount][prefixMatch] denote the number of strings of length len in which the mask occurs exactly matchCount times and whose longest suffix that matches a prefix of the mask has length prefixMatch.
For the prefixMatch part, we can use a Finite Automaton. I guess it can also be done by modifying KMP.
A finite automaton is more suitable and easier to visualize for this problem. You can read more about pattern matching with finite automata online. For the current prefix of the string, it keeps track of the longest suffix that matches some prefix of the pattern. We create a transition table state[a][b] for each mask, in which a stands for the current state and b stands for the next character, and state[a][b] is the state the automaton moves to. The index of a state is the length of the longest prefix of the pattern that matches a suffix of the string read so far, so state K means a complete match.
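A minimal sketch of how such a transition table could be built for one pattern is given here. It uses a naive construction that tries every candidate length, which is cheap enough for K<=10. This is an assumption for illustration, not necessarily how the original solution builds the table (a KMP failure function would also work). The name `buildAutomaton` is hypothetical, and the pattern is passed as a string of '0'/'1' characters rather than as a bitmask.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Returns state[a][b]: the next state when the automaton is in state a
// (the longest matched prefix of the pattern has length a) and reads bit b.
std::vector<std::array<int, 2>> buildAutomaton(const std::string& pattern) {
    int k = static_cast<int>(pattern.size());
    assert(k >= 1);
    std::vector<std::array<int, 2>> state(k + 1);
    for (int a = 0; a <= k; ++a) {
        for (int b = 0; b < 2; ++b) {
            // Text seen so far (as far as the automaton remembers) plus the new bit.
            std::string text = pattern.substr(0, a) + static_cast<char>('0' + b);
            // Find the longest suffix of text that is also a prefix of pattern.
            int len = std::min(k, static_cast<int>(text.size()));
            while (len > 0 &&
                   text.compare(text.size() - len, len, pattern, 0, len) != 0) {
                --len;
            }
            state[a][b] = len;
        }
    }
    return state;
}
```

For the pattern "01" this gives state[0] = {1, 0}, state[1] = {1, 2} and state[2] = {1, 0}; reaching state 2 signals a complete match.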
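Now whenever a complete match occurs (the automaton reaches state K), we increment matchCount. The base case is dp[0][0][0] = 1, since occurrences must lie entirely inside S[K+1,N] and the automaton therefore starts in state 0. The answer for a mask is the sum of dp[N-K][x][y] over all x>M and all y. The transitions are as follows:-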
dp[a+1][b+(states[str][c][0]==K?1:0)][states[str][c][0]] += dp[a][b][c];
dp[a+1][b+(states[str][c][1]==K?1:0)][states[str][c][1]] += dp[a][b][c];
where states[str][x][y] denotes the value of state[x][y] for mask str, and a, b, c stand for len, matchCount and prefixMatch respectively.
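Putting the pieces together, the whole approach might look like the sketch below. Several details are assumptions and not taken from the original solution: the names `buildAutomaton` and `countStrings`, the naive automaton construction, capping matchCount at M+1 (every count above M is treated the same, which keeps the table at M+2 rows), and keeping only two layers of the len dimension. The source does not state a modulus, so the sketch counts exactly in 64-bit unsigned integers, which is only safe for N below 64; the real problem may require reducing the sums modulo some value.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>
#include <vector>

using Table = std::vector<std::array<int, 2>>;

// Naive automaton: state[a][b] is the next state from state a on bit b.
Table buildAutomaton(const std::string& pattern) {
    int k = static_cast<int>(pattern.size());
    Table state(k + 1);
    for (int a = 0; a <= k; ++a) {
        for (int b = 0; b < 2; ++b) {
            std::string text = pattern.substr(0, a) + static_cast<char>('0' + b);
            int len = std::min(k, static_cast<int>(text.size()));
            while (len > 0 &&
                   text.compare(text.size() - len, len, pattern, 0, len) != 0) {
                --len;
            }
            state[a][b] = len;
        }
    }
    return state;
}

// Counts binary strings of length N in which S[1,K] occurs more than M times
// inside S[K+1,N]. No modulus is applied (assumption: N is small).
unsigned long long countStrings(int N, int K, int M) {
    assert(N >= 1 && K >= 1 && M >= 1);
    if (N <= K) return 0;  // no room for any occurrence after the prefix
    int L = N - K;         // length of the part handled by the DP
    using Row = std::vector<unsigned long long>;
    unsigned long long total = 0;
    for (int mask = 0; mask < (1 << K); ++mask) {
        std::string pattern(K, '0');
        for (int j = 0; j < K; ++j) {
            if ((mask >> j) & 1) pattern[j] = '1';
        }
        Table state = buildAutomaton(pattern);
        // dp[b][c]: strings of the current length with matchCount b
        // (capped at M+1) and automaton state c.
        std::vector<Row> dp(M + 2, Row(K + 1, 0));
        dp[0][0] = 1;
        for (int a = 0; a < L; ++a) {
            std::vector<Row> next(M + 2, Row(K + 1, 0));
            for (int b = 0; b <= M + 1; ++b) {
                for (int c = 0; c <= K; ++c) {
                    if (dp[b][c] == 0) continue;
                    for (int bit = 0; bit < 2; ++bit) {
                        int nc = state[c][bit];
                        int nb = std::min(M + 1, b + (nc == K ? 1 : 0));
                        next[nb][nc] += dp[b][c];
                    }
                }
            }
            dp.swap(next);
        }
        for (int c = 0; c <= K; ++c) total += dp[M + 1][c];
    }
    return total;
}
```

As a small sanity check, `countStrings(3, 1, 1)` counts only "000" and "111", and `countStrings(5, 2, 1)` counts only "00000" and "11111".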