Problem LinkAuthor: Ivan Fekete Tester: Misha Chorniy Editorialist: Bhuvnesh Jain DifficultyMEDIUM PrerequisitesDynamic Programming, Tries, Hashing ProblemYou are given a empty string $S$ and a taget string $T$. Youa re also provided with $N$ patterns $p[i]$. You may perform any number of operations as follows:
Find the minimum cost to make string $T$ from $S$ using the above operations. Note that L denotes the current length of the string $L$, before the operation is performed. ExplanationThe problem statement clearly lists some transitions and cost related to it and we would like to minimise the overall cost. So, this hints towards a dynamic programming approach. Let us define $dp[i][j]$ as the minimum cost to make the string $T[i..j]$ using the above operations. The transitions are exactly the same as given in the statement. Let $L = (j  i + 1)$.
$dp[i][j]$ is clearly the minimum over all possible transitions. The time complexity of the above logic is $O(S * S * N * P[i])$. This is enough to pass the first subtask. But there is a small caveat while implementing the above solution. The most general way to implement dp solution is as follows:
But the above implementation is wrong as the transitions for $dp[i][j]$ require us to have the correct values of all $dp[x][y]$ calculated where $i <= x <= y <= j$. So, we want to calculate the dp of smaller substrings first and then extend it to bigger substrings. A working pseudocode for the first subtask is as follows:
Full Solution:Note that in the above solution, the only time consuming step is the transitions from all possible strings $p[y]$. Since we know that the maximum length of any string $p[y]$ is atmost 100, we see that $dp[i][j]$ would take atmost 102 transitions (2 for characters). If we can do each transition in $O(1)$ then we can completely solve the problem. The first thing to observe is that we have atmost 100 different strings which could be appended to the left or right while calculating the cost of a substring. Instead of iterating over all possible patterns $p[y]$, we need to efficiently find the cost of making that substring (if possible) from the set of patterns. This requires efficient string matching from a set of patterns. Below are 2 different approaches to it: Author's solutionMake $2$ tries of all the patterns, one for the left side transitions and another for the right side transitions. For each node in the trie also store the minimum cost of making that string. For all nodes, it is initially $INF$, some large value. Only for nodes where a pattern $p[y]$ ends, the cost is set to the minimum of $kl[y]$ or $kr[y]$ over all possible patterns ending there (Note that there can be multiple same patterns but with different cost). To efficiently search for the cost of the substring, we simply traverse the trie and after each transition update the $dp$ values. The overall time complexity of the above solution is now $O(S * S * {max}*{i=1}^{i=N} P[i] + \sum*{i=1}^{i=N} P[i])$, where the first part is for dynamic programming calculating and the second part is for building the trie. The space complexity of the solution is $O(S * S + N * P * 26)$, where the first part is for the dynamic programming table and second is for the trie of patterns. Once, you are clear with the above idea, you can see the author implementation below for help. Editorialist's solutionWe can use hashing for string comparison as well. So, we hash all the patterns and store the cost of them in a map. We also store the hash of text $T$ as well. Now, we calculate the cost of every substring as follows:
The time complexity for the above approach is $O(\sum_{i=1}^{i=N} P[i] + S + S * S * \log{N} + S * S * P)$, where the first $2$ parts are for building the hashes, the third for the above pseudocode ($\log{N}$ for searching the map) and last part for dynammic programming approach. The space complexity of the approach is $O(S * S)$ which is required for storing the dynammic programming table and also the cost of every substring. NOTE: For the hash based solution the string lengths are small and also the number of strings to consider i.e. $N$ are large so there can be chances of collision. So, it is preferred to use double hashing in this problem. For details refer to this blog on codeforces. Once, you are clear with the above idea, you can see the editorialist implementation below for help. Feel free to share your approach, if it was somewhat different. Time Complexity$O(S * S * {max}*{i=1}^{i=N} P[i] + \sum*{i=1}^{i=N} P[i])$ or $O(\sum_{i=1}^{i=N} P[i] + S + S * S * \log{N} + S * S * P)$ Space Complexity$O(S * S + N * P * 26)$ or $O(S * S)$ AUTHOR'S AND TESTER'S SOLUTIONS:Author's solution can be found here. Tester's solution can be found here. Editorialist's solution can be found here.
This question is marked "community wiki".
asked 28 Jul '18, 16:58

In Author's solution maybe correct will be O(S * S * max(P[i])), because from every [l,r] there we will go for whole tries in worse case? answered 29 Jul '18, 04:57
1
@batura_dima you are correct. Maybe I should have mentioned what I meant by P as it clearly is not the size of one particular string.
(29 Jul '18, 12:51)
As P is array, so usually P means size of array. I thought there are speeded up traverse of trie, but as I see it is usual. We can achieve O(S * S * P) for main calculation, if we preprocessed for every position in T set of reachable P[i], so for every i we will only once do traverse of trie (in both directions of course). So it will be O(S * S * P + S * max(P[i])).
(29 Jul '18, 14:34)
@batura_dima, how do you conclude $O(S * S * P)$ for main calculation? Btw I have updated complexity section in the editorial.
(29 Jul '18, 19:41)

I've described untidily. For every char in T we will preprocess set of reachable P[i] (in both directions of course). It will take O(S * max(P[i])). Then for every [l,r] we will already have possible transitions by strings from P. answered 30 Jul '18, 04:15

It seems that in solution with hash time complexity for main calculation is O(S * S * max(P[i]) * lg(P)). Because of same reason. For every [l,r] we go to distance max(P[i]). answered 30 Jul '18, 04:28
@batura_dima updated. Where do you get the additional $O(\log{P}$ factor from?
(30 Jul '18, 10:22)
You wrote yourself: "(logN for searching the map)"
(30 Jul '18, 15:24)
That part comes in precomputation where cost of every substring is stored. If you do it during each DP calculation, it will lead to TLE as the number of operations would be $1000 * 1000 * 100 * \log(10000)$. Also, the factor is $O(\log N)$ where N is the number of string whose hash value is added in the map, not $O(\log(P)$. You can refer to my code if something is not clear.
(30 Jul '18, 16:36)
