You are not logged in. Please login at www.codechef.com to post your questions!

×

MATPER - EDITORIAL

PROBLEM LINK:

Practice

Contest: Division 1

Contest: Division 2

Setter: Mohamed Anany

Tester: Encho Mishinev

Editorialist: Taranpreet Singh

DIFFICULTY:

Medium-Hard

PREREQUISITES:

KMP String Matching, Meet in the middle and Partial Sums.

PROBLEM:

Given String $S$ of length $N$ and $M$ patterns in the String array $P$, Find number of permutations of Strings in array $P$ which match with String $S$. A permutation $p$ of strings match with string $S$ if we can choose $M$ ranges $[l_i,r_i]$ for each $i (1 \leq i \leq M)$ such that $1 \leq l_i \leq r_i < l_2 \leq r_2 \ldots < l_M \leq r_M \leq N$ such that for each valid $i$, String $S_l,S_{l+1},\ldots S_r$ equals $P_{p[i]}$.

QUICK EXPLANATION

  • Using KMP String Matching, find all positions of all patterns in String $S$. Now reverse both String $S$ and all patterns and once again, using KMP String matching, find positions of all reversed patterns on the reversed string.

  • Now iterate over all bitmasks of $M$ bits which have $M/2$ bits on, and for all permutation of these set bits, find the minimum position $p$ such that all these $M/2$ patterns are completely covered in prefix up to position $p$. Using partial sums, we can now count the number of permutations of set $M/2$ bits which matches with string $S$ before position $p$.

  • For the $M-M/2$ off bits, we try all their permutations and for each permutation, find the position $p$ from the end to start such that all the patterns given by $M-M/2$ off bits are found in suffix from $p$ to end of the string in the given order, without overlapping. So, We know that for this permutation of $M-M/2$ patterns, all the permutations of first $M/2$ patterns ending before position $p$ are valid. So, We can just add val[p] to answer where val[p] denotes the number of permutations of first $M/2$ patterns ending before or at position $p$.

EXPLANATION

First things first, we are going to need the positions where each pattern is present in the string. So, we can just find all match positions using KMP algorithm (or any other string matching algorithm) and store the positions of matches for each pattern. For our purpose, we can preprocess and build an NXT[i][j] table telling the position of the first match of ith pattern at or after jth position, since it makes sense to use the first match due to optimality.

The Naive solution for this problem shall be to iterate over all permutations of patterns and check for every permutation whether that permutation is matchable or not. This solution has complexity $O(M!*N)$ and will definitely time out for $M = 14$.

Let us use Meet in the middle trick here.

Let us split each permutation into two equal (or nearly equal) parts. We can see that first $M/2$ patterns may be any subset of all patterns, so we can try all subsets of $M$ patterns which is of size $M/2$. Now, These $M/2$ patterns may appear in any order and each shall be corresponding to a different permutation. So, Let us iterate over all permutation of all elements of current subsets. We can do the same for remaining $M-M/2$ patterns.

Now comes the interesting part.

Suppose we have a permutation of $M/2$ patterns (Calling it left permutation) (the ones with on bit in bitmask) and a permutation of $M-M/2$ patterns (calling it right permutation) (the ones with off bit in bitmask). How can we check if any pair of left and right permutation form a matchable permutation?

Let us suppose that $p$ is the first position in the string $S$ such that All the first $M/2$ patterns are present in String $S[0, p]$ in order of left permutation such that no two patterns overlap. Also suppose that $q$ is the last position in the string $S$ such that all $M-M/2$ patterns are present in $S[q, N-1]$ in order of right permutation such that no two patterns overlap.

The combination of these two permutations is valid if and only if $p < q$, since only that way all $M$ patterns would be present in $S$ without overlap for current permutation.

But trying every pair of permutation for every bitmask is essentially the same as iterating over all permutation since we check each permutation individually.

But we can use an observation here. If a right permutation is valid with a left permutation with end position $p$, it shall also be valid with any position $j \leq p$.

Hence, for every bitmask with $M/2$ bits set, we can iterate over all permutations of $M/2$ patterns and using prefix sum arrays, count the number of left permutations which end before or at a given position $p$ for all positions $p$. Now iterating over all right permutations, we can easily count the number of left permutations which can be paired with current right permutation. We can increase our answer by the number of such left permutations for each right permutations and print the answer.

For ease of implementation, we can Build NXT table in the same manner, and then reverse both $S$ and all patterns and find the position of matches on these reversed strings. That way, we can easily find the rightmost position such that all patterns of right permutation are present at or after that position without overlap, working in the same manner as we work with NXT table.

Time Complexity

Time complexity is $O(C^M_{M/2}*(N+(M/2)!*M) + M*N+sum(|P_i|))$.

AUTHOR'S AND TESTER'S SOLUTIONS:

Setter's solution

View Content

Tester's solution

View Content

Editorialist's solution

View Content

Feel free to Share your approach, If it differs. Suggestions are always welcomed. :)

This question is marked "community wiki".

asked 13 Mar, 18:32

taran_1407's gravatar image

6★taran_1407
4.0k31104
accept rate: 22%

edited 12 hours ago


I was able to solve it using kind of brute force recursion but with memoization. Not sure if the solution just didn't encounter strong enough test case or it is really valid. Anyway, it ran test cases in 0.07s time.

  1. First using KMP find all instances of each pattern in the string. Store those matches.
  2. The main function $F(mask, L, total)$ finds the number of permutations of the subset of patterns in the main string starting at position $L$. Here: $mask$ - specifies which patterns to include in the count, $L$ - the starting position in the main string from which to start the search of permutations, $total$ - the total length of the patterns specified by $mask$.

  3. The function iterates over set bits of the mask. For each set bit $i$, it finds the first occurrence (starting L) of the corresponding pattern (let's call it $P_i$). Then the function calls itself to count the number permutations of the remaining patterns starting the position right after the identified occurrence of $P_i$. When the mask is zero, the recursion stops with the permutation count of 1.

  4. Just by itself this solution would be $O(M!)$, which is too slow. However the following memoization helps to make it AC:

  5. Each call to the function produces a triplet $(L, L_{act}, Count)$, where $L$ is the starting position for permutations search, $L_{act}\geq L$ - actual starting positions that produced the permutations count $Count$.
  6. For a given $mask$ any starting position in the interval $[L, L_{act}]$ will produce the same count $Count$ of permutations of patterns specified by the mask.
  7. For each mask, we can store these ranges in the map $dp[mask]$ that consists of entries like: $\{L: (L_{act}, Count)\}$
  8. With each new request for the same mask, we can check if $L$ is included in one of the ranges. If not, then permutations are counted and the map is updated with one of the three outcomes: (a) The new entry $\{L: (L_{act}, Count)\}$ is added to $dp[mask]$, (b) The existing entry is updated with the new value of $L$ or (c) The existing entry is updated with the new value of $L_{act}$.

  9. The final solution is $F(mask = (1 << M)-1, L = 0, total = \sum_i |P_i|)$

Her is the solution: https://www.codechef.com/viewsolution/23563394

link

answered 13 Mar, 20:50

shoom's gravatar image

6★shoom
3534
accept rate: 30%

edited 13 Mar, 21:15

My solution was super simple. It just computes dp(L, mask) = how many permutations of the patterns contained in mask can be formed up to position L (inclusive). What we care about is dp(N, 2^M - 1). By itself, there would be way too many states. But we don't care about most of them. We can implement the dp function recursively with memoization. We call dp(N, 2^M - 1). Within a recursive call dp(L, mask) we iterate over all patterns j in the mask and we assume j is the last pattern in the permutation. For a fixed j we need to add dp(L', mask xor 2^j) to the result, where L' is the position just before the rightmost match of pattern j up to position L (L' can be obtained in O(1) time by using something similar to the NXT table from the editorial).

I don't have a formal proof why there won't be too many visited states, but I tried many types of test cases locally and this solution was always quite fast. OK, it took 1.7 seconds on the Codechef servers, but that's well within the time limit. Moreover, I'm pretty sure an important chunk of this duration is caused by the unordered maps I used (I didn't use the well known tricks to set some parameters which usually make unordered maps perform significantly faster than using them with their default params).

link

answered 16 Mar, 02:58

mugurelionut's gravatar image

7★mugurelionut
10.0k266990
accept rate: 19%

edited 16 Mar, 03:01

Yeah, I also did the same thing. But I also made unordered map a bit faster by reserving memory. It ran in 1.18s. Here's the submission : https://www.codechef.com/viewsolution/23495176.

(2 days ago) aviroop1236★

Yes, there were quite a solutions which passed by that.

(12 hours ago) taran_14076★

I tried in the same way as @shoom did, but could only pass the solution for 40 points. Can someone please tell, where was I lagging? Solution : https://www.codechef.com/viewsolution/23525369

link

answered 14 Mar, 18:25

himkha_100's gravatar image

5★himkha_100
211
accept rate: 0%

You are memoizing positions. It is very unlikely to hit exactly same position. My solution is to memoize ranges. Any starting position that falls into a memoized range would have the same result. In addition, if the requested starting position is outside the range, but the resulting count is the same, the range can be extended.

(14 Mar, 19:40) shoom6★

Setter's solution is giving compilation error in C++14(GCC 6.3).

link

answered 2 days ago

jha_gaurav98's gravatar image

5★jha_gaurav98
1
accept rate: 0%

Extra characters are appearing due to latex (First time posting code snippets in editorial). I'm trying to correct it now.

(12 hours ago) taran_14076★

I hope it seems okay now. :)

(12 hours ago) taran_14076★
toggle preview
Preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported
  • mathemetical formulas in Latex between $ symbol

Question tags:

×1,299
×724
×83
×76
×76
×75
×6

question asked: 13 Mar, 18:32

question was seen: 776 times

last updated: 12 hours ago