You are not logged in. Please login at www.codechef.com to post your questions!

×

PATTMATC - Editorial

PROBLEM LINK:

Practice
Contest

Author: Sergey Kulik
Tester: Yanpei Liu
Editorialist: Pawel Kacprzak

DIFFICULTY:

Medium-Hard

PREREQUISITES:

Strings, Pattern matching, KMP

PROBLEM:


The problem is stated really simple. You are given a string $S$ of $N$ lowercase Latin letters. In addition, you are given a wildcard string $T$, of length $M$, consisting of lowercase Latin characters and asterisk symbols '*'.

We said that $T$ generates a string $P$, if and only if, $P$ can be obtained from $T$ by replacing all the asterisks with any, even empty, lowercase Latin letter strings.

The task is, for each suffix $S[i..N]$ of $S$, compute the position of the first occurrence in this suffix, of a string $P$, which can be generated from $T$.

QUICK EXPLANATION:


Split $T$ on asterisks to get a list $L$ of patterns. Find occurences of all patterns from $L$ in $S$. For each suffix $S[i..N]$ of $S$, find in $S[i..N]$, the leftmost chain of non overlapping occurrences of all consecutive patterns from $L$.

EXPLANATION:


The first observations, which we can make, is that, we can replace any run of more than one asterisk with a single one.

After that, we can split $T$ on asterisks and get a list $L$ of $K$ patterns, where $K$ is not greater that the number of asterisks + 1.

The general idea is to use a pattern matching algorithm to find all occurrences of each pattern from $L$ in $S$. Since we know that the number of asterisks is at most $500$, we can compute the array $A[i][j]$, where $A[i][j] = b$, if and only if, the first occurrence of the $j^{th}$ pattern from $L$ in $S[i..N]$ is at index $b$ in $S$.

Based on the above idea, we can iterate over all suffixes of $S$, and for each of them, using the $A$ array, we can easily compute the leftmost, non-overlapping occurrences of all consecutive patterns from $L$, which is equivalent to the leftmost occurrence of a string generated by wildcard string $T$.

Time Complexity


Notice that, the time complexity of this approach depends on the method used to find occurrences of patterns from $L$ in $S$.

Naive pattern matching can be sufficient to pass some subtasks, but if you want a full credit, and we hope you do, you have to use any linear pattern matching algorithm, like KMP for example. If you do that, the total time complexity of a single test case is $O(K \cdot N)$, because we use linear pattern matching to find occurrences of $K$ patterns in a string of length $N$, and next, for each of $N$ suffixes, we make at most $K$ jumps over $A$ array.

AUTHOR'S AND TESTER'S SOLUTIONS:

Author's solution can be found here.
Tester's solution can be found here.

RELATED PROBLEMS:

This question is marked "community wiki".

asked 27 Sep '15, 08:36

pkacprzak's gravatar image

5★pkacprzak ♦♦
74485097
accept rate: 12%

edited 27 Sep '15, 14:11

admin's gravatar image

0★admin ♦♦
19.8k350498541


To previus: It is because the string "bacb" CONTAINS the string generated from "a*b" (which is "acb"). So 5 should be a correct answer at i=2. I had the same mistake as you with this problem during the contest and it took me about an hour before I can realize that :(

link

answered 27 Sep '15, 22:18

yenthanh132's gravatar image

6★yenthanh132
111
accept rate: 0%

edited 27 Sep '15, 22:20

Hello, I want to ask whether it is possible to use regex to solve this problem. Another question is, how fast is regex in matching strings compared to algorithms like KMP? Thanks

link

answered 27 Sep '15, 14:14

ninjascript's gravatar image

3★ninjascript
1
accept rate: 0%

Obviously depends on the regexp implementation you use. Since they are much more general they will propably not as fast as the special algo employed here. But it could be fast enough to pass first subtasks -- try it.

(28 Sep '15, 00:16) ceilks7★

In this case:

1 ab abacb I got this: 2 5 5 -1 -1 from tester and Author solution, but I think for i = 2, you can't get s[2, 5]=bacb from a * b, because there are not '' at the begining of T

The answer should be 2 -1 5 -1 -1 , anyone?

link

answered 27 Sep '15, 14:30

previus's gravatar image

5★previus
11
accept rate: 0%

edited 27 Sep '15, 14:30

Please rejudge the solutions

(27 Sep '15, 14:40) previus5★
toggle preview
Preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported
  • mathemetical formulas in Latex between $ symbol

Question tags:

×15,680
×1,250
×643
×70
×55
×35

question asked: 27 Sep '15, 08:36

question was seen: 2,564 times

last updated: 28 Sep '15, 00:16