FAVNUM - Editorial

aho-corasick
dynamic-programming
editorial
july12
medium

#1

PROBLEM LINKS:

Practice
Contest

DIFFICULTY:

Medium

PREREQUISITES:

Binary Search, DP, Aho-Corasick/KMP

PROBLEM:

You’re given some digit strings of length <= 18. A number is said to be lucky if its decimal representation contains at least 1 of these digit strings as substrings. Your task is to find out Kth number amongst the lucky numbers in the range [L…U].

QUICK EXPLANATION:

Do a binary search over the answer. A predicate is number of beautiful numbers in some interval [A…B]. Further break it into two questions : number of beautiful numbers in [1…B] and number of beautiful numbers in the range [1…A-1]. Now general question to solve is to find out number of beautiful numbers in range [1…U] for arbitrary U. Start padding numbers from left to right and maintain a DP. For details of DP, look at detailed explanation.

DETAILED EXPLANATION:

We’d solve this problem using DP - this is actually a hybrid of two standard
DPs but it requires a fair degree of maturity if you’re solving this type of DP for the first time.
Before going to actual solution, let me first try to motivate such a solution.

  1. We can find out the Kth such number using binary search : we choose a
    candidate number N and ask a question : how many numbers between L and N
    are beautiful numbers.

  2. So we need to find out number of beautiful numbers between a given A and B.

  3. We can find out the number of beautiful numbers in [1…B] and subtract from
    it the number of beautiful nunbers in [1…A-1].

  4. So we need to focus only on the following problem : find out the number of
    beautiful numbers in the range [1…U] for any arbitrary U.

  5. We’d construct numbers by padding digits from left to right. At any point of
    time we need to ensure that we’ve not crossed U. Now consider than that we’ve
    written d digits so far. There are two possibilities - either all d digits are equal to d left most digits of U or first few digits are same as those of U and then next digit is smaller than the next digit of U, else whole number would overshoot U. Note that as soon as one of our digits becomes less than corresponding digit of U, we can pretty much fill remaining digits arbitrarily as resulting number can’t be greater than U. So we need to keep a track - if our generated number is ‘tight’ with U or it is slack of constraint (of being less than U) already.

  6. Also we need to know if we’ve already seen one of chef’s favorite digit
    strings or not.

  7. While we’re filling up digits from left to right, we might need to remember
    some digits as maybe after adding some new digits, we’d create one of chef’s
    favorite digit strings.

  8. But we don’t need to remember all the digits written so far. All we need to
    remember is the longest prefix of some favorite digit string which is also a
    suffix of numbers generated so far.

  9. When we introduce a new digit, this longest prefix of some favorite digit
    string which is suffix of numbers so far, might change. We need to be able to
    track that.

  10. This state is equivalent of storing : index of favorite digit string and
    length of this prefix. There are <= 62 such strings and each string is of
    length <= 18. So total around 62 * 18 possible states.

  11. For each of these 62 * 18 states and for each next digit (10 possibilites),
    we need to find new state. That is around 62 * 18 * 10 precomputation per test
    case.


Actual DP:

DP[state][L][tight][beautiful] : tells us the number of beautiful numbers that we can create when
the longest prefix of any digit string which is suffix of number generated so far is given by state, we’re to add L more digits. tight is boolean which is true if all of our digits are equal
to corresponding digits of U and false if atleast one digit so far is lower than
corresponding digit of U. beautiful is again boolean which is true if we’ve
already seen a favorite digit string and false otherwise.

We originally need to solve DP[0][L][true][false] where state 0 denotes no match
so far, and L is length of decimal representation of U.

Size of DP is : 62 * 18 * 18 * 2 * 2 which is roughly 8 * 10^4
Per state of DP we’ve to move over 10 digits to be tried now and so time taken
to solve DP is 8 * 10^4 * 10 which is roughly around 8 * 10^5.
This DP we’ve to solve around log(10^18) times for binary searching. So total
time taken is around 3 * 10^7 which should pass in time. Also note that
average evaluation time would be lower than 10 because of the constraint of not
crossing U. Also as soon as beautiful becomes true and tight becomes false, we
can return the answer immediately = 10L (as we can freely put any digit at any
position).

We can further reduce the time per DP to O(L) by making an extra observation : as soon as
we get tight = false in our DP, U has become irrelevant to further calculation. What it means is that we can precompute remaining DP once. And every time we need to solve the problem of finding
number of beautiful numbers from 1 to U for some U, we solve it in time O(L) using this precomputation.

This DP explicitly stores longest prefix of some digit string which is suffix of
numbers written so far. This type of automaton has a special name to it and is
called Aho-Corasick.

Unfortunately this is a rather involved DP and it is tough to explain it more
without going into code. While keeping this editorial in back of mind, please
refer to setter/tester solution for more details.

SETTER’S SOLUTION:

Can be found here.

TESTER’S SOLUTION:

Can be found here.

RELATED PROBLEMS:

Round Numbers from USACO 2006, November, Silver division
Topcoder SRM 519, Div 1 Level 2 : You can consult editorial for this problem as well.
V from COCI 2007, Contest #6
CUDAK from COCI 2007, Contest #3.


#2

Similar problems (idea of Aho Corasick) appeared too many times in ACM contests, Some examples:

ACM NEERC NorthWestern Subregional 2004: Word Encoding

World Finals 2009: Password Suspects

ACM ICPC Jakarta 2010: Serial Numbers

ACM ICPC Dalian 2011: Rescue The Rabbits


#3

Do you know about other problems that can be solved using Aho Corasick? not necessarily for DP like those above


#4

Actually I used the idea only for multiple string matching and those kind of DPs. But there is a thread in TC about Aho-Corasick Problems: http://apps.topcoder.com/forums/?module=Thread&threadID=591940&start=0&mc=6#883430

Matrix Matcher and WPUZZLES are kind of problems that I haven’t tried yet.

Btw, congratulations on your excellent performance in this contest!