 Editorial - PEAKS

#1

Author: Kanstantsin Sokal
Tester: Pavel Sheftelevich
Editorialist: Pawel Kacprzak

Medium

PREREQUISITES:

Dynamic programming, Segment tree, Fenwick tree

PROBLEM:

In this problem, for a given integer sequence $S$ of length $N$, your task is to count the number of non-empty subsequences of $S$ with exactly $A$ local minimums and exactly $B$ local maximums.

QUICK EXPLANATION:

Use dynamic programming in order to compute the number of non-empty sequences S[1..j] with exactly $a$ local minimums and exactly $b$ local minimums for all $1 \leq j \leq N$, $0 \leq a \leq A$, $0 \leq b \leq B$. Next, speed up the computations using [Segment tree] or [Fenwick tree]. Finally, notice that $A$ and $B$ differ by at most 1, because all elements in $S$ are distinct. Use this fact to optimize the solution further in order to pass the last subtask.

EXPLANATION:

In the first subtask $N$ is at most $20$, so in order to solve the problem, we can simply generate all possible $2^N - 1$ non-empty subsequences of $S$, and for each one, count the number of local minimums and local maximums in it using a linear scan over its element. This approach leads to $O(2^N \cdot N)$ running time for a single test case, which will pass the first subtask if only it's implemented efficiently.

In the second subtask $N$ can be at most $200$. This is of course way too big for any exponential method used in the first subtask to pass. One thing you should notice is that $A$ and $B$ are not so big here - both are not greater than $10$. We can use this fact to come up with the following dynamic programming approach:

Let dp*[a]** be the number of subsequences of S[1..i] ending in S* with exactly a local minimums, exactly b local maximums and the last element smaller than the one before last element. Similarly, let dp*[a]** be the number of subsequences of S[1..i] ending in S* with exactly a local minimums, exactly b local maximums and the last element greater than the one before last element. With these values defined, we can easily show how to compute any dp entry with first index i having all the dp entries for first indexes j < i computed.

Let’s fix index i. We are going to iterate over all indices 1 \leq j < i. For a fixed j, there are two possibilities: either S[j] < S* or S[j] > S*. Since all elements of S are distinct, equality is not possible here.

If S[j] < S*, we can create a subsequence of length 2 using just S[j] and S*, so we add 1 to dp*. Similarly, if S[j] > S*, we add 1 to dp*.

Next, we are going to extend any previously counted subsequences of length at least 2 to subsequences ending with S*. In order to do this, we iterate over all possible values 0 \leq a \leq A and over all possible values 0 \leq b \leq B. Again, there are two cases to consider:

If S[j] < S*, we add value of dp[j][a]** to dp*[a]**. Moreover, if a < A, we add dp[j][a]** to dp*[a + 1]** creating new local minimum at index j.

Analogously, if S[j] > S*, we add value of dp[j][a]** to dp*[a]**. Moreover, if b < B, we add dp[j][a]** to dp*[a][b + 1] creating new local maximum at index j.

In order to compute the final result, we sum up all dp*[A]** and dp*[A]** values over all 1 \leq i \leq N. At the end, if A = 0 and B = 0, we add N to the result in order to count all subsequences of length 1.

Last but not least, remember to do all necessary computations modulo 10^9 + 9.

The total time complexity of this method is O(N^2 \cdot A \cdot B), because we fill O(N \cdot A \cdot B) dp entries, and we fill a single one in O(N) time.

In the third subtask, $A$ and $B$ are still not greater than $10$, but $N$ can be now as big as $5000$. It would be nice to get rid of one $N$ factor from the complexity of solution given for the second subtask. It turns out that we can do this using a very natural an common idea of improving dynamic programming solutions using segment trees (or any similar data structure like Fenwick tree).

In the solution given above, we update a single entry in the dp table in O(N) time. Now, we are going to do this in O(\log N) time. Notice, that in the original dp solution for a fixed i, we iterate over all possible 1 \leq j < i in order to compute the dp entries for index i. There were two cases to consider depending on whether S[j] < S* or S[j] > S*.

Here comes a crucial observation: if we accumulate all the values of dp[j][a]** for all j such that S[j] < S* and do the same with all values dp[j][a]** for such j, we can easily compute the value of dp*[a]** and dp*[a + 1]** using these accumulated sums. Similarly, we can speed up the computation of dp*[a]** and dp*[a][b + 1] if we accumulate all dp[j][a]** for all j such that S[j] > S* and if we do the same with all dp[j][a]** for such j.

Notice that we can maintain these sums using a segment tree with the sum over a contiguous query (or even just prefix and suffix sum queries) and update operation adding a value at the given index. One thing left to do is to compress the input values in order to easily store the segment tree in memory - their absolute values are initially less or equal to 10^9. We can compress them easily to a range 1..N which helps a lot here.

The total time complexity of this approach is O(N \cdot \log N \cdot A \cdot B).

In the last subtask, we still have $N \leq 5000$, but $A$ and $B$ can now be as big as $200$, so the solution used for the second subtask is too slow now and we need a further improvement.