PROBLEM LINK:
Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4
Author: biggestotaku
Tester: pols_agyi_pols
Editorialist: iceknight1093
DIFFICULTY:
Medium
PREREQUISITES:
Dynamic programming
PROBLEM:
For a zero-indexed sequence S of length M, define f(S) = \prod_{i=0}^{M-1} (S_i + i)
You are given an array A of length N and a parameter K.
You have an empty array B.
For i = 1, 2, 3, \ldots, N, do the following:
- Insert A_i at the start of B.
- Then, compute the sum of f(S) across all subsequences of B with length at most K.
EXPLANATION:
Suppose we’ve already computed the answer for the array B = [B_1, B_2, \ldots, B_M], and now we want to insert the element X at the start of B to obtain [X, B_1, \ldots, B_M].
Let’s analyze how the answer changes, which will tell us what information we need to maintain.
First, note that for any subsequence that doesn’t contain the newly inserted value X, its score doesn’t change at all.
So, we only need to look at subsequences that do contain the first element.
X is inserted at the start, and so naturally in any subsequence containing it, it will be the first element.
This means it will always have a contribution of (X+0) = X to the product of values under consideration, since we use 0-indexing.
As for the remainder of the subsequence, their values will be increased by 1, 2, 3, \ldots in order of choosing from left to right - in particular, note that the next element will surely have its value increased by 1.
So, if we let S denote the sum of products of all subsequences of length at most K-1, such that we’ve added the values 1 ,2, 3, \ldots to the elements from left to right before multiplying, note that the value we’re looking for is exactly X\cdot S.
This is because each such subsequence can be extended to a valid one by inserting X at its start.
This brings us to the next question: how do we find the value S (or rather, keep it updated)?
Similar logic tells us that to do this, we’ll need to know the value S_2, which denotes the sum of products of subsequences of length at most K-2, such that we add 2, 3, 4, \ldots to the values of the subsequence.
This, of course, holds for higher values as well.
Let’s formalize the observations we made.
Define dp(i, j) to be the sum of products of values of all subsequences such that:
- We’ve placed i elements into B;
- The elements of the subsequence have been increased by j, j+1, j+2, \ldots from left to right; and
- The subsequences considered have length at most K-j (otherwise they can’t be extended to a subsequence of length \le K that has 0 added to the first element.)
Note that to ensure this, we only need to ensure that the last element of the subsequence has a value \lt K added to it; which is naturally satisfied by only considering states with j \lt K.
Note that with this definition, the value we’re looking for is exactly dp(i, 0) for every i.
To update these values with the newly inserted value X = A_i, we simply have
The dp(i-1, j) part is from older subsequences, then we have (A_i + j) \cdot dp(i-1, j+1) to account for inserting A_i+j at the start of an existing subsequence that begins with adding j+1; as well as an additional single copy of (A_i + j) itself corresponding to choosing the subsequence of length 1, i.e. the first element alone.
Since we only need to care about states with j \lt K as noted above, this DP has N\cdot K states with \mathcal{O}(1) transitions from each, for \mathcal{O}(N\cdot K) time overall.
This is fast enough for the given constraints of N \le 10^5 and K \le 500.
TIME COMPLEXITY:
\mathcal{O}(N\cdot K) per testcase.
CODE:
Editorialist's code (C++)
// #include <bits/allocator.h>
// #pragma GCC optimize("O3,unroll-loops")
// #pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
#include "bits/stdc++.h"
using namespace std;
using ll = long long int;
mt19937_64 RNG(chrono::high_resolution_clock::now().time_since_epoch().count());
int main()
{
ios::sync_with_stdio(false); cin.tie(0);
const int mod = 998244353;
int t; cin >> t;
while (t--) {
int n, k; cin >> n >> k;
vector a(n, 0);
for (int &x : a) cin >> x;
array<int, 502> dp{};
for (int x : a) {
for (int i = 0; i < k; ++i) {
int y = x + i;
dp[i] = (dp[i] + 1ll*y*(1 + dp[i+1])) % mod;
}
cout << (dp[0]+1) % mod << '\n';
}
}
}