SUBMEXCNT - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: raysh07
Tester: sushil2006
Editorialist: iceknight1093

DIFFICULTY:

Easy

PREREQUISITES:

Combinatorics, prefix sums

PROBLEM:

For an array A and an integer M, define f(A, M) to be the number of distinct MEX-es across all size M subsets of A.

You’re given an integer N.
For each M = 1, 2, 3, \ldots, N, compute the sum of f(A, M) across all non-decreasing arrays of length N containing elements in [0, N-1].

EXPLANATION:

Let’s fix the parameter M and try to compute the sum of f(A, M) across all valid arrays A.

f(A, M) is the number of distinct MEX-es attained; so to sum this up across all A, we can instead add up the contributions of all the possible MEX-es.
That is, we do the following:

  • Fix the MEX to be K.
  • Count the number of valid arrays for which there exists a size M subset with MEX equal to K.
  • The MEX of an M-element set cannot exceed M, so we sum this up across all 0 \le K \le M.

We now have a sub-problem: with M and K fixed, how many non-decreasing arrays A with elements in [0, N-1] contain some subset of size M with MEX equal to K?

Here, note that dealing with non-decreasing arrays is equivalent to dealing with frequencies of elements.
Let f_x denote the frequency of x.
Thinking in terms of frequencies is helpful here because it gives us quite straightforward conditions to work with:

  • For the MEX to equal K, certainly we must have (at least) one copy each of 0, 1, 2, \ldots, K-1.
    So, we require f_0, f_1, \ldots, f_{K-1} to all be positive.
  • This gives us K elements that certainly must be chosen as part of the subset already.
    (M-K) more elements must be chosen to complete the subset with MEX K; and importantly, none of the chosen elements must equal K.
    This is possible only when f_K, since if there are any more copies of K in the array then by the pigeonhole principle we’ll be forced to take at least one copy of K into the subset.

The two conditions above, dealing with f_0, \ldots, f_K, are together necessary and sufficient for an array to have a size-M subset with MEX equal to K.

Let’s now try to count such arrays - which as noted above is equivalent to counting frequency arrays [f_0, \ldots, f_{N-1}].
Note that we have the following conditions on f:

  1. f_0 + f_1 + \ldots + f_{N-1} = N.
  2. f_i \ge 0 for all i.
  3. f_i \ge 1 for all 0 \le i \lt K.
  4. f_K \le N-M.

The last condition is the hardest one to deal with here - if we just had a target sum and lower bounds on each element, the number of solutions would be easily found with the help of stars and bars.
Luckily, we can still work around the upper bound, by instead converting it to a lower bound.

To do this, let’s first ignore the upper bound entirely, and just count the number of sequences that satisfy the first three conditions.
This is simply the number of non-negative integer solutions to

f_0 + f_1 + \ldots + f_{N-1} = N - K

which stars and bars tells us is equal to \binom{N-K + N - 1}{N - 1} = \binom{2N-K-1}{N-1}.

Next, we subtract out the ‘bad’ configurations, which are all of those that have f_K \ge N-M+1.
This is now a lower bound, so we’re again able to apply stars-and-bars, with the resulting count being \binom{N+M-K-2}{N-1}.

So, with K and M fixed, the value we’re looking for is simply

\binom{2N-K-1}{N-1} - \binom{N+M-K-2}{N-1}

For a fixed value of M, we want to quickly compute the sum

\sum_{K=0}^M \left(\binom{2N-K-1}{N-1} - \binom{N+M-K-2}{N-1}\right)

To do this quickly, note that every term is of the form \binom{x}{N-1}, and we’re interested in the sum of these for some contiguous range of x.
This allows for the use of prefix sums.
Specifically, define P_i = \sum_{0 \le x \le i} \binom{x}{N-1}, and then the answer for M simply becomes

P_{2N - 1} - P_{2N - M - 2} - P_{N+M-2} + P_{N-3}

Binomial coefficients can be computed in constant time after precomputation of factorials and inverse factorials, so the answer for each M can be found in \mathcal{O}(1) time, leading to \mathcal{O}(N) overall.

TIME COMPLEXITY:

\mathcal{O}(N) per testcase.

CODE:

Editorialist's code (PyPy3)
mod = 998244353
fac = [1] + list(range(1, 500005))
for i in range(1, 500005): fac[i] = fac[i-1] * i % mod
inv = fac[:]
inv[-1] = pow(inv[-1], mod-2, mod)
for i in reversed(range(500004)): inv[i] = inv[i+1] * (i+1) % mod
def C(n, r):
    if n < r or r < 0: return 0
    return (fac[n] * inv[r] % mod * inv[n-r]) % mod

for _ in range(int(input())):
    n = int(input())
    pref = [0]*(2*n)
    for i in range(2*n):
        pref[i] = C(i, n-1)
        if i > 0: pref[i] = (pref[i] + pref[i-1]) % mod
    
    ans = [0]*n
    for m in range(1, n+1):
        ans[m-1] = pref[2*n-1] + (pref[n-3] if n-3 >= 0 else 0)
        ans[m-1] -= (pref[2*n-m-2] if 2*n-m-2 >= 0 else 0) + pref[n+m-2]
        ans[m-1] %= mod
    print(*ans)
2 Likes