ATLONE - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: raysh07
Tester: sushil2006
Editorialist: iceknight1093

DIFFICULTY:

Medium

PREREQUISITES:

Dynamic programming

PROBLEM:

Define f(A) to be the smallest integer X such that for any i there exists j \ne i such that |A_i - A_j| \le X.
f(A) = 0 if |A| = 1.

Given the set \{1, 2, \ldots, N\}, compute the sum of f(S) across all non-empty subsets S of the set.

EXPLANATION:

Let’s understand what exactly f(A) represents.

First, for convenience let’s sort A, which won’t change the value of f(A).

Now, suppose we want to check if some X is “valid”.
Then, observe that for any index i, either A_i - A_{i-1} \le X or A_{i+1} - A_i \le X must hold.
If both checks fail, then the difference between A_i and any other element is also \gt X, so X is certainly invalid.

The above condition can be combined into \min(A_i - A_{i-1}, A_{i+1} - A_i) \le X.
We want this to hold for every i, and hence it must hold for the largest among the values too.
That is, for X to be valid, it must satisfy

\max_{i} \left(\min(A_i - A_{i-1}, A_{i+1} - A_i) \right) \le X

Here, we treat A_0 = -\infty and A_{k+1} = \infty where k = |A|.

The above perspective also in fact tells us exactly what the value of f(A) is: the smallest X that satisfies that inequality is exactly

\max_{i} \left(\min(A_i - A_{i-1}, A_{i+1} - A_i) \right)

itself.


Now we move to summing up the score across all subsets.

Let’s fix the score to be X, and attempt to count subsets with a score equal to X.
For this to happen, we need to choose the subset in such a way that:

  • For each chosen element, at least one of its neighbors (among the chosen elements) is at most X distance away.
  • There exists at least one element such that one of its neighbors is X away and the other is \ge X away.

Note that if we enforce only the first condition but not the second, we’ll instead be counting subsets whose score is at most X.
First, we do just that.

Our aim is now to count subsets of \{1, 2, \ldots, N\} such that each chosen element has at least one neighbor being \le X distance away.

We will count this using dynamic programming.
Define dp(i, t) to be the number of ways to choose a valid subset from the first i elements, such that element i is chosen in the subset, and

  • t = 0 denotes that the distance between i and the previously chosen element is \gt X.
  • t = 1 denotes that the distance between i and the previously chosen element is \le X.

For transitions, let’s also fix j \lt i to be the previously chosen element in the subset.
Then,

  • If |i-j| \le X, the distance condition is satisfied for both i and j.
    So, we’re free to take any valid subset ending at j, no matter whether j has been satisfied or not.
    Thus, we add dp(j, 0) + dp(j, 1) to dp(i, 1).
  • If |i - j| \gt X, the distance condition for j cannot be satisfied by i.
    So, we must choose a subset that already has the condition satisfied.
    By definition, this is dp(j, 1).

Further, we also add 1 to dp(i, 0) to account for taking the singleton element \{i\} with nothing before it.

This DP runs in \mathcal{O}(N^2) time since we do \mathcal{O}(N) work for each i.
However, it’s quite simple to speed up the transitions.
Observe that:

  • dp(i, 0) will equal the sum of all dp(j, 1) for j \lt i-X.
  • dp(i, 1) will equal one plus the sum of (dp(j, 0) + dp(j, 1)) for all j \ge i-X.

Both of these correspond to certain range sums of the dp(\cdot, 0) and dp(\cdot, 1) arrays, which prefix sums will allow us to find in constant time if we build them as we go.

This allows for the DP to run in linear time, given that X is fixed.


Once we’ve run the DP, note that the number of valid subsets is exactly the sum of dp(i, 1) across all i.
This is because we fix i to be the largest chosen element; and since there’s nothing after i it must be satisfied by the stuff before it.

Now, let c_k denote the number of subsets whose answer is at most k.
As seen above, we’re able to compute any given c_k in linear time.

The number of subsets whose answer is exactly k then equals (c_k - c_{k-1}).
So, for each k, add k\cdot (c_k - c_{k-1}) to the answer.

Do this for each k = 1, 2, 3, \ldots, N-1 and add up everything to obtain the overall answer.

TIME COMPLEXITY:

\mathcal{O}(N^2) per testcase.

CODE:

Editorialist's code (PyPy3)
mod = 998244353
for _ in range(int(input())):
    n = int(input())
    
    ans, prv = 0, 0
    dp = [ (0, 0) for i in range(n+1)]
    for d in range(1, n):
        
        sm = 0
        s1, s2 = 0, 0
        for i in range(1, n+1):
            dp[i] = (s1+1, s2)

            if i-d >= 0:
                s1 = (s1 + dp[i-d][1]) % mod
                s2 = (s2 - dp[i-d][0] - dp[i-d][1]) % mod
            s2 = (s2 + dp[i][0] + dp[i][1]) % mod
            
            sm += dp[i][1]
        
        ans += (sm - prv) * d
        ans %= mod
        prv = sm
    print(ans % mod)