MNDF - Editorial

iceknight1093 · April 17, 2024, 4:30pm

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: hellolad
Tester: watoac2001
Editorialist: iceknight1093

DIFFICULTY:

TBD

PREREQUISITES:

Dynamic programming, binary search

PROBLEM:

You’re given an array A of length N.
For an array S of length N containing only the values +1 and -1, define f(S) to be the difference between the minimum and maximum values of \sum_{j=1}^i S_j\cdot A_j.

Find the minimum possible value of f(S), and the number of arrays that attain this minimum.

EXPLANATION:

First, we focus on finding the minimum value of f(S).

Let M denote the maximum element of A.
Observe that f(S) \geq M for sure, since there’ll be a difference of exactly M between wherever M appears and the previous signed prefix sum.
On the other hand, we can also ensure that the difference never gets too far away from M.
Specifically, we have the following strategy:

Let the current sum be \text{sm}.
If \text{sm}+A_i \leq M, add A_i to \text{sm}.
Otherwise, subtract A_i from \text{sm}.
Note that this can only happen when \text{sm} \gt 0, and we subtract at most M from \text{sm}.

This way, \text{sm} never exceeds M, and also never goes below -M.
So, f(S) \leq 2M.
Now that we have upper and lower bounds, observe that if we’re able to check for a fixed k whether f(S) \leq k is possible, the minimum possible value of f(S) can be found using binary search.

Now, we turn our attention to the check of the binary search: given k, how can we decide whether there exists some S for which f(S) \leq k?

For this, we will use dynamic programming.

A reasonable starting point is to define \text{dp[i][x][y][z]} to be true if there’s a way to assign the first i signs such that:

The minimum signed prefix sum seen so far is x.
The maximum signed prefix sum seen so far is y.
The current signed prefix sum is z.

Transitions are simple, depending on whether you assign S_i = +1 or S_i = -1.
While this is correct, it’s also much too slow: there are \mathcal{O}(NM^3) states which is far too large.

We now try to optimize this.

Optimization 1

Notice that we’re not using k in our DP states at all - so, let’s try to use it and reduce things a bit.
We don’t care about any states where |x-y| \gt k - in fact, we don’t even really care about the values of x and y, as long as their difference is \leq k.

So, we can always just take a ‘window’ of length k, and ensure that the signed prefix sums lie in that window at all times: that’s enough for us.

This allows us to throw out one state in our DP - the maximum.
Let \text{dp[i][x][y]} denote whether we can assign the first i signs such that:

The current signed prefix sum is y;
The minimum signed prefix sum so far is x; and
At every point, the signed prefix sum has been in the interval [x, x+k].

Transitions remain simple.

We now have \mathcal{O}(NM^2) states, which is unfortunately still too much.

Optimization 2

Let’s look at the transitions for our current DP.
We have something like

\text{dp[i][x][y]} = \text{dp[i-1][x][y}-A_i\text{]} \vee \text{dp[i-1][x][y}+A_i\text{]}

with \text{dp[i][x][y]} = 0 if y \lt x or y \gt x+k.

In particular, once again we don’t really care about the precise value of y - we only care about where it is in relation to x.

So, let’s replace x and y in the DP state with the difference (y-x).
That is, define \text{dp[i][x]} to be true if there exists a way to assign the first i signs, so that the resulting prefix sum is exactly x more than the minimum signed prefix sum so far (and we’ve never been more than k away from the minimum, of course).

Transitions are, yet again, simple:

\text{dp}[i][x] = \text{dp}[i-1][x-A_i] \vee \text{dp}[i-1][x+A_i]

with \text{dp}[i][x] = 0 if x \lt 0 or x \gt k.

The only care that needs to be taken now is with the base cases.
Those are \text{dp}[0][x] = 1 for 0 \leq x \leq k, and 0 for any other x.
This is because we start with a sum of 0, and the minimum value can be anything from -k to 0 (so we’re able to be x away from the minimum for each 0 \leq x \leq k).

We’re now down to \mathcal{O}(N\cdot M) states with constant transitions from each, which is fast enough.
It’s also fairly easy to implement this using only \mathcal{O}(M) memory, since only the previous row of the DP table needs to be stored at each step.
This can be further sped up by a factor of 64 using bitsets, though it isn’t necessary to get AC.

With the binary search, we now have the minimum value of f(S) in \mathcal{O}(NM\log M) time.
Let this minimum be k.

Next, we turn our attention to counting the number of sequences that attain this minimum.

That can, once again, be done with dynamic programming - in fact, using almost the same definition we had earlier!
Let \text{dp}[i][x] denote the number of ways to assign the first i signs so that we’re currently x away from the minimum.
Then,

\text{dp}[i][x] = \text{dp}[i-1][x-A_i] + \text{dp}[i-1][x+A_i]

with boundary cases (x \lt 0 and x \gt k) taken care of as before.
Once again, our base cases are \text{dp}[0][x] = 1 for 0 \leq x \leq k.

Finally, the number of ways is the sum of \text{dp}[N][x] across all x.

A note on correctness (and avoiding binary search)

The second DP we used has a funny property: it happens to be correct only because k is the minimum possible value of f(S) (you might notice that the intermediate values don’t always make sense, though the values of \text{dp}[N][x] are correct and those are all we care about).

This is because we started k different sequences at the beginning (recall the base cases), even though there’s really only one sequence (the empty one).
However, any valid way of choosing signs has a unique ‘offset’ between 0 and k such that all its intermediate values fit in our window, and so will be counted only once in the end, which is still fine.

In particular, if we’re running the DP for some value of k, and there’s some way of choosing signs so that the difference between minimum and maximum is strictly less than k, we’ll end up counting it multiple times since there are multiple possible initial offsets that will let it fit in a k-sized window.
This is not a problem for when k is the minimum possible value of f(S), since no such ‘bad’ sequence exists by very virtue of k being minimum.

This observation also allows us to skip the binary search and compute the minimum value of f(S) in \mathcal{O}(NM) time, by slightly changing our DP definition.
You’ll want something like \text{dp}[i][x] to be the minimum possible value of k (i.e, minimum window size) such that after assigning the first i signs, we’re x away from the minimum (and k = \infty if it’s impossible to be x away from the minimum).

The minimum possible value of f(S) is then the smallest k such that \text{dp}[N][k] \neq \infty, and the counting part is similar.
This approach can be seen in the author’s code below.

TIME COMPLEXITY:

\mathcal{O}(N\cdot \max(A)) or \mathcal{O}(N\cdot \max(A)\log\max (A)) per testcase.

CODE:

Author's code (C++)

#include <bits/stdc++.h>
using namespace std;
#define IOS ios_base::sync_with_stdio(0); cin.tie(0); cout.tie(0);
#define ll long long
const ll M=998244353;

int main(){
    IOS
    ll t;
    cin>>t;
    while(t--){
        ll n;
        cin>>n;
        ll m=0;
        vector<ll> a(n+1);
        for(ll i=1;i<=n;++i){
            cin>>a[i];
            m=max(m,a[i]);
        }
        vector<vector<ll>> dp(n+1,vector<ll>(3*m+1,1e9));
        // dp[i][j]  --->  minimum value of max(s) if initial score was j
        // considering only first i elements and making sure that min(s) is non-negative
        for(ll i=0;i<=3*m;++i){
            dp[0][i]=i;
        }
        for(ll i=1;i<=n;++i){
            for(ll j=0;j<=2*m;++j){
                dp[i][j]=dp[i-1][j+a[i]];
                if(a[i]<=j){
                    dp[i][j]=min(dp[i][j],max(j,dp[i-1][j-a[i]]));
                }
            }
        }
        ll ans=1e9;
        for(ll i=0;i<=2*m;++i){
            ans=min(ans,dp[n][i]);
        }
        vector<vector<vector<ll>>> dp2(n+2,vector<vector<ll>>(ans+1,vector<ll>(2)));
        for(ll i=0;i<=ans;++i){
            dp2[n+1][i][1]=1;
        }
        for(ll i=n;i>=1;--i){
            for(ll j=0;j<=ans;++j){
                if(j+a[i]<=ans){
                    dp2[i][j][1]+=dp2[i+1][j+a[i]][1];
                    dp2[i][j][0]+=dp2[i+1][j+a[i]][(j+a[i]==ans)];
                }
                if(j-a[i]>=0){
                    dp2[i][j][1]+=dp2[i+1][j-a[i]][1];
                    dp2[i][j][0]+=dp2[i+1][j-a[i]][0];
                }
                dp2[i][j][0]%=M;
                dp2[i][j][1]%=M;
            }
        }
        ll cnt=0;
        for(ll i=0;i<=ans;++i){
            cnt+=dp2[1][i][(i==ans)];
            cnt%=M;
        }
        cout<<ans<<' '<<cnt<<'\n';
    }
    return 0;
}

Editorialist's code (C++)

#pragma GCC optimize("O3,unroll-loops")
#pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
#include "bits/stdc++.h"
using namespace std;
using ll = long long int;
mt19937_64 RNG(chrono::high_resolution_clock::now().time_since_epoch().count());

int main()
{
    ios::sync_with_stdio(false); cin.tie(0);

    int t; cin >> t;
    while (t--) {
        int n; cin >> n;
        vector<int> a(n);
        for (auto &x : a) cin >> x;
        int mx = *max_element(begin(a), end(a));

        int lo = mx, hi = 2*mx - 1;
        while (lo < hi) {
            int mid = (lo + hi)/2;

            vector<int> dp1(mid+1, 1), dp2(mid+1);
            for (int x : a) {
                dp2.assign(mid+1, 0);
                for (int y = x; y <= mid; ++y) dp2[y] = dp1[y-x];
                for (int y = 0; y+x <= mid; ++y) dp2[y] |= dp1[y+x];
                swap(dp1, dp2);
            }
            if (*max_element(begin(dp1), end(dp1))) hi = mid;
            else lo = mid+1;
        }

        const int mod = 998244353;
        vector<int> dp1(lo+1, 1), dp2(lo+1);
        for (int x : a) {
            dp2.assign(lo+1, 0);
            for (int y = x; y <= lo; ++y) dp2[y] = dp1[y-x];
            for (int y = 0; y+x <= lo; ++y) dp2[y] = (dp2[y] + dp1[y+x]) % mod;
            swap(dp1, dp2);
        }
        cout << lo << ' ' << accumulate(begin(dp1), end(dp1), 0ll) % mod << '\n';
    }
}