PAIRMAKING - Editorial

iceknight1093 · January 29, 2025, 4:30pm

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author:
Tester: sushil2006
Editorialist: iceknight1093

DIFFICULTY:

Medium

PREREQUISITES:

Dynamic programming

PROBLEM:

You’re given an array A of length N, and an odd integer M.
Define f(A) to be the number of unordered pairs (i, j) such that A_i + A_j = M.

In one move, you can replace any element of A with an integer of your choice. Find the minimum number of operations needed to make f(A) \geq K.

EXPLANATION:

Let \text{freq}[x] denote the frequency of element x, i.e. the number of times x appears in A.
Observe that we have

f(A) = \sum_{x=1}^{\frac{M-1}{2}} \text{freq}[x] \cdot \text{freq}[M-x]

Each operation essentially allows us to choose two different integers x and y such that \text{freq}[y] \gt 0, and increase \text{freq}[x] by 1 while reducing \text{freq}[y] by 1 (corresponding to changing an occurrence of y to x).

Let’s now work with pairs of (x, M-x) only.

First, we make an observation about structure: in an optimal solution, there will be exactly one pair (x, M-x) such that the frequencies of these two elements will be increased: everything else will see a decrease.
This can be proved with a simple exchange argument: if there are multiple pairs with frequency increases, instead move all other increases to the pair with the largest value of \max(\text{freq}[x], \text{freq}[M-x]), and the answer doesn’t worsen.

Next, let’s look at a specific pair (x, M-x), suppose with frequencies (a, b) with a \geq b.
Then,

Suppose we’re adding to these frequencies. Then, it’s optimal to repeatedly add to b till it reaches a, and then alternate between a and b.
This follows from the fact that (a+1)\cdot (b-1) \lt a\cdot b when a \gt b, so in an optimal situation we’ll never end up in a situation where |a-b| \gt 1 and yet we added to a.
Similarly, suppose we’re subtracting from this pair - it’s then optimal to reduce a till it reaches b, then alternately decrease a and b.
This leads to the minimum “loss” in pairs, and can be proved in similar fashion.

Now, we’ve established that only one pair of elements will have their frequencies increased - though we don’t know yet which pair this is.
Rather than try to figure out which one to choose, we can simply try all of them!

We start with a slow solution first.
Let’s fix the pair (x, M-x), and say we want to perform r frequency increases here.
We already know how best to perform these increases: what’s remaining is to find the optimal way to perform r decreases from the other elements.

One way to do this is with dynamic programming.
Let’s define dp[i][j] to be the minimum possible decrease in the answer, if we’ve performed j removals from the first i pairs (of course, excluding (x, M-x)).
Transitions are quite natural: to compute dp[i][j], look at all possible decreases of the i-th pair (say y), and update dp[i][j] with \min(dp[i][j], dp[i-1][j-y] + rem[i][y]), where rem[i][y] is the cost of performing y removals here.

The major issue with this approach is complexity: there are already O(NM) ways to fix the pair (x, M-x) and value r; and then for each one we have a DP that looks to be O(MN^2), having O(NM) states with O(N) transitions from each.

However, with some observations, this can be optimized.

Looking at the transitions of the DP, we see that y only needs to be iterated from 0 to the sum of frequency of elements in the pair - in particular, for a fixed j, the total number of iterations across all i will simply equal the sum of frequencies plus the number of pairs, which is N + M/2.
So, the DP really takes O(N^2 + NM) time, shaving a linear factor off already.
The DP table doesn’t really need to be recomputed every single time - note that it’s completely independent of r, so once (x, M-x) is fixed, the DP table only needs to be computed once, and can then be reused for all r.
This brings the complexity to O(NM(N+M)) which is good enough for the constraints (note that M is itself O(N)).

There is, however, one detail missed by the above solution - the fact that it’s possible perform moves within a pair.
For instance, if a pair has frequencies (100, 0), it might be beneficial to make a few moves within it, to transfer some frequency from the higher one to the lower one and obtain say (75, 25).
Note that this doesn’t contradict our initial observation of “only one pair will have additions done to its frequency”.

The solution so far doesn’t account for this at all, so let’s attempt to throw it in.

We fix the pair (x, M-x), and then compute the DP for removals of all other pairs.
Then, fix the number of self-moves made, say s.
Then, fix the number of removals from other pairs, r.
Finally, update the answer with the current count of pairs - which can be computed using a bit of math and looking up the DP table.

This takes O(N^3) time, considering that M is O(N).

Note that there are other ways to implement this to reach cubic complexity.
For example, instead of re-doing the DP O(N) times, it’s also possible to do it just twice: once for the prefix and once for the suffix.
It then takes O(N) time to combine the prefix and suffix DPs for when we want to look at a specific (x, M-x) pair - but the complexity ends up being the same.

TIME COMPLEXITY:

\mathcal{O}(N^3) per testcase.

CODE:

Author's code (C++)

#include <bits/stdc++.h>
using namespace std;
#define int long long
#define INF (int)1e18

mt19937_64 RNG(chrono::steady_clock::now().time_since_epoch().count());

void Solve() 
{
    int n, m, need; cin >> n >> m >> need;
    
    int k = m / 2;
    vector <int> a(k + 2, 0), b(k + 2, 0);
    for (int i = 1; i <= n; i++){
        int x; cin >> x;
        if (x <= m / 2){
            a[x]++;
        } else {
            b[m - x]++;
        }
    }
    
    vector<vector<int>> pref(k + 2, vector<int>(n + 2, INF));
    vector<vector<int>> suf(k + 2, vector<int>(n + 2, INF));
    pref[0][0] = 0;
    suf[k + 1][0] = 0;
    
    for (int i = 1; i <= k; i++){
        pref[i] = pref[i - 1];
        for (int j = 0; j <= n; j++) if (pref[i - 1][j] != INF){
            int c = a[i], d = b[i], lost = 0;
            int go = c + d;
            
            for (int k = 0; k <= go; k++){
                pref[i][j + k] = min(pref[i][j + k], pref[i - 1][j] + lost);
                
                if (c > d){
                    lost += d;
                    c--;
                } else {
                    lost += c;
                    d--;
                }
                
            }
        }
    }
    
    for (int i = k; i >= 1; i--){
        suf[i] = suf[i + 1];
        for (int j = 0; j <= n; j++) if (suf[i + 1][j] != INF){
            int c = a[i], d = b[i], lost = 0;
            int go = c + d;
            
            for (int k = 0; k <= go; k++){
                suf[i][j + k] = min(suf[i][j + k], suf[i + 1][j] + lost);
                
                if (c > d){
                    lost += d;
                    c--;
                } else {
                    lost += c;
                    d--;
                }
            }
        }
    }
    
    vector<vector<int>> dp(k + 2, vector<int>(n + 1, INF));
    for (int i = 1; i <= k; i++){
        for (int j = 0; j <= n; j++) if (pref[i - 1][j] != INF){
            for (int k = 0; k <= n; k++) if (suf[i + 1][k] != INF){
                dp[i][j + k] = min(dp[i][j + k], pref[i - 1][j] + suf[i + 1][k]);
            }
        }
    }
    
    int base = 0;
    for (int i = 1; i <= k; i++){
        base += a[i] * b[i];
    }
    need -= base;
    
    for (int ans = 0; ans <= n; ans++){
        int mx = 0;
        for (int i = 1; i <= k; i++){
            for (int s = 0; s <= ans; s++){
                for (int it = 0; it < 2; it++){
                    swap(a[i], b[i]);
                    
                    int og = a[i] * b[i];
                    int x = a[i] - s, y = b[i] + s;
                    if (x < 0) continue;
                    
                    int have = (ans - s);
                    
                    if (x > y) swap(x, y);
                    if (y >= x + have){
                        x += have;
                    } else {
                        have -= (y - x);
                        x = y;
                        x += have / 2;
                        y += (have + 1) / 2;
                    }
                    
                    int got = x * y;
                    got -= og;
                    got -= dp[i][ans - s];
                    
                    mx = max(mx, got);
                }
            }
        }
        
        if (mx >= need){
            cout << ans << '\n';
            return;
        }
    }
    cout << -1 << "\n";
}

int32_t main() 
{
    auto begin = std::chrono::high_resolution_clock::now();
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    int t = 1;
    // freopen("in",  "r", stdin);
    // freopen("out", "w", stdout);
    
    cin >> t;
    for(int i = 1; i <= t; i++) 
    {
        // cout << "Case #" << i << ": \n";
        Solve();
    }
    auto end = std::chrono::high_resolution_clock::now();
    auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
    cerr << "Time measured: " << elapsed.count() * 1e-9 << " seconds.\n"; 
    return 0;
}

Tester's code (C++)

#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

using namespace std;
using namespace __gnu_pbds;

template<typename T> using Tree = tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update>;
typedef long long int ll;
typedef long double ld;
typedef pair<int,int> pii;
typedef pair<ll,ll> pll;

#define fastio ios_base::sync_with_stdio(false); cin.tie(NULL)
#define pb push_back
#define endl '\n'
#define sz(a) (int)a.size()
#define setbits(x) __builtin_popcountll(x)
#define ff first
#define ss second
#define conts continue
#define ceil2(x,y) ((x+y-1)/(y))
#define all(a) a.begin(), a.end()
#define rall(a) a.rbegin(), a.rend()
#define yes cout << "Yes" << endl
#define no cout << "No" << endl

#define rep(i,n) for(int i = 0; i < n; ++i)
#define rep1(i,n) for(int i = 1; i <= n; ++i)
#define rev(i,s,e) for(int i = s; i >= e; --i)
#define trav(i,a) for(auto &i : a)

template<typename T>
void amin(T &a, T b) {
    a = min(a,b);
}

template<typename T>
void amax(T &a, T b) {
    a = max(a,b);
}

#ifdef LOCAL
#include "debug.h"
#else
#define debug(...) 42
#endif

/*



*/

const int MOD = 1e9 + 7;
const int N = 1e5 + 5;
const int inf1 = int(1e9) + 5;
const ll inf2 = ll(1e18) + 5;

void solve(int test_case){
    ll n,m,k; cin >> n >> m >> k;
    vector<ll> a(n+5);
    rep1(i,n) cin >> a[i];
    vector<ll> cnt(m+5);
    rep1(i,n) cnt[a[i]]++;

    vector<pll> pairs;
    pairs.pb({-1,-1});
    pairs.pb({0,0});
    rep1(i,m){
        if(i >= m-i) break;
        pairs.pb({cnt[i],cnt[m-i]});
    }
    
    // check for 0 ops
    ll siz = sz(pairs)-1;
    {
        ll cnt = 0;
        rep1(i,siz){
            auto [x,y] = pairs[i];
            cnt += x*y;
        }

        if(cnt >= k){
            cout << 0 << endl;
            return;
        }
    }

    ll pdp[siz+5][n+5], sdp[siz+5][n+5];
    memset(pdp,-0x3f,sizeof pdp);
    memset(sdp,-0x3f,sizeof sdp);
    pdp[0][0] = 0, sdp[siz+1][0] = 0;

    rep(i,siz){
        auto [x,y] = pairs[i+1];
        ll s = x+y;
        rep(c,s+1){
            if(c){
                if(x > y) x--;
                else y--;
            }

            rep(j,n+1){
                if(j+c > n) break;
                amax(pdp[i+1][j+c],pdp[i][j]+x*y);
            }
        }
    }

    rev(i,siz+1,2){
        auto [x,y] = pairs[i-1];
        ll s = x+y;
        rep(c,s+1){
            if(c){
                if(x > y) x--;
                else y--;
            }

            rep(j,n+1){
                if(j+c > n) break;
                amax(sdp[i-1][j+c],sdp[i][j]+x*y);
            }
        }
    }

    ll ans = inf2;

    rep1(i,siz){
        // all removed guys moved to the ith pair
        auto [x1,y1] = pairs[i];
        rep(jl,n+1){
            if(jl){
                if(x1 < y1) x1++;
                else y1++;
            }

            ll x2 = x1, y2 = y1;

            rep(jr,n+1){
                if(jr){
                    if(x2 < y2) x2++;
                    else y2++;
                }

                ll val = pdp[i-1][jl]+sdp[i+1][jr]+x2*y2;
                if(val >= k){
                    amin(ans,(ll)jl+jr);
                    conts;
                }
                if(x2 == y2) conts;

                ll d = abs(x2-y2)-1;
                ll lo = 1, hi = ceil2(d,2);
                ll res = -1;

                while(lo <= hi){
                    ll mid = (lo+hi)>>1;
                    // d+(d-2)+...+(d-2*(mid-1))
                    ll add = d*mid-mid*(mid-1);
                    if(val+add >= k){
                        res = mid;
                        hi = mid-1;
                    }
                    else{
                        lo = mid+1;
                    }
                }

                if(res != -1){
                    amin(ans,jl+jr+res);
                }
            }
        }
    }

    if(ans == inf2) ans = -1;
    cout << ans << endl;
}

int main()
{
    fastio;

    int t = 1;
    cin >> t;

    rep1(i, t) {
        solve(i);
    }

    return 0;
}

n0sk1ll · January 29, 2025, 5:18pm

I didn’t use DP in my approach.

Let’s say we have a multiset of values we want to remove, and then change to something else. What should we change each value to? Greedily, find the element with most occurrences and “boost” it.

How to choose the initial multiset? An implementation trick comes in handy. Fix the element x you want to “boost”. Also fix the pair (x,M-x) along with the number of their occurrences. In the remaining array, greedily choose to take the element that decreases f(A) the least. Additionally, spend some operations on x and M-x. “Make them closer to each other”.

To properly implement it, use three for loops:

for fixed_element in range(m/2):
for operation1 in range(n):
for operation2 in range(n):
if valid():
ans=min(ans,operation1+operation2)

iceknight1093 · January 30, 2025, 5:36am

Your greedy is incorrect, unfortunately it seems the tests were weak and didn’t catch it

Here is a generator that hacks your code - the answer to this test is 5, while you output 6.

n0sk1ll · January 30, 2025, 7:27am

I see. The first step about “boosting” the maximum may be correct. But, the greedy fails to find the optimal starting multiset. Probably because it is impossible.

Thank you for the contest. And also thank you for spending time to hack my fake solve. I will study the DP solution and try to upsolve the problem.

(p.s. I suck at DP)