MEANDIFF2 - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: am_aadvik
Tester: raysh07
Editorialist: iceknight1093

DIFFICULTY:

Medium

PREREQUISITES:

Binary search, segment tree/fenwick tree

PROBLEM:

Define f(S) to be the maximum of |x - \text{avg}(S)| across all x \in S.
Given an array A, solve this problem for each prefix of A, denoted A[1, i]:

  • Compute the maximum possible value of f(S) across all subsequences of A[1, i].

EXPLANATION:

Recall from the easy version that solving the problem for a single array had a simple form: either take the minimum element along with some suffix of the largest elements, or take the maximum element with some prefix of the smallest elements.
This gave 2N choices to try, and we were able to just check them all by iterating through the prefix/suffix.

This now needs to be adapted to solve for each prefix of A.
To update our algorithm to solve this version, we need an algorithm that works a bit better than “try everything”.


Let’s look at just one ‘type’ of set: say, the one where we always include the minimum and then choose some suffix of sorted elements to maximize the average.

For the (sorted) array A, let’s look at what exactly happens here.
Let’s start with just the elements A_1 and A_N.
The initial mean is \frac{A_1+A_N}{2}.
Let this be M_N.

If A_{N-1} \gt M_N, then including A_{N-1} into the suffix will only increase the mean (which is good; since our aim here is to maximize the mean given that we’re looking only at distance to A_1).
If A_{N-1} \le M_N, then including A_{N-1} is not helpful - and it’s not hard to see that including any smaller element is not helpful either, since it can only decrease the mean from M_N.

If it’s optimal to include A_{N-1}, then our mean becomes M_{N-1} = \frac{A_1 + A_N + A_{N-1}}{3}.
Once again, we perform the same check with A_{N-2}: if it’s larger than the mean, including it is good; if it’s smaller, including it is bad (and including any further elements is also bad).

Observe that this in fact tells us exactly which suffix to choose as well!
If we define M_i = \frac{A_1 + A_N + A_{N-1} + \ldots + A_i}{N-i+2} to be the mean if we take the suffix starting at i, then the above algorithm can be rephrased as follows:

  • Let i be the largest index such that A_i \le M_{i+1}.
    Then, the optimal suffix to choose is the one starting at i+1.

(Recall that we’re working with the sorted array A here.)
In fact, we can prove something a bit stronger: if i is the largest index such that A_i \le M_{i+1}, then for all j \le i, we’ll have A_j \le M_{j+1} as well!
(This can be proved with simple algebra, using the fact that if you have a set of elements with mean M and insert a new element X to obtain a new mean M', then if M \gt X you’ll have M' \gt X and if M \lt X you’ll have M' \lt X as well.)

That is, the condition of A_i \le M_{i+1} is monotonic with respect to i!


Monotonicity is a very helpful condition, because it allows for the use of binary search.
That is, we now have the following solution for a (sorted) array A:

  • Use binary search to find the largest index i such that A_i \le M_{i+1}.
    Computing M_{i+1} can be done in constant time if we know the appropriate suffix sum.
  • Once this i is found, the answer is then just M_{i+1} - A_1.

This is the solution that we will extend to solving for every prefix.

Now, we certainly don’t have the time to sort and build suffix sums for every prefix of A.
However, we don’t exactly need that: rather, what we need is to just be able to compute the sum of the largest k elements of a given prefix.
That’s a lot more manageable, and can be handled quickly using a segment tree/fenwick tree.

Specifically, we can do the following.
First, for the input array A, compute an array pos, where pos_i denotes the position of A_i if A were to be sorted (if there are repeat values in A, break ties in ascending order of index.)
For example, if A = [5, 7, 2, 7] then pos = [2, 3, 1, 4].

Also maintain an array B of length N, initially filled with zeros.
Now, for each i = 1, 2, 3, \ldots, N,

  • Let x = pos_i.
    Set B_x = A_i.
    This ‘activates’ the element A_i in the sorted array; with inactive elements being zeros.
  • Now, solve the problem for the prefix of length i.
    This requires a binary search to find the optimal suffix size; and each iteration of the binary search requires a query of the form “given k, find the sum of the largest k elements in the prefix”.
  • However, observe that this sum now corresponds to some suffix of B!
    That’s because the largest k elements of the prefix will all be contiguous in B (separated by zeros).
    So, we only need to find the k-th largest element (which is a standard problem), and then find the appropriate suffix sum.

Maintaining B requires a data structure that supports quickly updating a position and returning a suffix sum; which is exactly what a segment tree or fenwick tree can do.

This gives a solution in \mathcal{O}(\log^2 N) per prefix, which is fast enough for the given limits.
It’s possible to optimize this to \mathcal{O}(\log N) per prefix using ‘segment tree descent’ to combine the binary search and queries, though that isn’t needed to get AC.


You may note that the entirety of the above solution was done assuming that we’re dealing only with \text{mean} - \min.
There’s also the case of \max - \text{mean} to consider.
However, that can be solved similarly (in fact, using the exact same code: you can just multiply every element of A by -1 and run the algorithm for min on this array).
After running both, take the best answer for each prefix.

TIME COMPLEXITY:

\mathcal{O}(N\log N) or \mathcal{O}(N\log^2 N) per testcase.

CODE:

Tester's code (C++)
#include <bits/stdc++.h>
using namespace std;
#define int long long
#define INF (int)1e18

mt19937_64 RNG(chrono::steady_clock::now().time_since_epoch().count());

struct FenwickTree{
    int n;
    vector <int> f;
    vector <int> b;

    inline void add(int i, int x){
        b[i] += x;
        for (int j = i; j <= n; j += j & (-j)){
            f[j] += x;
        }
    }

    inline void modify(int i, int x){
        add(i, x - b[i]);
    }

    inline void init(int nn, vector <int> a){
        n = nn;
        if (a.size() == n){
            vector <int> a2;
            a2.push_back(0);
            for (auto x : a) a2.push_back(x);
            a = a2;
        }

        f.resize(n + 1);
        b.resize(n + 1);

        for (int i = 0; i <= n; i++) f[i] = 0, b[i] = 0;

        for (int i = 1; i <= n; i++){
            modify(i, a[i]);
        }
    }

    inline int query(int x){
        int ans = 0;
        for (int i = x; i; i -= i & (-i)){
            ans += f[i];
        }
        return ans;
    }

    inline int query(int l, int r){
        return query(r) - query(l - 1);
    }
};

void Solve() 
{
    int n; cin >> n;
    
    vector <pair<int, int>> a(n + 1);
    vector <int> b(n + 1);
    for (int i = 1; i <= n; i++){
        cin >> b[i];
        a[i].first = b[i];
        a[i].second = i;
    }
    
    sort(a.begin() + 1, a.end());
    
    vector <int> ord(n + 1);
    for (int i = 1; i <= n; i++){
        ord[a[i].second] = i;
    }
    
    set <int> s;
    
    FenwickTree fen;
    vector <int> vv(n);
    fen.init(n, vv);
    
    FenwickTree cnt;
    cnt.init(n, vv);
    
    for (int i = 1; i <= n; i++){
        int index = ord[i];
        s.insert(index);
        
        fen.add(index, b[i]);
        cnt.add(index, 1);
        
        if (i == 1){
            cout << 0 << " \n"[i == n];
            continue;
        }
        
        int l = *s.begin();
        int r = *(--s.end());
        
        int ans = 0;
        
        {
            // ans = (a[1] + a[2] + ... + a[i] + a[r]) / (i + 1) 
            // minimize? 
            
            int lo = l, hi = (r - 1);
            while (lo != hi){
                int mid = (lo + hi + 1) / 2;
                
                auto id = s.upper_bound(mid);
                --id;
                
                int k = *id;
                
                if (k == l){
                    lo = mid;
                    continue;
                }
                
                // is it good to take k 
                int S = fen.query(l, k - 1) + a[r].first;
                int C = cnt.query(l, k - 1) + 1;
                
                if (a[k].first * C < S){
                    lo = mid;
                } else {
                    hi = mid - 1;
                }
            }
            
            int S = fen.query(l, lo) + a[r].first;
            int C = cnt.query(l, lo) + 1;
            
            ans = max(ans, a[r].first - (S / C));
        }
        
        {
            int lo = l + 1, hi = r;
            
            while (lo != hi){
                int mid = (lo + hi) / 2;
                
                auto id = s.lower_bound(mid);
                int k = *id;
                
                if (k == r){
                    hi = mid;
                    continue;
                }
                
                int S = fen.query(k + 1, r) + a[l].first;
                int C = cnt.query(k + 1, r) + 1;
                
                if (a[k].first * C > S){
                    hi = mid;
                } else {
                    lo = mid + 1;
                }
            }
            
            int S = fen.query(lo, r) + a[l].first;
            int C = cnt.query(lo, r) + 1;
            
            ans = max(ans, (S / C) - a[l].first);
        }
        
        cout << ans << " \n"[i == n];
    }
}

int32_t main() 
{
    auto begin = std::chrono::high_resolution_clock::now();
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    int t = 1;
    // freopen("in",  "r", stdin);
    // freopen("out", "w", stdout);
    
    cin >> t;
    for(int i = 1; i <= t; i++) 
    {
        //cout << "Case #" << i << ": ";
        Solve();
    }
    auto end = std::chrono::high_resolution_clock::now();
    auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
    cerr << "Time measured: " << elapsed.count() * 1e-9 << " seconds.\n"; 
    return 0;
}