SUBSEQII - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: raysh_07
Tester: mridulahi
Editorialist: iceknight1093

DIFFICULTY:

TBD

PREREQUISITES:

Segment tree/fenwick tree

PROBLEM:

For an array B of length M, consider the following process:

  • For i = 1, 2, \ldots, M-1, M-1 in order, choose j\gt i and set B_j := \max(0, B_j - B_i).
    That is, choose some index j\gt i and subtract B_i from it, not going below 0.

f(B) is the maximum possible value of the sum of B after this process.
You’re given an array A. Find the sum of f(S) across all non-empty subsequences S of A.

EXPLANATION:

Just as in SUBSEQI, we first try to compute f(A) for a fixed A.

To do this, we analyze the very last move made, which is forced to be B_N := \max(0, B_N - B_{N-1}).
Let x_0 be the initial value of B_{N-1} (initial meaning before any subtraction moves are made at all), and y_0 be the initial value of B_N.
Observe that, just before the final move:

  • If B_{N-1} \geq B_N, then B_N will be set to 0.
    We incurred a loss of at least y_0 (everything at index N is gone).
  • If B_{N-1} \lt B_N, B_N won’t be set to 0; however we still incurred a loss of at least x_0.
    This is because (x_0 - B_{N-1}) was lost before the final move (subtracted from index N-1) and the final move loses the last B_{N-1} (subtracted from index N).

So, no matter what, our loss is \geq \min(x_0, y_0).
It’s not hard to see that it’s always possible to achieve exactly this loss.

Proof

Suppose x_0 \geq y_0, i.e, B_N is initially the smaller element among the two.
Then, simply make every move subtract from B_N, and it’ll be 0 in the end with all the other elements unchanged.

Conversely, if x_0\lt y_0, make every move a subtraction from B_{N-1} instead.
The final move is the forced subtraction of B_{N-1} from B_N, which brings the overall loss to exactly x_0.

That is, for array A of length N, f(A) = sum(A) - \min(A_N, A_{N-1}).
This works for N = 1 as well, if A_0 is treated as 0.


Now, we need to compute the above quantity across all subsequences S of A.
First, compute the sum of all subsequences of A, which is trivially

\sum_{i=1}^N A_i \cdot 2^{N-1}

We’ll then subtract the loss of each subsequence from this.

Consider some index i. There are two possibilities for this A_i to contribute to the loss of a subsequence

  • First, index i can be the second-last element of the subsequence, and the last element can be something strictly greater than it.
    If there are k_i elements strictly greater than A_i after it, the number of such subsequences is 2^{i-1} \cdot k_i, since we’re free to choose any subsequence of elements before i.
    Computing k_i quickly is a standard problem, since it’s essentially asking “how many elements are \gt A_i in this suffix?”, and is solvable using a segment tree, for example.
  • Second, index i can be the last element of the subsequence, and the second-last element should be something not smaller than it.
    In other words, we want the number of subsequences ending before index i, whose last element is \geq A_i.

Here’s one way to compute the latter quickly.
Let \text{dp}[y] denote the number of subsequences whose last element is y. Initially, this is filled with zeros.
Then, when processing A_i, do the following:

  • Compute \text{dp}[A_i] + \text{dp}[A_i + 1] + \ldots, which is the value we’re looking for (number of subsequences so far whose last element is \geq A_i
  • Then, add 2^{i-1} to \text{dp}[A_i], to account for all the subsequences ending at index i.

We want point additions and range sums, which a segment tree or fenwick tree can handle fast.
Note that the A_i values can be quite large, so you’d need to either use an implicit segment tree or compress the values first.

TIME COMPLEXITY:

\mathcal{O}(N\log N) per testcase.

CODE:

Author's code (C++)
#include <bits/stdc++.h>
using namespace std;
#define int long long
#define INF (int)1e18
#define f first
#define s second

mt19937_64 RNG(chrono::steady_clock::now().time_since_epoch().count());

int n;
const int N = 2e5 + 69;
const int mod = 1e9 + 7;
int f[N], a[N], p2[N], b[N];

void init(){    
    for (int i = 1; i <= n; i++) f[i] = 0;
}

void upd(int x, int v){
    for (int i = x; i <= n; i += i & (-i)){
        f[i] += v;
        f[i] %= mod;
    }
}

int query(int x){
    int res = 0;
    for (int i = x; i; i -= i & (-i)){
        res += f[i];
        res %= mod;
    }

    return res;
}

void Solve() 
{
    cin >> n;

    for (int i = 1; i <= n; i++){
        cin >> a[i];
    }

    p2[0] = 1;
    for (int i = 1; i <= n; i++) p2[i] = p2[i - 1] * 2 % mod;

    set <int> st;
    map <int, int> mp;
    for (int i = 1; i <= n; i++) st.insert(a[i]);
    int pr = 0;
    for (auto x : st){
        mp[x] = ++pr;
    }

    for (int i = 1; i <= n; i++){
        b[i] = mp[a[i]];
    }

    int ans = 0;
    for (int i = 1; i <= n; i++){
        ans += a[i] * p2[n - 1] % mod;
        ans %= mod;
    }

    init();

    for (int i = 1; i <= n; i++){
        // a[n] is min 
        // query >= stuff for a[n - 1] 
        // contribution will be 2 ^ (i - 1) 

        ans -= a[i] * (query(n) - query(b[i] - 1));
        ans %= mod;
        if (ans < 0) ans += mod;

        upd(b[i], p2[i - 1]);
    }

    init();

    for (int i = n; i >= 1; i--){
        // a[n - 1] is min 
        // query > stuff for a[n] 

        ans -= a[i] * (query(n) - query(b[i])) % mod * p2[i - 1];
        ans %= mod;
        if (ans < 0) ans += mod;

        upd(b[i], 1);
    }   

    cout << ans << "\n";
}

int32_t main() 
{
    auto begin = std::chrono::high_resolution_clock::now();
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    int t = 1;
    // freopen("in",  "r", stdin);
    // freopen("out", "w", stdout);
    
    cin >> t;
    for(int i = 1; i <= t; i++) 
    {
        //cout << "Case #" << i << ": ";
        Solve();
    }
    auto end = std::chrono::high_resolution_clock::now();
    auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
    cerr << "Time measured: " << elapsed.count() * 1e-9 << " seconds.\n"; 
    return 0;
}
Tester's code (C++)
#include<bits/stdc++.h>

using namespace std;

#define all(x) begin(x), end(x)
#define sz(x) static_cast<int>((x).size())
#define int long long

const int mod = 1e9 + 7;

struct SegtreeSum {

        int size;
        vector<int> sums;
        void init(int n) {
                size = 1;
                while (size < n) size *= 2;
                sums.assign(2 * size - 1, 0);
        }        
 
        void build (vector<int> &a, int x, int lx, int rx) {
                if (rx - lx == 1) {
                        if (lx < sz(a)) sums[x] = a[lx];
                        else sums[x] = 0;
                        return;
                }
                int m = (lx + rx) / 2;
                build(a, 2 * x + 1, lx, m);
                build(a, 2 * x + 2, m, rx);
                sums[x] = sums[2 * x + 1] + sums[2 * x + 2];
        }

        void build (vector<int> &a) {
                build(a, 0, 0, size);
        }
 
        void set (int i, int v, int x, int lx, int rx) {
                if (rx - lx == 1) {
                        sums[x] = v;
                        return;
                }
                int m = (lx + rx)/ 2;
                if (i < m) set(i, v, 2 * x + 1, lx, m);
                else set(i, v, 2 * x + 2, m, rx);
                sums[x] = sums[2 * x + 1] + sums[2 * x + 2];
        }

        void set (int i, int v) {
                set(i, v, 0, 0, size);
        }
 
        int sum (int l, int r, int x, int lx, int rx) {                
                if (lx >= r || l >= rx) return 0;
                if (rx - lx == 1) return sums[x];
                if (lx >= l && rx <= r) return sums[x];
                int m = (lx + rx) / 2;
                int s1 = sum(l, r, 2 * x + 1, lx, m);
                int s2 = sum(l, r, 2 * x + 2, m, rx);
                return s1 + s2;                
        }
 
        int sum (int l, int r) {
                if (r <= l) return 0;
                return sum(l, r, 0, 0, size);
        }
 
};


signed main() {

        ios::sync_with_stdio(0);
        cin.tie(0);
        cout.tie(0);

        int t;
        cin >> t;

        while (t--) {

                int n;
                cin >> n;
                int a[n];
                array<int, 2> b[n];
                for (auto &x : a) cin >> x;
                for (int i = 0; i < n; i++) b[i] = {a[i], i};

                sort(b, b + n);
                vector<int> v(n), v2(n);
                int pos[n];
                for (int i = 0; i < n; i++) {
                        v[i] = 1;
                        v2[i] = b[i][0];
                        pos[b[i][1]] = i;
                }
                SegtreeSum seg;
                SegtreeSum seg2;
                seg.init(n);
                seg.build(v);
                seg2.init(n);
                seg2.build(v2);

                int cn = 0;
                int c2 = 1;
                int ans = accumulate(a, a + n, 0ll);

                for (int i = 0; i < n; i++) {
                    
                        int x = a[i];
                        int less = seg.sum(0, pos[i]);
                        ans += (cn * less + x * c2 % mod * less) % mod; 
                        
                        int greatr = seg.sum(pos[i] + 1, n);
                        int greatrsum = seg2.sum(pos[i] + 1, n) % mod;
                        ans += (cn * greatr + greatrsum * c2) % mod;
                        ans %= mod;
                        
                        seg.set(pos[i], 0);
                        seg2.set(pos[i], 0);
                        cn = (2 * cn + x * c2) % mod;
                        c2 = 2 * c2 % mod;
                        
                }

                cout << ans << "\n";
                

        }
        

        
}