GASOLINE7 - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: raysh07
Tester: sushil2006
Editorialist: iceknight1093

DIFFICULTY:

TBD

PREREQUISITES:

Some combination of stacks, binary search, segment trees; though no one is strictly necessary

PROBLEM:

For an array A, define f(A, K) as the minimum cost to reach N+1 starting from 1 under the following conditions:

  • You start with 0 fuel.
  • When at i, you can buy 1 unit of fuel at i for cost A_i.
  • You can hold at most K units of fuel.
  • You must hold at least 1 unit of fuel when moving from i to i+1.
    1 unit of fuel will be consumed when making this move.

Given an array A, compute the sum of f(A[L, R], K) across all subarrays A[L, R] of A.

EXPLANATION:

Let’s first understand how to compute f(A, K) for a fixed array A.

One way to analyze this is to look at each part of the path independently.
That is, consider moving from i to i+1.
This requires one unit of fuel.
This unit of fuel has to come from somewhere - and we can further say it must come from one of the indices i, i-1, i-2,\ldots, i-K+1, i.e. within the last K indices of i.
To see why: note that we can safely assume that a unit of fuel that’s bought earlier also gets used up earlier (since units are identical once bought anyway); so when making the i \to i+1 move, we’ll definitely have used up any fuel from before i-K+1 (since we’ll have at most K-1 units of fuel left when entering i-K+1 in the first place.)

Clearly, the optimal choice is then to choose the lowest cost among these indices, i.e.
\min(A_i, A_{i-1}, \ldots, A_{i-K+1}).
More precisely, the last index must be \max(1, i-K+1), to account for i \lt K as well.

Thus, we have

f(A, K) = \sum_{i=1}^N \min(A_i, A_{i-1}, \ldots, A_{\max(1, i-K+1)})

Observe that this sum can be split into two independent parts:

  • For every prefix of A of length \lt K, we add its minimum to the cost once.
  • For every subarray of A of length exactly K, we add its minimum to the cost once.

To compute the sum of answers across all A[L, R], we can thus compute the contributions of each of these parts separately.


First, let’s deal with the subarrays of length K.

Consider the subarray [i, i+K-1] of A.
This subarray will appear in every A[L, R] such that L \le i and R \ge i+K-1, of which there are
i \cdot (N - (i + K - 1) + 1) choices.
For each such appearance, it will contribute \min(A_i, A_{i+1}, \ldots, A_{i+K-1}).

So, knowing the minimums of each length-K subarray of A will allow us to compute the overall contribution of them all in linear time afterwards.

Computing all these minimums quickly is fairly standard:

  • The sliding window minimum problem has a well-known linear-time solution using a stack.
    See here for example.
  • Alternately, a RMQ data structure (segment tree/sparse table) allows for quick querying of each subarray in at worst logarithmic time.

In any case, this part is fairly standard as long as you know at least one way of computing the minimums of all size-K windows.


Next, we need to deal with the shorter subarrays.
Specifically, for A[L, R], we’ll add

\min(A_L) + \min(A_L, A_{L+1}) + \ldots + \min(A_L, \ldots, A_{L+K-2})

Turning this around, observe that \min(A_i, \ldots, A_j) will be added exactly once for each R \ge j (since L = i must be fixed), for an overall multiplier of (N-j+1).

So, if we define S_i to be the sum of minimums of all subarrays with length less than K ending at i, the overall contribution of this part is simply

\sum_{i=1}^N S_i \cdot (N-i+1)

We thus only need to compute the S_i values.

There are several ways to do this, here’s one of them.

We iterate i = 1, 2, 3, \ldots, N in order.
While doing this, we also maintain an array M, such that M_j = \min(A_j, A_{j+1}, \ldots, A_i) i.e. the minimum of the subarray starting at j and ending at i.
Note that if we’re able to keep M updated, we simply have
S_i = M_i + M_{i-1} + \ldots + M_{i-K+2}.

Suppose we’ve finished processing index i-1, and now want to process index i.
Let’s see how the array M changes.

Let s \lt i be the largest index such that A_s \le A_i, i.e. the nearest smaller element to A_i.
Then,

  • For j \le s, M_j will stay unchanged.
  • For s \lt j \le i, we’ll have M_j = A_i instead.

So, M can be kept updated with a simple range-set operation.
Once this is done, we also need the sum of some range of elements of M.
Both of these operations can be handled quickly (in \mathcal{O}(\log N)) by a segment tree with lazy propagation.
Thus, all the S_i values can be computed in \mathcal{O}(N\log N) overall, after which we add the sum \sum_{i=1}^N S_i \cdot (N-i+1) to the answer.


Segment trees with lazy propagation aren’t strictly required: there are other approaches too, utilizing the fact that the M array of subarray minimums ending at i is non-decreasing.

This allows for algorithms where, for example, we keep segments of equal elements and update them appropriately with A_i - either using binary search or even in amortized linear time using a stack.

TIME COMPLEXITY:

\mathcal{O}(N) or \mathcal{O}(N\log N) per testcase.

CODE:

Editorialist's code (C++)
#include <bits/allocator.h>
#pragma GCC optimize("O3,unroll-loops")
#pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
#include "bits/stdc++.h"
using namespace std;
using ll = long long int;
mt19937_64 RNG(chrono::high_resolution_clock::now().time_since_epoch().count());

template<class T>
struct SparseTable {
    T f(T a, T b) {return min(a, b);}
    vector<vector<T>> jmp;
    SparseTable(const vector<T>& V) : jmp(1, V) {
        for (int pw = 1, k = 1; pw * 2 <= (int)V.size(); pw *= 2, ++k) {
            jmp.emplace_back(V.size() - pw * 2 + 1);
            for (int j = 0; j < (int)jmp[k].size(); ++j)
                jmp[k][j] = f(jmp[k - 1][j], jmp[k - 1][j + pw]);
        }
    }
    T query(int a, int b) {
        assert(a < b); // or return unit if a == b
        int dep = 31 - __builtin_clz(b - a);
        return f(jmp[dep][a], jmp[dep][b - (1 << dep)]);
    }
};

struct Node {
    using T = ll;
    T unit = 0;
    T f(T a, T b) { return a+b; }
 
    Node *l = 0, *r = 0;
    int lo, hi;
    T mset = 0;
    T val = unit;
    Node(int _lo,int _hi):lo(_lo),hi(_hi){}
    T query(int L, int R) {
        if (R <= lo || hi <= L) return unit;
        if (L <= lo && hi <= R) return val;
        push();
        return f(l->query(L, R), r->query(L, R));
    }
    void set(int L, int R, T x) {
        if (R <= lo || hi <= L) return;
        if (L <= lo && hi <= R) {
            mset = x;
            val = (hi-lo)*x;
        }
        else {
            push(), l->set(L, R, x), r->set(L, R, x);
            val = f(l->val, r->val);
        }
    }
    void push() {
        if (!l) {
            int mid = lo + (hi - lo)/2;
            l = new Node(lo, mid); r = new Node(mid, hi);
        }
        if (mset)
            l->set(lo,hi,mset), r->set(lo,hi,mset), mset = 0;
    }
};

int main()
{
    ios::sync_with_stdio(false); cin.tie(0);

    int t; cin >> t;
    while (t--) {
        int n, k; cin >> n >> k;
        vector a(n, 0);
        for (int &x : a) cin >> x;

        
        const int mod = 998244353;
        ll ans = 0;
        SparseTable ST(a);
        for (int i = 0; i+k <= n; ++i) {
            int mn = ST.query(i, i+k);
            ans += 1ll*mn*(i+1)*(n - (i+k-1));
            ans %= mod;
        }

        Node *seg = new Node(0, n);
        vector<int> stk;
        for (int i = 0; i < n; ++i) {
            while (!stk.empty()) {
                int u = stk.back();
                if (a[u] > a[i]) stk.pop_back();
                else break;
            }
            int j = 0;
            
            if (!stk.empty()) {
                j = stk.back() + 1;
            }
            j = max(j, i-k+2);
            seg -> set(j, i+1, a[i]);
            ans += seg -> query(max(0, i-k+2), i+1) * (n - i) % mod;

            stk.push_back(i);
        }
        cout << ans % mod << '\n';
    }
}