SORTSUB7 - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: raysh07
Tester: sushil2006
Editorialist: iceknight1093

DIFFICULTY:

Medium

PREREQUISITES:

Stacks or segment trees, binary search

PROBLEM:

An array B is good if it can be converted to a strictly increasing array by performing the following operation several times:

  • Choose an index i and a positive integer X, and replace B_i by B_i\bmod X.

Given an array A, count the number of its subarrays that are good.
Here, N \leq 2\cdot 10^5.

EXPLANATION:

Recall the solution to the easy version: solving for a single array has a simple linear greedy solution by observing that A_i can be either left alone, or converted to any integer in \left[0, \left\lceil\frac{A_i}{2}\right\rceil - 1\right].

Applying this to each subarray results in an \mathcal{O}(N^3) algorithm, which we now need to optimize.


First, note that if an array is good, then every prefix of that array is also good; and if an array is not good, then any array that it is a prefix of cannot be good either.

This means, if we fix L to be the left endpoint of the subarray, the set of right endpoints R such that A[L\ldots R] is good will be a contiguous segment starting at L.
In particular, this means we only need to know the rightmost valid right endpoint for L.

With this in mind, let’s define R_i to be the largest index such that A[i\ldots R_i] is good.
Our goal now shifts to computing all the values of R_i.


To do this, let’s analyze the greedy algorithm a bit closer, or more specifically how exactly it modifies some array B.
What it will end up doing is the following:

  • Reduce some prefix of B to the values 0, 1, 2, 3, \ldots
  • Some index j_1 will be the first such that B_{j_1} isn’t reduced.
    The values from here will be B_{j_1}, B_{j_1}+1, B_{j_1}+2, \ldots for a while.
  • Once again there will be an index j_2 which isn’t reduced, resulting in the values being B_{j_2}, B_{j_2}+1, \ldots for a while, and so on.

In terms of what this means for subarrays of A: let’s fix i to be the starting index; then we’d need to find the first j_1 \gt i such that A_{j_1} cannot be reduced and must remain the same.
However, if it isn’t being modified, what happens from j_1 onwards is independent of i.

So, let’s define dp_i to be the largest integer k such that A[i \ldots k] can be made strictly increasing, without changing A_i.

Assuming we’re able to compute all the dp_i values, they can then be used to compute the R_i values as follows:

  • Find (somehow) the first index k \gt i such that A_j cannot be reduced.
  • The values before k are now 0, 1, 2, \ldots, k-i-1.
  • If A_k \leq k-i-1, longer subarrays won’t work.
    This means we have R_i = k-1.
  • Otherwise, some longer subarrays will work: in particular, any subarray starting at k that can be made increasing, while not changing A_k itself.
    This is dp_k by definition, so we have R_i = dp_k.

Let’s first figure out how to compute, for a fixed i, the first index k \gt i which can’t be reduced. After all, this is central to our strategy.

For this to happen, for each j in [i, k-1], A_j should be able to be reduced to j-i.
This can happen if and only if \left\lceil \frac{A_j}{2} \right\rceil - 1 \geq j - i.
This can be rewritten to \left\lceil \frac{A_j}{2} \right\rceil - 1 - j\geq - i which separates i from j.

So, if we define a new array B with B_i = \left\lceil \frac{A_i}{2} \right\rceil - 1, we’re looking for the first index k\geq i such that B_k \lt -i.
This is a standard problem, and can be solved in a few different ways:

  • One way is to use a monotone stack, just as in the “next smaller element” problem.
  • Alternately you could use a segment tree, with the complexity being either \mathcal{O}(\log N) or \mathcal{O}(\log^2 N) depending on whether you walk down the tree or just binary search and query.

Either way, we’re able to compute the necessary index k \gt i quickly enough now.
So, if we know all the dp_i values, we’ll be able to compute all the R_i values and hence solve the problem.


Finally, let’s look at computing the dp_i values.

It’s not hard to see that this is really pretty much the same thing as before: if i is fixed, we’ll have A_i, A_i+1, A_i+2, \ldots for a while, and if k is the first index that doesn’t follow this, either dp_i = k-1 or dp_i = dp_k, depending on how A_k compares to A_i + k-i-1.

In particular, if we consider the array B defined earlier, the same analysis shows that we’re looking for the first index k \gt i such that B_k \lt A_i - i.
This is pretty much the exact same problem, just with a different query value – and so is solved in the exact same way too.

So, we’re able to compute all the dp_i values quickly, and using them all the R_i values.
The final answer is just the sum of R_i - i + 1 across all i.

TIME COMPLEXITY:

\mathcal{O}(N) \sim \mathcal{O}(N\log^2 N) per testcase.

CODE:

Editorialist's code (C++)
// #include <bits/allocator.h>
// #pragma GCC optimize("O3,unroll-loops")
// #pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
#include "bits/stdc++.h"
using namespace std;
using ll = long long int;
mt19937_64 RNG(chrono::high_resolution_clock::now().time_since_epoch().count());

int main()
{
    ios::sync_with_stdio(false); cin.tie(0);

    int t; cin >> t;
    while (t--) {
        int n; cin >> n;
        vector a(n, 0);
        for (int &x : a) cin >> x;

        vector dp1(n, 0), dp2(n, 0);
        vector<array<int, 2>> stk;
        for (int i = n-1; i >= 0; --i) {
            auto it = upper_bound(begin(stk), end(stk), array{a[i] - i, -1});
            if (it == begin(stk)) dp2[i] = n-i;
            else {
                auto [val, j] = *prev(it);
                dp2[i] = j - i;
                if (a[j] >= a[i] + (j - i)) dp2[i] += dp2[j];
            }

            it = upper_bound(begin(stk), end(stk), array{-i, -1});
            if (it == begin(stk)) dp1[i] = n-i;
            else {
                auto [val, j] = *prev(it);
                dp1[i] = j - i;
                if (a[j] >= (j - i)) dp1[i] += dp2[j];
            }
            
            while (!stk.empty()) {
                auto [val, j] = stk.back();
                if (val >= (a[i] + 1)/2 - i - 1) stk.pop_back();
                else break;
            }
            stk.push_back({(a[i] + 1)/2 - i - 1, i});
        }
        cout << accumulate(begin(dp1), end(dp1), 0ll) << '\n';
    }
}

I think the definition of B_i should be \frac{A_i - 1}{2} - 1 - i.

Can anyone please tell more about how to solve it using segment tree?

I used a simple queue to solve this problem.
For a fixed right point R, when left point L moves from R to 0, A_R's final value will first increase continuously from 0 to some value \le \left\lceil \frac{A_R}{2} \right\rceil - 1 ( we call it the first type), then jump to A_R (we call it the second type), then the subarray becomes unsortable.
So, in a queue, we maintain the final value vs. the range of L which makes A_R become this final value. When move from R to R+1, we will repeatly check the last value of the queue. If it is larger than or equal to A_R, throw it. If it is larger than or equal to \left\lceil \frac{A_R}{2} \right\rceil - 1, pop it and insert back A_R, otherwise break loop. Finally we insert \{0,R,R\} to the queue front for single element \{A_R\}. This is similar as a monotonic stack, so time complexity is O(N).
Notice that, the first type values increase 1 when R moves to R+1. But we can’t modify each element in the queue every time because it’s a O(N^2) operation. Instead, we store real\_value - i in queue, and use value+i as real value.
My submission here.