MORE_INV - Editorial


Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: yash_daga
Tester and Editorialist: iceknight1093




Inversion counting, Segment trees (with lazy propagation)


You’re given an array A. For each 1 \leq i \leq N independently, find the following:

  • If you can change A_i to any real number, what’s the maximum possible number of inversions in A?


Let x_1 \lt x_2 \lt \ldots \lt x_k be the distinct values be present in A.

The main observation required to solve this task is as follows:
Suppose we set A_i = x, then there are only about 2k + 1 ‘interesting’ values of x, namely:

  • One of the x_i; or
  • Something less than x_1 (they’re all equivalent, so we can treat this as x_1 - 1); or
  • Something greater than x_k (once again they’re all equivalent, so we can consider this to be x_k + 1); or
  • Something strictly between x_i and x_{i+1} for some i.
    Once again, we can consider this to just be x_i + 0.5.

Let’s apply this information to solve the problem.

Subtask 1

The easiest subtask allows for pretty much brute-force computation.

As discussed above, there are at worst 2N+1 distinct values of x we care about.
So, for each index i:

  • Set A_i = x for each of the ‘interesting’ x.
  • Compute the number of inversions in the resulting array.
  • Print the maximum across all x.

Even if inversions are computed in \mathcal{O}(N^2) each time, this will run in \mathcal{O}(N^4) time overall which is enough to pass this subtask.

Subtask 2

N is larger now, though still small enough that \mathcal{O}(N^2) or similar can pass.
Earlier, for each i, we set A_i = x and computed the number of inversions in \mathcal{O}(N^2).

Let’s optimize the second part.
When we set A_i = x, notice that inversion pairs that don’t involve index i don’t really change.
So, we don’t really need to recompute the total inversion number each time.

In particular,

  • Let \text{inv} be the number of inversions in the original array A.
    Let B_i be the number of inversions involving index i.
  • When we set A_i = x,
    • Suppose there are y elements before index i that are larger than x.
    • Suppose there are z elements after index i that are smaller than x.
    • Then, the number of inversions in the new array is exactly \text{inv} - B_i + y + z.

Of these quantities, \text{inv} can be precomputed in \mathcal{O}(N^2), and B_i for all i can be computed at the same time (whenever you see i \lt j with A_i \gt A_j, add one to both B_i and B_j).
So, if we were able to quickly compute y and z, we’d have a very quick algorithm.

There are several ways to do this, for example:

  • Let P_i be a sorted list of the elements [A_1, A_2, \ldots, A_i].
    Similarly, let S_i be a sorted list of [A_i, A_{i+1}, \ldots, A_N].
  • Then, when fixing A_i = x, y and z can be computed by binary searching on P_{i-1} and S_{i+1} respectively; to find the number of elements in the prefix/suffix that are larger/smaller than x.

Since N \leq 1000, computing all the P_i and S_i lists using \mathcal{O}(N^2) memory is not an issue.

We’re binary searching \mathcal{O}(N^2) times for a solution in \mathcal{O}(N^2 \log N) overall, which is enough to get AC.
It’s possible write the solution to run in \mathcal{O}(N^2) as well (using two-pointers), but that isn’t necessary to get AC.

Subtask 3

We can no longer independently try every x for each position.
However, we can still maintain enough information for each one.

Recall that when fixing A_i = x, we had the number of inversions be \text{inv} - B_i + y + z; where y was the number of elements larger than x before i, and z was the number of elements smaller than x after i.

First off, \text{inv} and all the B_i can be computed in \mathcal{O}(N \log N) time: this is a standard application of mergesort or segment/fenwick trees or (in C++) the order-statistics set.
A couple of articles detailing this can be found here (segment tree approach) and here (mergesort approach).

Now, let’s define C_x to be the corresponding y+z value for when we set A_i = x.
If we knew all the C_x values for index i, we’d just need to take the maximum of them, say M, and the answer would be \text{inv} - B_i + M.

Suppose we knew all the C_x values for index i.
Let’s see how they change when we move to index i+1.

  • First, for any x \gt A_{i+1}, A_{i+1} was contributing a count of 1 to C_x. This will no longer be the case because index i+1 will no longer be to the right.
    So, we need to decrement C_x for all x \gt A_{i+1} by 1.
  • Next, index i will newly become a ‘left’ index.
    This means, any x such that x \lt A_i will get 1 added to its C_x value.
  • These are the only changes!

Notice that our changes are essentially “subtract 1 from a range of C_x” and “add 1 to a range of C_x”; along with being able to know their maximum after these two operations.
That can be done quickly with a segment tree! (using lazy propagation, of course)

So, at each index we have two range updates and one range query (which is just a query for the whole range).
This can be achieved in \mathcal{O}(N\log N) time, which is good enough for us.

There are a couple of things to take care of:

  • First, we need to set the initial C_x values for index 1.
    This is not hard: for each i \gt 1, we must add 1 to C_x for all x \gt A_i; which comes out to N-1 more range updates — our segment tree can handle this.
  • More pressingly, we have not-necessarily-integer values of x; as well as very large values of x. Building a segment tree on them is somewhat hard.
    To get around this, we can use coordinate compression.
    Recall that the distinct elements in A were x_1 \lt x_2 \lt \ldots \lt x_k.
    Let’s relabel them into 1, 3, 5, 7, \ldots, 2k-1. This keeps the comparison relation between elements, while also giving us ‘space’ for real numbers:
    • Anything less than x_1 can be represented by 0
    • Anything more than x_k can be represented by 2k
    • Anything between x_i and x_{i+1} can be represented by 2i

After compression, we have an array of size 2N, and normal segtree stuff can be done on it.


\mathcal{O}(N\log N) per test case.


Author's code (C++)
//clear adj and visited vector declared globally after each test case
//check for long long overflow   
//Mod wale question mein last mein if dalo ie. Ans<0 then ans+=mod;
//Incase of close mle change language to c++17 or c++14  
//Check ans for n=1 
#pragma GCC target ("avx2")    
#pragma GCC optimize ("O3")  
#pragma GCC optimize ("unroll-loops")
#include <bits/stdc++.h>                   
#include <ext/pb_ds/assoc_container.hpp>  
#define int long long      
#define IOS std::ios::sync_with_stdio(false); cin.tie(NULL);cout.tie(NULL);cout.precision(dbl::max_digits10);
#define pb push_back 
#define mod 1000000007ll //998244353ll
#define lld long double
#define mii map<int, int> 
#define pii pair<int, int>
#define ll long long 
#define ff first
#define ss second 
#define all(x) (x).begin(), (x).end()
#define rep(i,x,y) for(int i=x; i<y; i++)    
#define fill(a,b) memset(a, b, sizeof(a))
#define vi vector<int>
#define setbits(x) __builtin_popcountll(x)
#define print2d(dp,n,m) for(int i=0;i<=n;i++){for(int j=0;j<=m;j++)cout<<dp[i][j]<<" ";cout<<"\n";}
typedef std::numeric_limits< double > dbl;
using namespace __gnu_pbds;
using namespace std;
typedef tree<int, null_type, less_equal<int>, rb_tree_tag, tree_order_statistics_node_update> indexed_set;
//member functions :
//1. order_of_key(k) : number of elements strictly lesser than k
//2. find_by_order(k) : k-th element in the set
const long long N=600015, INF=2000000000000000000;
const int inf=2e9 + 5;
lld pi=3.1415926535897932;
int lcm(int a, int b)
    int g=__gcd(a, b);
    return a/g*b;
int power(int a, int b, int p)
        return 0;
        int res=1;
        return res;
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());

int getRand(int l, int r)
    uniform_int_distribution<int> uid(l, r);
    return uid(rng);

int st[4*N],lazy[4*N],ar[N];

void propogate(int node, int l, int r)
void build(int node, int l, int r)
    int mid=(l+r)/2;
    build(node*2, l, mid);
    build(node*2+1, mid+1, r);
    st[node]=max(st[node*2], st[node*2+1]);
void update(int node, int l, int r, int x, int y, int val)
    propogate(node, l, r);
    int mid=(l+r)/2;
    update(node*2, l, mid, x, y, val);
    update(node*2+1, mid+1, r, x, y, val);
    st[node]=max(st[node*2], st[node*2+1]);
int query(int node, int l, int r, int x, int y)
    propogate(node, l, r);
    return -INF;
    return st[node];
    int mid=(l+r)/2;
    return max(query(node*2, l, mid, x, y), query(node*2+1, mid+1, r, x, y));

int count_inv(vector <int> a)
    int n=a.size(), ans=0;
    indexed_set s;
    for(auto num:a)
    return ans;
void compress(vector <int> &a)
    set <int> s;
    for(auto num:a)
    mii mp;
    int z=2;
    for(auto it:s)
    for(auto &num:a)
int32_t main()
    int t;
        int n;
        vector <int> a(n);
        for(auto &num:a)
        int n2=(2*n)+5;
        int base=count_inv(a);
        build(1, 1, n2);
        for(auto num:a)
            update(1, 1, n2, num+1, n2, 1);
        for(auto num:a)
            update(1, 1, n2, num+1, n2, -1);
            cout<<base+query(1, 1, n2, 1, n2)-query(1, 1, n2, num, num)<<" ";
            update(1, 1, n2, 1, num-1, 1);
Editorialist's code (C++)
// #pragma GCC optimize("O3,unroll-loops")
// #pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
#include "bits/stdc++.h"
using namespace std;
using ll = long long int;
mt19937_64 rng(chrono::high_resolution_clock::now().time_since_epoch().count());

#include <bits/extc++.h>
using namespace __gnu_pbds;
template<class T>
using indexed_set = tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update>;

struct Node {
	using T = int;
	T unit = 0;
	T f(T a, T b) { return max(a, b); }
	Node *l = 0, *r = 0;
	int lo, hi;
	T madd = 0;
	T val = unit;
	Node(int _lo,int _hi):lo(_lo),hi(_hi){}
	T query(int L, int R) {
		if (R <= lo || hi <= L) return unit;
		if (L <= lo && hi <= R) return val;
		return f(l->query(L, R), r->query(L, R));
	void add(int L, int R, T x) {
		if (R <= lo || hi <= L) return;
		if (L <= lo && hi <= R) {
			madd += x;
			val += x;
		else {
			push(), l->add(L, R, x), r->add(L, R, x);
			val = f(l->val, r->val);
	void push() {
		if (!l) {
			int mid = lo + (hi - lo)/2;
			l = new Node(lo, mid); r = new Node(mid, hi);
		if (madd)
			l->add(lo,hi,madd), r->add(lo,hi,madd), madd = 0;

int main()
	ios::sync_with_stdio(false); cin.tie(0);

	int t; cin >> t;
	while (t--) {
	    int n; cin >> n;
		vector<int> a(n);
		for (int &x : a) cin >> x;
		map<int, int> compress;
		for (auto &x : a) compress[x];
		int id = 1;
		for (auto &[x, y] : compress) {
			y = id;
			id += 2;
		for (auto &x : a) x = compress[x];

		ll tot = 0;
		vector<ll> without(n);
			indexed_set<array<int, 2>> st;
			for (int i = 0; i < n; ++i) {
				without[i] -= st.size() - st.order_of_key({a[i], i});
				tot -= without[i];
				st.insert({a[i], i});
			for (int i = 0; i < n; ++i) without[i] += tot;
			for (int i = n-1; i >= 0; --i) {
				without[i] -= st.order_of_key({a[i], i});
				st.insert({a[i], i});
		Node *seg = new Node(0, id);
		for (int i = 1; i < n; ++i) seg -> add(a[i]+1, id, 1);

		for (int i = 0; i < n; ++i) {
			cout << without[i] + seg -> query(0, id) << ' ';
			if (i+1 < n) seg -> add(a[i+1]+1, id, -1);
			seg -> add(0, a[i], 1);
		cout << '\n';
1 Like