# MORE_INV - Editorial

Author: yash_daga
Tester and Editorialist: iceknight1093

2803

# PREREQUISITES:

Inversion counting, Segment trees (with lazy propagation)

# PROBLEM:

You’re given an array A. For each 1 \leq i \leq N independently, find the following:

• If you can change A_i to any real number, what’s the maximum possible number of inversions in A?

# EXPLANATION:

Let x_1 \lt x_2 \lt \ldots \lt x_k be the distinct values be present in A.

The main observation required to solve this task is as follows:
Suppose we set A_i = x, then there are only about 2k + 1 ‘interesting’ values of x, namely:

• One of the x_i; or
• Something less than x_1 (they’re all equivalent, so we can treat this as x_1 - 1); or
• Something greater than x_k (once again they’re all equivalent, so we can consider this to be x_k + 1); or
• Something strictly between x_i and x_{i+1} for some i.
Once again, we can consider this to just be x_i + 0.5.

Let’s apply this information to solve the problem.

The easiest subtask allows for pretty much brute-force computation.

As discussed above, there are at worst 2N+1 distinct values of x we care about.
So, for each index i:

• Set A_i = x for each of the ‘interesting’ x.
• Compute the number of inversions in the resulting array.
• Print the maximum across all x.

Even if inversions are computed in \mathcal{O}(N^2) each time, this will run in \mathcal{O}(N^4) time overall which is enough to pass this subtask.

N is larger now, though still small enough that \mathcal{O}(N^2) or similar can pass.
Earlier, for each i, we set A_i = x and computed the number of inversions in \mathcal{O}(N^2).

Let’s optimize the second part.
When we set A_i = x, notice that inversion pairs that don’t involve index i don’t really change.
So, we don’t really need to recompute the total inversion number each time.

In particular,

• Let \text{inv} be the number of inversions in the original array A.
Let B_i be the number of inversions involving index i.
• When we set A_i = x,
• Suppose there are y elements before index i that are larger than x.
• Suppose there are z elements after index i that are smaller than x.
• Then, the number of inversions in the new array is exactly \text{inv} - B_i + y + z.

Of these quantities, \text{inv} can be precomputed in \mathcal{O}(N^2), and B_i for all i can be computed at the same time (whenever you see i \lt j with A_i \gt A_j, add one to both B_i and B_j).
So, if we were able to quickly compute y and z, we’d have a very quick algorithm.

There are several ways to do this, for example:

• Let P_i be a sorted list of the elements [A_1, A_2, \ldots, A_i].
Similarly, let S_i be a sorted list of [A_i, A_{i+1}, \ldots, A_N].
• Then, when fixing A_i = x, y and z can be computed by binary searching on P_{i-1} and S_{i+1} respectively; to find the number of elements in the prefix/suffix that are larger/smaller than x.

Since N \leq 1000, computing all the P_i and S_i lists using \mathcal{O}(N^2) memory is not an issue.

We’re binary searching \mathcal{O}(N^2) times for a solution in \mathcal{O}(N^2 \log N) overall, which is enough to get AC.
It’s possible write the solution to run in \mathcal{O}(N^2) as well (using two-pointers), but that isn’t necessary to get AC.

We can no longer independently try every x for each position.
However, we can still maintain enough information for each one.

Recall that when fixing A_i = x, we had the number of inversions be \text{inv} - B_i + y + z; where y was the number of elements larger than x before i, and z was the number of elements smaller than x after i.

First off, \text{inv} and all the B_i can be computed in \mathcal{O}(N \log N) time: this is a standard application of mergesort or segment/fenwick trees or (in C++) the order-statistics set.
A couple of articles detailing this can be found here (segment tree approach) and here (mergesort approach).

Now, let’s define C_x to be the corresponding y+z value for when we set A_i = x.
If we knew all the C_x values for index i, we’d just need to take the maximum of them, say M, and the answer would be \text{inv} - B_i + M.

Suppose we knew all the C_x values for index i.
Let’s see how they change when we move to index i+1.

• First, for any x \gt A_{i+1}, A_{i+1} was contributing a count of 1 to C_x. This will no longer be the case because index i+1 will no longer be to the right.
So, we need to decrement C_x for all x \gt A_{i+1} by 1.
• Next, index i will newly become a ‘left’ index.
This means, any x such that x \lt A_i will get 1 added to its C_x value.
• These are the only changes!

Notice that our changes are essentially “subtract 1 from a range of C_x” and “add 1 to a range of C_x”; along with being able to know their maximum after these two operations.
That can be done quickly with a segment tree! (using lazy propagation, of course)

So, at each index we have two range updates and one range query (which is just a query for the whole range).
This can be achieved in \mathcal{O}(N\log N) time, which is good enough for us.

There are a couple of things to take care of:

• First, we need to set the initial C_x values for index 1.
This is not hard: for each i \gt 1, we must add 1 to C_x for all x \gt A_i; which comes out to N-1 more range updates — our segment tree can handle this.
• More pressingly, we have not-necessarily-integer values of x; as well as very large values of x. Building a segment tree on them is somewhat hard.
To get around this, we can use coordinate compression.
Recall that the distinct elements in A were x_1 \lt x_2 \lt \ldots \lt x_k.
Let’s relabel them into 1, 3, 5, 7, \ldots, 2k-1. This keeps the comparison relation between elements, while also giving us ‘space’ for real numbers:
• Anything less than x_1 can be represented by 0
• Anything more than x_k can be represented by 2k
• Anything between x_i and x_{i+1} can be represented by 2i

After compression, we have an array of size 2N, and normal segtree stuff can be done on it.

# TIME COMPLEXITY

\mathcal{O}(N\log N) per test case.

# CODE:

Author's code (C++)
//clear adj and visited vector declared globally after each test case
//check for long long overflow
//Mod wale question mein last mein if dalo ie. Ans<0 then ans+=mod;
//Incase of close mle change language to c++17 or c++14
//Check ans for n=1
#pragma GCC target ("avx2")
#pragma GCC optimize ("O3")
#pragma GCC optimize ("unroll-loops")
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#define int long long
#define IOS std::ios::sync_with_stdio(false); cin.tie(NULL);cout.tie(NULL);cout.precision(dbl::max_digits10);
#define pb push_back
#define mod 1000000007ll //998244353ll
#define lld long double
#define mii map<int, int>
#define pii pair<int, int>
#define ll long long
#define ff first
#define ss second
#define all(x) (x).begin(), (x).end()
#define rep(i,x,y) for(int i=x; i<y; i++)
#define fill(a,b) memset(a, b, sizeof(a))
#define vi vector<int>
#define setbits(x) __builtin_popcountll(x)
#define print2d(dp,n,m) for(int i=0;i<=n;i++){for(int j=0;j<=m;j++)cout<<dp[i][j]<<" ";cout<<"\n";}
typedef std::numeric_limits< double > dbl;
using namespace __gnu_pbds;
using namespace std;
typedef tree<int, null_type, less_equal<int>, rb_tree_tag, tree_order_statistics_node_update> indexed_set;
//member functions :
//1. order_of_key(k) : number of elements strictly lesser than k
//2. find_by_order(k) : k-th element in the set
const long long N=600015, INF=2000000000000000000;
const int inf=2e9 + 5;
lld pi=3.1415926535897932;
int lcm(int a, int b)
{
int g=__gcd(a, b);
return a/g*b;
}
int power(int a, int b, int p)
{
if(a==0)
return 0;
int res=1;
a%=p;
while(b>0)
{
if(b&1)
res=(1ll*res*a)%p;
b>>=1;
a=(1ll*a*a)%p;
}
return res;
}

int getRand(int l, int r)
{
uniform_int_distribution<int> uid(l, r);
return uid(rng);
}

int st[4*N],lazy[4*N],ar[N];

void propogate(int node, int l, int r)
{
if(l!=r)
{
lazy[node*2]+=lazy[node];
lazy[node*2+1]+=lazy[node];
}
st[node]+=lazy[node];
lazy[node]=0;
}
void build(int node, int l, int r)
{
if(l==r)
{
st[node]=0;
lazy[node]=0;
return;
}
int mid=(l+r)/2;
build(node*2, l, mid);
build(node*2+1, mid+1, r);
st[node]=max(st[node*2], st[node*2+1]);
lazy[node]=0;
return;
}
void update(int node, int l, int r, int x, int y, int val)
{
if(lazy[node]!=0)
propogate(node, l, r);
if(y<x||x>r||y<l)
return;
if(l>=x&&r<=y)
{
st[node]+=val;
if(l!=r)
{
lazy[node*2]+=val;
lazy[node*2+1]+=val;
}
return;
}
int mid=(l+r)/2;
update(node*2, l, mid, x, y, val);
update(node*2+1, mid+1, r, x, y, val);
st[node]=max(st[node*2], st[node*2+1]);
return;
}
int query(int node, int l, int r, int x, int y)
{
if(lazy[node]!=0)
propogate(node, l, r);
if(y<x||y<l||x>r)
return -INF;
if(l>=x&&r<=y)
return st[node];
int mid=(l+r)/2;
return max(query(node*2, l, mid, x, y), query(node*2+1, mid+1, r, x, y));
}

int count_inv(vector <int> a)
{
int n=a.size(), ans=0;
indexed_set s;
for(auto num:a)
{
ans+=s.order_of_key(-num);
s.insert(-num);
}
return ans;
}
void compress(vector <int> &a)
{
set <int> s;
for(auto num:a)
s.insert(num);
mii mp;
int z=2;
for(auto it:s)
mp[it]=z++;
for(auto &num:a)
num=mp[num];
}
int32_t main()
{
IOS;
int t;
cin>>t;
while(t--)
{
int n;
cin>>n;
vector <int> a(n);
rep(i,0,n)
cin>>a[i];
compress(a);
for(auto &num:a)
num*=2;
int n2=(2*n)+5;
int base=count_inv(a);
build(1, 1, n2);
for(auto num:a)
update(1, 1, n2, num+1, n2, 1);
for(auto num:a)
{
update(1, 1, n2, num+1, n2, -1);
cout<<base+query(1, 1, n2, 1, n2)-query(1, 1, n2, num, num)<<" ";
update(1, 1, n2, 1, num-1, 1);
}
cout<<"\n";
}
}

Editorialist's code (C++)
// #pragma GCC optimize("O3,unroll-loops")
// #pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
#include "bits/stdc++.h"
using namespace std;
using ll = long long int;
mt19937_64 rng(chrono::high_resolution_clock::now().time_since_epoch().count());

#include <bits/extc++.h>
using namespace __gnu_pbds;
template<class T>
using indexed_set = tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update>;

struct Node {
using T = int;
T unit = 0;
T f(T a, T b) { return max(a, b); }

Node *l = 0, *r = 0;
int lo, hi;
T val = unit;
Node(int _lo,int _hi):lo(_lo),hi(_hi){}
T query(int L, int R) {
if (R <= lo || hi <= L) return unit;
if (L <= lo && hi <= R) return val;
push();
return f(l->query(L, R), r->query(L, R));
}
void add(int L, int R, T x) {
if (R <= lo || hi <= L) return;
if (L <= lo && hi <= R) {
val += x;
}
else {
val = f(l->val, r->val);
}
}
void push() {
if (!l) {
int mid = lo + (hi - lo)/2;
l = new Node(lo, mid); r = new Node(mid, hi);
}
}
};

int main()
{
ios::sync_with_stdio(false); cin.tie(0);

int t; cin >> t;
while (t--) {
int n; cin >> n;
vector<int> a(n);
for (int &x : a) cin >> x;

map<int, int> compress;
for (auto &x : a) compress[x];
int id = 1;
for (auto &[x, y] : compress) {
y = id;
id += 2;
}
for (auto &x : a) x = compress[x];

ll tot = 0;
vector<ll> without(n);
{
indexed_set<array<int, 2>> st;
for (int i = 0; i < n; ++i) {
without[i] -= st.size() - st.order_of_key({a[i], i});
tot -= without[i];
st.insert({a[i], i});
}
for (int i = 0; i < n; ++i) without[i] += tot;
st.clear();
for (int i = n-1; i >= 0; --i) {
without[i] -= st.order_of_key({a[i], i});
st.insert({a[i], i});
}
}

Node *seg = new Node(0, id);
for (int i = 1; i < n; ++i) seg -> add(a[i]+1, id, 1);

for (int i = 0; i < n; ++i) {
cout << without[i] + seg -> query(0, id) << ' ';
if (i+1 < n) seg -> add(a[i+1]+1, id, -1);