CHEFSQUD - Editorial

akashbhalotia · May 24, 2020, 3:25pm

PROBLEM LINK:

Practice
Contest

Author: Ritesh Gupta
Tester: Raja Vardhan Reddy
Editorialist: Akash Bhalotia

DIFFICULTY:

Medium

PREREQUISITES:

BIT, Binary search, Coordinate compression, Two pointers, Inversion count.

PROBLEM:

Given an array A, let array B be composed of the inversion counts of every subarray of A. Find the median of array B. In case of multiple medians, output the left one.

HINTS:

Not the full solution. Just some hints to help you if you are stuck at some point. As soon as you encounter a hint that you had not thought of, try solving the problem on your own.

Hint 1:

There can be O(N^2) inversion counts over all the subarrays. We need to find which one of them is the median. The median will be the one for which there are \frac{(M+1)}{2}-1 subarrays which have an inversion count less than or equal to it, where M is the size of B.

As there can be O(N^2) possible candidates for this, we can perform a binary search on them to find the median.

Hint 2:

For our current binary search query, we’ll need to find how many subarrays have an inversion count less than or equal to it.

Try to think in the direction of: how many subarrays starting at an index i will satisfy this.

Hint 3:

All subarrays starting at i and ending at or before a j which satisfy this, only if the inversion count of [i,j] is less than or equal to our query, and of [i,j+1] is greater than our query number.

There will be (j-i+1) subarrays starting at i which satisfy this. Add this to the count of subarrays, move on to the next i, and try to extend j for it, in a sliding window fashion.

But how will be find the inversion count for a window efficiently?

Hint 4:

We can use a BIT for this. As A[i] can be as high as 10^9, we can use coordinate compression. Refer to this to understand how inversions can be counted for an array using a BIT.

When we extend the window, we query the BIT for the inversion count for this index, and then update the BIT for this index. When we move on to the next i, we remove the contribution of A[i] from our inversion count, and decrement its value in the BIT.

Try to think about the reason for this, and try to implement the solution on your own if you have understood it.

QUICK EXPLANATION:

show

Perform a binary search on all possible inversion counts and find the smallest one for which the number of subarrays with inversion counts less than or equal to it is \ge \frac{(M+1)}{2}-1, where M is the size of B. To do this efficiently, find the length of the largest subarray starting at an index i, such that its inversion count is less than or equal to our query number, and add this length to our subarray count. Do this for every i, efficiently using sliding window technique with a Binary-Indexed Tree.

EXPLANATION

show

A squad consists of one or more consecutive soldiers in the array.
\implies A squad refers to a subarray of the array.

The strength of a squad is the sum of the number of soldiers to the left of the current soldier with a greater height than him, for every soldier.
\implies Strength of a squad is the inversion count of that subarray.

Now, let’s rephrase the problem to:

Given an array A, let array B be composed of the inversion counts of every subarray of A. Find the median of array B. In case of multiple medians, output the left one.

Finding the median shall require us to possess the array B. B will have the inversion count of every subarray of A. Since there are N*(N+1)/2 subarrays of an array of size N, i.e., O(N^2) subarrays, we can’t do this. We will have to find a better way.

Let’s take sample example 3 to better visualise:

N=4
A= {4,3,2,1}
B= {0,0,0,0,1,1,1,3,3,6} (in sorted order).

B is made up of inversion counts of the subarrays of A. Inversion count of any subarray of A will always be in the range [0,N*(N-1)/2].

why

Inversion count will be 0 when the subarray is sorted in ascending order.

Inversion count will N*(N-1)/2 when the largest subarray, i.e., the whole array is sorted in descending order.

Other inversion counts between 0 and N*(N-1)/2 will depend on the relative order of the elements in the subarrays.

Median of B is its middle element when it is sorted. Let the size of B be M.
\implies The median will be the \frac{(M+1)}{2} th element of B
\implies There should be \frac{(M+1)}{2}-1 elements in B before the median.
\implies \frac{(M+1)}{2}-1 subarrays of A should have an inversion count \le the median. \ldots (1)

( Here we are considering 1 -based indexing of the arrays. Also, division here implies integer division, i.e., considering the floor value of the actual division. Note that the observations above related to the median will work when size of B is odd, as well as when the size of B is even. )

From (1), we can see that our median will be an inversion count in B such that exactly \frac{(M+1)}{2}-1 subarrays of A have an inversion count less than or equal to it. Thus, we can check this condition to determine if a particular inversion count is the median or not.

But there can be O(N^2) inversion counts. How to check all of them?

We don’t need to check all of them. Instead, we can perform a binary search on the inversion counts, and find the smallest inversion count which satisfies: The number of subarrays which have an inversion count less than or equal to this number is \ge \frac{(M+1)}{2}-1. Since all numbers in B will satisfy this after a certain point, the smallest of them will be the median.

Now, we know that we need to perform a binary search on inversion counts to obtain the median. But how to check the condition? How to find the number of subarrays of A which have an inversion count less than or equal to a particular number?

To do this, we shall use a Binary-Indexed Tree or a BIT, coupled with the two-pointers (also known as sliding window) technique. Instead of a BIT, we can also use a segment tree, but BIT takes less space, and is easier to implement.

As the elements in A can be as large as 10^9, and we are only interested in the relative order of the numbers, we shall perform a coordinate-compression on them to make it possible to operate on them using a BIT. Before proceeding further, make sure to learn how to find the inversion-count of an array using a BIT.

We shall now be learning how to apply the two-pointer technique with a BIT to solve this problem.

Since we want to find the number of subarrays with an inversion count less than or equal to a given inversion count, let’s iterate over the starting positions of the subarray.

For every i we shall be considering all subarrays that start at i, and end at j, where i \le j \le N. Start with the subarray (i,i). Increment the compressed position of A[i] by 1 in the BIT. Now find how many elements are greater than A[i] in our subarray, by finding the sum of values in all positions greater than our compressed position in the BIT.

Keep extending j as long as the inversion count remains less than or equal to our query inversion count. When j can no longer be extended without obtaining an inversion count greater than our query number, we know that all subarrays that start at i, and end at or before j satisfy the condition that their inversion counts are less than or equal to our query number. Thus, the number of subarrays starting at i which satisfy this condition is:

j-i+1

We can add this to our subarray count, keep j as it is and proceed to the next i.

Before proceeding to the next i, we will have to remove the contribution of A[i] from our inversion count. A[i] impacted every number to the right of it in our window, which is smaller than it. To obtain the contribution, we can get the sum of values all positions in the BIT which are less than the compressed position of A[i] and subtract this count from our current sliding inversion count. Moreover, as A[i] will be removed from our current window, we will have decrement its value in the BIT by 1. After this, we can proceed to the next i.

Thus, by using a BIT and two-pointers technique, we can calculate the number of subarrays which have an inversion count less than or equal to query number, and using binary search, find the smallest one which satisfies the condition described earlier. Thus, we’ll obtain our required median.

So, to solve the problem:

Perform a binary search on all possible inversion counts, and find the smallest one which satisfies the condition: the number of subarrays of A which have an inversion count less than or equal to this count is \ge \frac{(M+1)}{2}-1, where M is the size of B.
To check the condition, use two-pointer technique with a BIT. For every starting position i of the subarrays, find the highest position j such that the inversion count of the subarray (i,j) is less than or equal to the query inversion count. Add (j-i+1) as the count of subarrays which begin at i and satisfy the condition.
Move on to the next starting position of subarrays, i.e., the next i, while keeping j same as its previous value. Don’t forget to remove the contribution of the previous i from our window before moving on to the next i.

TIME COMPLEXITY:

show

The time complexity is: O(N*log^2 N)

Why?

There can be O(N^2) possible inversion counts. We are performing a binary search on them. But log (N^2) = 2*logN. Thus, there is a logN factor.

For every search query, we are using the two-pointers technique to iterate over the whole array, and for each position, we are perfoming some updates, or asking some queries on the BIT. Each update/query takes O(logN), and there are N indices. This adds an NlogN factor.

Thus, the final complexity is O(N*log^2 N).

It takes O(N) space for the BIT as well as for A.

SOLUTIONS:

Setter

#include <bits/stdc++.h>
 
#define int long long
#define endl "\n"
 
using namespace std;
 
const int N = 200001;
 
int a[N];
int n,middle,id;
int bit[N];
 
void update(int idx, int val)
{
	while(idx<=id)
	{
		bit[idx]+=val;
		idx+=idx&-idx;
	}
}
 
int pref(int idx)
{
	int ans=0;
	while(idx>0)
	{
		ans+=bit[idx];
		idx-=idx&-idx;
	}
	return ans;
}
 
bool check(int m, int flag)
{
	for(int i=0;i<=id;i++)
		bit[i] = 0;
 
	int i,j,curr_inv,ans;
	ans = curr_inv = i = j = 0;
 
	while(i < n && curr_inv <= m)
	{
		update(a[i], 1);
		curr_inv += (pref(id) - pref(a[i]));
 
		while(j <= i && curr_inv > m)
		{
			update(a[j], -1);
			curr_inv -= pref(a[j]-1);
			j++;
		}
 
		ans += (i-j+1);
		i++;
	}
 
	return ans <= middle;
}
 
int solve(int l, int h)
{
	int ans = h;
 
	while(l <= h)
	{
		int mid = (l+h)/2;
 
		if(check(mid, true))
			l = mid+1;
		else
		{
			ans = mid;
			h = mid-1;
		}
	}
 
	return ans;
}
 
int32_t main()
{
	ios_base::sync_with_stdio(false);
	cin.tie(NULL);
	cout.tie(NULL);
	
	int t;
	cin >> t;
 
	while(t--)
	{
		cin >> n;
 
		map <int, vector <int> > m;
 
		for(int i=0;i<n;i++)
		{
			cin >> a[i];
			m[a[i]].push_back(i);
		}
 
		id = 0;
 
		for(auto i:m)
		{
			id++;
 
			for(int j:i.second)
				a[j] = id;
		}
 
		int h = (n*(n+1))/2;
		middle = (h-1)/2;
 
		cout << solve(0, h) << endl;
	}
 
	return 0;
}

Tester

//raja1999
 
//#pragma comment(linker, "/stack:200000000")
//#pragma GCC optimize("Ofast")
//#pragma GCC target("sse,sse2,sse3,ssse3,sse4,avx,avx2")
 
#include <bits/stdc++.h>
#include <vector>
#include <set>
#include <map>
#include <string> 
#include <cstdio>
#include <cstdlib>
#include <climits>
#include <utility>
#include <algorithm>
#include <cmath>
#include <queue>
#include <stack>
#include <iomanip> 
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp> 
//setbase - cout << setbase (16)a; cout << 100 << endl; Prints 64
//setfill -   cout << setfill ('x') << setw (5); cout << 77 <<endl;prints xxx77
//setprecision - cout << setprecision (14) << f << endl; Prints x.xxxx
//cout.precision(x)  cout<<fixed<<val;  // prints x digits after decimal in val
 
using namespace std;
using namespace __gnu_pbds;
#define f(i,a,b) for(i=a;i<b;i++)
#define rep(i,n) f(i,0,n)
#define fd(i,a,b) for(i=a;i>=b;i--)
#define pb push_back
#define mp make_pair
#define vi vector< int >
#define vl vector< ll >
#define ss second
#define ff first
#define ll long long
#define pii pair< int,int >
#define pll pair< ll,ll >
#define sz(a) a.size()
#define inf (1000*1000*1000+5)
#define all(a) a.begin(),a.end()
#define tri pair<int,pii>
#define vii vector<pii>
#define vll vector<pll>
#define viii vector<tri>
#define mod (1000*1000*1000+7)
#define pqueue priority_queue< int >
#define pdqueue priority_queue< int,vi ,greater< int > >
#define int ll
 
typedef tree<
pii,
null_type,
less<pii>,
rb_tree_tag,
tree_order_statistics_node_update>
ordered_set;
 
 
//std::ios::sync_with_stdio(false);
int n,a[100005],c=0,bit[100010];
int update(int pos,int val){
	while(pos<c+3){
		bit[pos]+=val;
		pos+=pos&(-pos);
	}
	return 0;
}
int query(int pos){
	int res=0;
	while(pos>0){
		res+=bit[pos];
		pos-=pos&(-pos);
	}
	return res;
}
int check(int x){
	// number of segments less than x
	int prev,cur,cnt,val,i;
	prev=0;
	cur=0;
	cnt=0;
	rep(i,n+5){
		bit[i]=0;
	}
	rep(i,n){
		while(prev<n){
			val=query(c)-query(a[prev]);
			if(cur+val<x){
				update(a[prev],1);
				cur+=val;
				prev++;
			}
			else{
				cnt+=(prev-i);
				val=query(a[i]-1);
				cur-=val;
				update(a[i],-1);
				break;
			}
		}
		if(prev==n){
			cnt+=prev-i;
		}
	}
	int tot=(n*(n+1))/2;
	if(cnt<(tot+1)/2){
		return 1;
	}
	return 0;
}
map<int,int>mapi;
map<int,int>::iterator it;
main(){
	std::ios::sync_with_stdio(false); cin.tie(NULL);
	int t;
	cin>>t;
	while(t--){
		int ans=0,i;
		cin>>n;
		mapi.clear();
		rep(i,n){
			cin>>a[i];
			mapi[a[i]]=1;
		}
		c=1;
		for(it=mapi.begin();it!=mapi.end();it++){
			it->ss=c++;
		}
		rep(i,n){
			a[i]=mapi[a[i]];
		}
		int lo=1;
		int hi=n*(n-1)/2;
		while(lo<=hi){
			int mid=(lo+hi)/2;
			if(check(mid)){
				ans=mid;
				lo=mid+1;
			}
			else{
				hi=mid-1;
			}
		}
		cout<<ans<<"\n";
	}
	return 0;
}

Editorialist

//created by Whiplash99
import java.io.*;
import java.util.*;
class A
{
	private static HashMap<Integer,Integer> map;
	private static int idx;
	private static BIT FT;
	static class BIT //Binary-Indexed Tree
	{
	    int size;
	    int[] table;

	    BIT(int size)
	    {
	        table=new int[size+1];
	        this.size=size+1;
	    }
	    void update(int i, int delta) //update function
	    {
	        i++;
	        while(i<size)
	        {
	            table[i]+=delta;
	            i+=Integer.lowestOneBit(i); //i+=i&-i
	        }
	    }
	    int sum(int i) //query function
	    {
	        int s=0; i++;
	        while (i>0)
	        {
	            s+=table[i];
	            i-=Integer.lowestOneBit(i);//i-=i&-i
	        }
	        return s;
	    }
	    int rangeSum(int i, int j){return sum(j)-sum(i-1);} //range query
	}

	// Used this because Arrays.sort() uses quicksort in Java
	//which may have O(N^2) complexity for certain types of arrays.
	private static void shuffleArray(int[] arr)
	{
	    int n = arr.length;
	    Random rnd = new Random();
	    for(int i=0; i<n; ++i)
	    {
	        int tmp = arr[i];
	        int randomPos = i + rnd.nextInt(n-i);
	        arr[i] = arr[randomPos];
	        arr[randomPos] = tmp;
	    }
	}

	//counts the number of subarrays which have an inversion count
	//less than or equal to mid
	private static long count(int[] a, int N, long mid)
	{
	    Arrays.fill(FT.table,0);

	    long cur=0, cnt=0;
	    int l=0,r=-1,pos,next;

	    //l and r and the left and right borders of the current window
	    while (l<N&&r<N)
	    {
	        if(r<N-1)
	        {
	            pos=map.get(a[r+1]); //compressed position

	            //how much will extending our window add to the
	            //inversion count
	            next=FT.rangeSum(pos+1,idx);

	            //while the inversion count remains less than or equal to mid
	            //keep extending the window
	            while (cur+next<=mid&&r<N-1)
	            {
	                r++;
	                cur+=next;
	                FT.update(pos,1);

	                if(r<N-1)
	                {
	                    pos = map.get(a[r+1]);
	                    next = FT.rangeSum(pos + 1, idx);
	                }
	            }
	        }

	        //number of subarrays starting at l which satisfy the condition
	        cnt+=r-l+1;

	        //remove the contribution and effect of a[l] from
	        //current window count before moving on to the next l.
	        pos=map.get(a[l++]);
	        cur-=FT.sum(pos-1);
	        FT.update(pos,-1);
	    }
	    return cnt;
	}
	public static void main(String[] args) throws IOException
	{
	    Reader reader=new Reader();

	    int i,N;

	    int T=reader.nextInt();
	    StringBuilder sb=new StringBuilder();

	    while(T-->0)
	    {
	        N=reader.nextInt();

	        int[] a=new int[N], sorted=new int[N];
	        for(i=0;i<N;i++) a[i]=sorted[i]=reader.nextInt();

	        shuffleArray(sorted);
	        Arrays.sort(sorted);

	        map=new HashMap<>(); idx=1;
	        map.put(sorted[0],0);
	        for(i=1;i<N;i++) //coordinate compression
	        {
	            if(sorted[i]==sorted[i-1]) continue;
	            map.put(sorted[i],idx++);
	        }

	        FT=new BIT(idx+5);
	        long l=0,r=(long)N*(N-1)/2,mid,ans=r;
	        final long medianPos=(1+(long)N*(N+1)/2)/2;

	        while (l<=r)
	        {
	            mid=(l+r)/2;

	            //find the smallest one which satisfies the condition
	            if(count(a,N,mid)>=medianPos)
	            {
	                ans=mid;
	                r=mid-1;
	            }
	            else l=mid+1;
	        }
	        sb.append(ans).append("\n");
	    }
	    System.out.println(sb);
	}
	static class Reader
	{
	    final private int BUFFER_SIZE = 1 << 16;private DataInputStream din;private byte[] buffer;private int bufferPointer, bytesRead;
	    public Reader(){din=new DataInputStream(System.in);buffer=new byte[BUFFER_SIZE];bufferPointer=bytesRead=0;
	    }public Reader(String file_name) throws IOException{din=new DataInputStream(new FileInputStream(file_name));buffer=new byte[BUFFER_SIZE];bufferPointer=bytesRead=0;
	    }public String readLine() throws IOException{byte[] buf=new byte[64];int cnt=0,c;while((c=read())!=-1){if(c=='\n')break;buf[cnt++]=(byte)c;}return new String(buf,0,cnt);
	    }public int nextInt() throws IOException{int ret=0;byte c=read();while(c<=' ')c=read();boolean neg=(c=='-');if(neg)c=read();do{ret=ret*10+c-'0';}while((c=read())>='0'&&c<='9');if(neg)return -ret;return ret;
	    }public long nextLong() throws IOException{long ret=0;byte c=read();while(c<=' ')c=read();boolean neg=(c=='-');if(neg)c=read();do{ret=ret*10+c-'0';}while((c=read())>='0'&&c<='9');if(neg)return -ret;return ret;
	    }public double nextDouble() throws IOException{double ret=0,div=1;byte c=read();while(c<=' ')c=read();boolean neg=(c=='-');if(neg)c = read();do {ret=ret*10+c-'0';}while((c=read())>='0'&&c<='9');if(c=='.')while((c=read())>='0'&&c<='9')ret+=(c-'0')/(div*=10);if(neg)return -ret;return ret;
	    }private void fillBuffer() throws IOException{bytesRead=din.read(buffer,bufferPointer=0,BUFFER_SIZE);if(bytesRead==-1)buffer[0]=-1;
	    }private byte read() throws IOException{if(bufferPointer==bytesRead)fillBuffer();return buffer[bufferPointer++];
	    }public void close() throws IOException{if(din==null) return;din.close();}
	}
}

Feel free to share your approach if it differs. You can ask your doubts below. Please let me know if something’s unclear. I would LOVE to hear suggestions

myselfsk · May 24, 2020, 7:13pm

Can anyone post the solution in python?

rahul_g · May 24, 2020, 8:26pm

You set complete tree to zero in each test case, that makes your complexity \mathcal{O}(T \cdot maxn \cdot log^2n).

wild4stone · May 25, 2020, 10:30am

can anyone explain why is this giving TLE :
CodeChef: Practical coding for everyone

The solution described above is O(N*logN^2) and same as per the editorial

sayan_1999 · May 25, 2020, 2:04pm

@akashbhalotia Sir,I would really like to know how you yourself came up with the solution within the tight time limits of the contest?

sanjay_thiru · May 26, 2020, 9:33am

This is an amazing question. Learnt quite a lot. I have commented the tester’s solution. Just posting here so that it will be useful for someone. If there is anything wrong in my understanding, kindly please point out.

https://www.codechef.com/viewsolution/33353455

rishup_nitdgp · May 26, 2020, 9:39am

Your implementation looks ok to me. I think…TLE is due to recursive segment tree.

sayan_1999 · July 21, 2020, 2:25pm

@rishup_nitdgp if you can kindly answer. What if using the binary search, we land upon a value that satisfies the criteria but that actually does not occur in the array B? How do you ensure that will never happen?
Thanks in advance…