# KPERM - Editorial

Contest

Author: Ayush Ranjan
Tester: Raja Vardhan Reddy
Editorialist: Rajarshi Basu

Easy-Medium

# PREREQUISITES:

Dynamic Programming, Knowledge of inversions and permutations.

# PROBLEM:

We are given Q queries of the following type:
Print the K^{th} lexicographically smallest permutation of [1,N] with exactly R inversions.
N \leq 500
R \leq 200,000
K\leq 10^{18}
Q \leq 2000

# QUICK EXPLANATION:

Initially, build a table dp[N_{max}][R_{max}], where dp[i][j] means number of permutations of length i with exactly j inversions. Reconstruct the solution for each query: greedily take the smallest possible value for each index, as long as enough number of permutations can be constructed while considering the rest of the constraints.

# DETAILED EXPLANATION:

I will elaborate on the solution in terms of step-by-step observations.

Since N is just 500 and even R is atmost 2*10^5, nothing much can be guessed about the expected complexity. However, we can say that K will definitely not be in our complexity analysis.

First thoughts

Often, a good way to approach such problems, where we have to print K^{th} lexicographically smallest something, is to think if we can keep greedily fixing elements, as long as the reduced parameters allow for at least X more permutations. Maybe we can do so here?

Observation 1

A curious property is that we can calculate how many permutations of length M (M < N) are possible with R_m (R_m < R) inversions, without needing to know what the first N-M values are.

More explanation

What do I mean by that? Let’s say we are dealing with the numbers \{1,2,3,4,5\} and want to see how many permutations with y inversions are possible. Lets say we fix the first number as 3. Now we are left with \{1,2,4,5\}. However, note that the gap is not important really in our calculation of how many permutations with (let’s say) x inversions we can get from this set. Our set is essentially the same as \{1,2,3,4\} for all purposes since while counting inversions, we only need to know greater than or less than values. Hence for our purposes, any 4 distinct numbers would have given the same number of permutations.

this hints towards some precomputation. In particular, maybe if we calculate a table dp[N_{max}][R_{max}], where dp[i][j] means number of permutations of length i with exactly j inversions, perhaps it will help us later. Note that we will be calculating this table as a precomputation step, without processing any queries. We don’t need to recompute anything during the queries since for a particular i and j, dp[i][j] is always constant, irrespective of the values already taken.

Building the DP

The DP table can be built on the following recurrence:
dp[n][r] = \displaystyle\sum_{k = 1}^{min(n,r+1)}dp[n-1][r-(k-1)]
The rationale behind arriving at this dp[.][.] recurrence is that we are essentially trying to place k as the first element in our permutation, and considering the rest of the premutation of
n-1 elements with r-(k-1) inversions.

Full Solution
Hint:

Try to recreate the K^{th} smallest permutation, using the dp[.][.] table.

Recreating: Details

Let’s say we have a function getSmallest(int N, int R) which returns the optimal number that should be placed at the start of the N length permutation. Let’s say the number placed is x. Observe, this contributes exactly x-1 to the total number of inversions R. Hence, R_{new} = R - (x-1). Now we call getSmallest(N-1,Rnew), and continue.

Now, also observe that the optimal x is also the smallest value possible which ensures that dp[N-1][R_{new}] \geq K . (unless obviously N=1, in which case it doesn’t matter). This means that had we put some value less than x, then we would have had <k permutations formable using the given constraints.

Implementation 1

A quick thing to note is that the number of permutations will be very VERY large. Hence a trick will be to store any value larger than 10^{18} as -1, as the maximum value of K is 10^{18}, and hence any value larger than that is all the same since we just have to compare it to K while restoring. This means that while computing the dp[.][.], we cannot use prefix sums to make the transitions faster. Instead, we just add the values one by one. This makes for a O(N) transition, and since we have O(N^3) states, the total time taken should be O(N^4). However, if we break out of the loop the first time we encounter a -1, the number of iterations reduces drastically, and it works really fast. Intuition behind this is that the number of permutations with a even say 20 inversions and N=50 (say), is really really large. Hence a large portion of the dp[.][.] table is essentially filled with -1.

Time Complexity

We have a precomputation step of O(N^4), which is effectively O(N^3) amortised due to the break statements. During the queries, we, in worst case, will be looping over all possible numbers in getSmallest(...). Since we call it O(N) times, our time complexity becomes O(Q*N^2). However, again note that in most cases, most of the first elements of the permutation will just be 1,2,3,4, ... for larger values of N. Hence, it runs much faster in practice.

## Challenge:

Do the queries in O(N log N) per query.

# QUICK REMINDERS:

Keep using those long long ints.

# SOLUTIONS:

Setter's Solution
#include<bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
using namespace __gnu_pbds;
using namespace std;

typedef tree<int ,null_type,less<int>,rb_tree_tag,tree_order_statistics_node_update> ordered_set;

#define ll long long
#define siz(i) ((i) * ((i) - 1) / 2)

const ll INF = 1e18, N = 501;

vector<ll> dp[N];

inline ll get_sum(int idx, int l, int r){
l = max(l, 0);
r = min(r, siz(idx));
ll sum = 0;
for(int i = l; i <= r; i++){
sum += dp[idx][i];
if(sum >= INF)
return INF;
}
return sum;
}

int main(){
ios_base::sync_with_stdio(false);cin.tie(0);cout.tie(0);
for(int i = 1; i < N; i++)
dp[i].resize(siz(i) + 1);
for(int i = 1; i < N; i++){
int s = siz(i);
dp[i][0] = dp[i][s] = 1;
for(int j = 1; j <= s / 2; j++)
dp[i][j] = dp[i][s - j] = get_sum(i - 1, j - i + 1, j);
}
int t;
cin >> t;
while(t--){
int n, inv;
ll k;
cin >> n >> inv >> k;
if(inv > siz(n) || dp[n][inv] < k){
cout << -1 << "\n";
continue;
}
ordered_set remain;
for(int i = 1; i <= n; i++)
remain.insert(i);
for(int i = 0; i < n; i++){
int st = 1, en = n - i, lst = 1;
while(st <= en && i < n - 1){
int mid = st + en >> 1;
if(inv - mid + 1 < 0)
en = mid - 1;
else if(get_sum(n - i - 1, inv - mid + 1, inv) >= k){
lst = mid, en = mid - 1;
}
else st = mid + 1;
}
k -= get_sum(n - i - 1, inv - lst + 2, inv);
inv -= lst - 1;
lst = *remain.find_by_order(lst - 1);
remain.erase(lst);
cout << lst << ' ';
}
cout << "\n";
}
}

Tester's Solution
//raja1999

//#pragma GCC optimize("Ofast")
//#pragma GCC target("sse,sse2,sse3,ssse3,sse4,avx,avx2")

#include <bits/stdc++.h>
#include <vector>
#include <set>
#include <map>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <climits>
#include <utility>
#include <algorithm>
#include <cmath>
#include <queue>
#include <stack>
#include <iomanip>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
//setbase - cout << setbase (16)a; cout << 100 << endl; Prints 64
//setfill -   cout << setfill ('x') << setw (5); cout << 77 <<endl;prints xxx77
//setprecision - cout << setprecision (14) << f << endl; Prints x.xxxx
//cout.precision(x)  cout<<fixed<<val;  // prints x digits after decimal in val

using namespace std;
using namespace __gnu_pbds;
#define f(i,a,b) for(i=a;i<b;i++)
#define rep(i,n) f(i,0,n)
#define fd(i,a,b) for(i=a;i>=b;i--)
#define pb push_back
#define mp make_pair
#define vi vector< int >
#define vl vector< ll >
#define ss second
#define ff first
#define ll long long
#define pii pair< int,int >
#define pll pair< ll,ll >
#define sz(a) a.size()
#define inf (1000*1000*1000+5)
#define all(a) a.begin(),a.end()
#define tri pair<int,pii>
#define vii vector<pii>
#define vll vector<pll>
#define viii vector<tri>
#define mod (1000*1000*1000+7)
#define pqueue priority_queue< int >
#define pdqueue priority_queue< int,vi ,greater< int > >
#define int ll

typedef tree<
int,
null_type,
less<int>,
rb_tree_tag,
tree_order_statistics_node_update>
ordered_set;

//std::ios::sync_with_stdio(false);
#define ld long double
const int N=500;
vector<vector< int > >dp(N+5);
vector<vector< __int128 > >sum(N+5);
int done[N+5];
main(){
std::ios::sync_with_stdio(false); cin.tie(NULL);
int t,n,r,pow8,pow10,pow18,i,j,val1;
__int128 iinf=inf;
__int128 val;
iinf*=inf;
n=N;
//double clk=clock();
r=n*(n-1)/2;
cin>>t;
rep(i,N+5){
dp[i].resize(r+5);
sum[i].resize(r+5);
}
rep(i,r+1){
if(i==0){
dp[1][i]=1;
}
else{
dp[1][i]=0;
}
sum[1][i]=i>0?sum[1][i-1]+dp[1][i]:dp[1][i];
}
f(i,2,n+1){
rep(j,r+1){
val=sum[i-1][j];
if((j-(i-1))>0){
val-=sum[i-1][j-i];
}
val=min(val,iinf);
dp[i][j]=val;
sum[i][j]=j>0?sum[i][j-1]+dp[i][j]:dp[i][j];
}
}
//cout<<(clock()-clk)/CLOCKS_PER_SEC<<"\n";
while(t--){
int k,req,c;
cin>>n>>r>>k;
if(r>(n*(n-1))/2 || k>dp[n][r]){
cout<<-1<<"\n";
continue;
}
rep(i,n){
done[i]=0;
}
req=r;
fd(i,n-1,1){
val=0;
c=0;
rep(j,n){
if(done[j]==0){
val+=dp[i][req-c];
if(val>=k){
break;
}
c++;
}
}
cout<<j+1<<" ";
done[j]=1;
k-=(val-dp[i][req-c]);
req-=c;
}
rep(j,n){
if(done[j]==0){
cout<<j+1<<" ";
}
}
cout<<"\n";
}
return 0;
}


Editorialist's Solution
#include <iostream>
#include <vector>
#include <set>
#include <iomanip>
#include <algorithm>
#include <functional>
#include <stdio.h>
#include <cmath>
#include <queue>
#include <string>
#include <map>
#include <complex>
#include <stack>
#include <set>

#define FOR(i,n) for(int i=0;i<n;i++)
#define FORE(i,a,b) for(int i=a;i<=b;i++)
#define ll long long int
#define vi vector<int>
#define ii pair<int,int>
#define pb push_back
#define mp make_pair
#define ff first
#define ss second
#define pll pair<ll,ll>
#define cd complex<double>
#define vv vector

using namespace std;

const ll INF = 1e18;
const int MAXN = 500+5;
const int MAXR = 2e5+5;
const ll MOD = 1e9 + 7;

ll dp[MAXN][MAXR];
int firstPos[MAXR]; // stores the last i st dp[i][j] > 1e18
void precalc(){
ll psum[MAXR];
FOR(i,MAXR)psum[i] = 0;
FOR(i,MAXN){
if(i == 0)continue;
if(i == 1){
dp[i][0] = 1;
continue;
}
int i2 = (i*(i-1))/2;

FOR(j,i2+2){

//dp[i][j] = dp[i-1][j] + dp[i-1][j-1] + .. + dp[i-1][j-(i-1)];
ll sum = 0;
/*if(psum[j] - (j-i>=0?psum[j-i]:0) > 0){
dp[i][j] = -1;
continue;
}*/
FOR(k,i){
if(sum > INF){
sum = -1;
break;
}
if(dp[i-1][j-k] == -1){
sum = -1;
break;
}
sum += dp[i-1][j-k];
}
if(sum > INF){
sum = -1;

}
dp[i][j] = sum;
}
psum[0] = dp[i][0] == -1;
FORE(j,1,MAXR-1)psum[j] = psum[j-1] + (dp[i][j] == -1);
}
}

int main(){
precalc();
//cout << dp[7][10] << endl;

ios_base::sync_with_stdio(0);
cin.tie(0);
int t;
cin >> t;
while(t--){
int n,r;
ll k;
cin >> n >> r >> k;
vi ans;
if(dp[n][r] > -1 and k > dp[n][r]){
cout << -1 << endl;
continue;
}else{

for(int i = n;i>=1;i--){
if(i == 1){
ans.pb(r);
break;
}
FOR(j,i){
//if(r-j < 0)
//	cout << i-1 << " " << r-j << "  " << dp[i-1][r] << endl;
//if(r < 0)while(1);
if(dp[i-1][r] == -1 or dp[i-1][r] >= k){
ans.pb(j);
break;
}else{

k -= dp[i-1][r];
r--;
}
}
}

}

if(ans.size() < n)return 0;
//cout << ans.size() << endl;
bool taken[n];
FOR(i,n)taken[i] = 0;
FOR(i,n){
int ctr = 0;
FOR(j,n){
if(!taken[j]){
ctr++;
}
if(ctr > ans[i]){
taken[j] = 1;
cout << j+1 << " ";
break;
}
}
}
cout << endl;
}

return 0;
}


Please give me suggestions if anything is unclear so that I can improve. Thanks

13 Likes

Use binary seach at each position to find smallest x such that DP[x-1][R-(x-1)] \geq K

4 Likes
However, note that the gap is not important really in our calculation of how many permutations with (let’s say) x inversions we can get from this set.


The “More Explanation” heading isn’t clear.

1 Like

You’ll actually need to check sum of dp values from 1 to x - 1. You’ll also need to use ordered set to restore.

In the editorialist’s code where we are finding ‘j’ such that dp[i-1][r-j] >= k, why are we decreasing ‘k’ i.e. k -= dp[i-1][r-j] ??
Shouldn’t we find smallest ‘j’ such that dp[i-1][r-j] >= k without decreasing ‘k’ in the else part…

1 Like

Yes we are. However, that means we have already used up dp[i-1][r-j] permutations. Think about it in this way - we are trying to find the kth number (yes, we can just print K. Seems trivial but bear with me. ). Let us try to fix the first digit. Let us say K is 6421. Thus we can try to find the first digit. Note how when we iterate over say 1, and then move onto 2, we kindof might think that we have alredy used up 10001000 values, so decrease k by 1000? (here dp[i][j] means number of numbers starting with i, having j digit. obviously dp[i][j] = 10^j10
j
). Hope this provides some intuition.

2 Likes

Got it…Thanks

@rajarshi_basu Sir,what do you indicate by length of the permutation?

1 Like

how is the n^4 precomputation amortised n^3 ?

hey @rajarshi_basu can u please explain the part where u restored the permutation.

In setter’s solution, if I am returning sum instead of INF in for loop of get_sum, solution is failing. Can anyone help me understand why?

Hi,
I did not include all the details so that there is some stuff left for you to think as well

Think about how we can recreate (I have given some outline in the editorial as well in fullsolution>recreating:details> , go thru the editorialist code. Try to reason why, when I am not being able to place a character, I am subtracting from K, the dp[.][.] of the character.

1 Like

haha…will think about it …thanks for the help.

Can someone explain the approach of dp more clearly

This question is awesome and so is the editorial. Thank you.

What if the question was lexicographically largest permutation?