SEQGOODNESS - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: prince_patel_8
Tester: udhav2003
Editorialist: iceknight1093

DIFFICULTY:

2167

PREREQUISITES:

Basic combinatorics and computing binomial coefficients, sorting

PROBLEM:

The goodness of a sequence S is defined as follows:

  • Sort S in increasing order;
  • Then, its goodness is the number of indices i such that S_i = i.

Given an array A, find the sum of goodness across all its subsequences.

EXPLANATION:

First, let’s try to solve this problem for when all the elements of A are distinct.

For this, we can use the technique of looking at contribution.
That is, instead of finding the sum of goodness across all sequences, we’ll find for each element the number of sequences it contributes to the goodness of.

Since all the elements are distinct, this is not too hard.
Let’s consider an element x in A, we want to count the number of subsequences of A such that S_x = x.
For this:

  • There should be exactly x-1 elements less than x, to be placed at indices 1, 2, \ldots, x-1.
  • There can be any number of elements \gt x, since they don’t affect the position of x in the sorted subsequence.

This then turns into a rather simply counting problem:

  • If there are L elements less than x in A, we can choose x-1 of them in \binom{L}{x-1} ways.
  • If there are R elements greater than x in A, we can choose any subset of them freely in 2^R ways.
  • Multiplying them together, we get \binom{L}{x-1} \cdot 2^R ways.

Finding L and R isn’t very hard.
Let’s sort the array A, then if A_i = x in this sorted array we have L = i-1 and R = N-i.

So, after sorting, we can solve the problem in \mathcal{O}(N): for each index i, add \binom{i-1}{A_i-1} \cdot 2^{N-i} to the answer.

Computing binomial coefficients quickly can be done by precomputing factorials, for example as described here.


Now, what happens if the elements aren’t necessarily distinct?
In fact, the exact same solution works!
That is, the answer is simply

\sum_{i=1}^N \binom{i-1}{A_i-1}\cdot 2^{N-i}

after A is sorted.

Proof

When elements aren’t distinct, the only real issue that we face is that which element is placed at which position in sorted order isn’t uniquely determined; which might throw off our contribution counting.
For example, the subsequence [1, 1] should be counted against exactly one of the ones; not both of them.

One way to resolve this ambiguity is to uniquely define an order among equal elements as well.
For example, if we have k ones in A, we can label them [1_1, 1_2, 1_3, \ldots, 1_k], and we can say that if 1_i and 1_j are both chosen in a subsequence (where i \lt j), then 1_i will be placed before 1_j in sorted order.

This immediately resolves the ambiguity, since if S_i = i for some subsequence, it will contribute only towards one specific i (depending on the labels of the i's chosen for this subsequence).

Now, note that because we have a uniquely defined order among all the elements, they’re all essentially distinct!
So, our initial solution, where we treated elements as distinct, works here too.

TIME COMPLEXITY

\mathcal{O}(N\log N) per test case.

CODE:

Author's code (C++)
#include <bits/stdc++.h>
 
using namespace std;
 
#define int long long int
 
const int N = 2e5 + 10;
vector<int> fact(N);
vector<int> inv(N);
const int mod = (int)1e9 + 7;
 
int power(int x, int y, int p){
    int res = 1;
    x = x % p;
    if (x == 0)
        return 0;
    while (y > 0){
        if (y & 1)
            res = (res * x) % p;
        y = y >> 1;
        x = (x * x) % p;
    }
    return res;
}
void inti() {
    fact[0] = 1;
    for (int i = 1; i < N; i++){
        fact[i] = (fact[i - 1] % mod * i % mod) % mod;
    }
    for (int i = 0; i < N; i++){
        inv[i] = power(fact[i], mod - 2, mod);
    }
}
int nCr(int n, int r){
    return (fact[n] % mod * inv[n - r] % mod * inv[r] % mod) % mod;
}
 
int32_t main() {
  ios::sync_with_stdio(false);
  cin.tie(0);
  int tt;
  inti();
  cin >> tt;
  assert(tt <= 100000);
  int sum = 0;
  while(tt--) {
    int n;
    cin >> n;
    sum += n;
    vector<int> a(n);
    for(int i = 0; i < n; i++) {
      cin >> a[i];
      assert(a[i] <= n);
    }
    int ans = 0;
    sort(a.begin(), a.end());
    for(int i = 0; i < n; i++) {
      if(a[i] <= i + 1) {
        int right = power(2, n - i - 1, mod);
        int left = nCr(i, a[i] - 1);
        ans += (left * right) % mod;
        ans %= mod;
      } 
    }
    assert(ans < mod);
    cout << ans << '\n';
  }
  assert(sum <= 200000);
}
Tester's code (C++)
#pragma GCC optimisation("O3")
#pragma GCC target("avx,avx2,fma")
#pragma GCC optimize("Ofast,unroll-loops")
#include<bits/stdc++.h>
using namespace std;
typedef unsigned long long ull;
typedef long long ll;
#define NUM1 1000000007LL
#define all(a) a.begin(), a.end()
#define beg(a) a.begin(), a.begin()
#define sq(a) ((a)*(a))
#define NUM2 998244353LL
#define MOD NUM1
#define LMOD 1000000006LL
#define fi first
#define se second
typedef long double ld;
const ll MAX = 100010;
const ll MAX2 = MAX;
const ll large = 1e18;
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());


//public free to use template by bqi343 on github
struct mi {
 	int v; explicit operator int() const { return v; } 
	mi():v(0) {}
	mi(ll _v):v(int(_v%MOD)) { v += (v<0)*MOD; }
};
mi& operator+=(mi& a, mi b) { 
	if ((a.v += b.v) >= MOD) a.v -= MOD; 
	return a; }
mi& operator-=(mi& a, mi b) { 
	if ((a.v -= b.v) < 0) a.v += MOD; 
	return a; }
mi operator+(mi a, mi b) { return a += b; }
mi operator-(mi a, mi b) { return a -= b; }
mi operator*(mi a, mi b) { return mi((ll)a.v*b.v); }
mi& operator*=(mi& a, mi b) { return a = a*b; }
mi pow(mi a, ll p) { assert(p >= 0); // won't work for negative p
	return p==0?1:pow(a*a,p/2)*(p&1?a:1); }
mi inv(mi a) { assert(a.v != 0); return pow(a,MOD-2); }
mi operator/(mi a, mi b) { return a*inv(b); }
bool operator==(mi a, mi b) { return a.v == b.v; }
bool operator==(mi a, int b){ return a.v == b;}
ostream& operator<<(ostream& os, const mi& val)
{
    os << (int) val;
    return os;
}


signed main(){
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    cout.tie(0);
    vector<mi> fact(MAX);
    fact[0] = 1;
    for(int i = 1; i < MAX; i++) fact[i] = i*fact[i - 1];
    vector<mi> invfact(MAX);
    invfact.back() = inv(fact.back());
    for(int i = MAX - 2; i >= 0; i--) invfact[i] = (i + 1)*invfact[i + 1];
    auto ncr = [&](int n, int r){
        if(n < r || r < 0) return mi(0);
        return fact[n]*invfact[r]*invfact[n - r];
    };
    int t;
    cin >> t;
    while(t--){
        int n;
        cin >> n;
        vector<int> a(n);
        for(auto& x: a) cin >> x;
        sort(all(a));
        mi val = 0;
        for(int i = 0; i < n; i++){
            val += ncr(i, a[i] - 1)*pow(mi(2), n - i - 1);
        }
        cout << val << '\n';
    }
    return 0;
}  
Editorialist's code (Python)
mod = 10**9 + 7
MX = 100005
fac = [1]
invfac = [1]
for i in range(1, MX):
	fac.append(i * fac[i-1] % mod)
	invfac.append(pow(fac[-1], mod-2, mod))
def C(n, r):
	if n < r or r < 0: return 0
	return fac[n] * invfac[r] % mod * invfac[n-r] % mod

for _ in range(int(input())):
    n = int(input())
    a = list(map(int, input().split()))
    a.sort()
    ans = 0
    for i in range(n):
        # choose a[i]-1 elements from the left, and anything from the right
        rt = pow(2, n-i-1, mod)
        lt = C(i, a[i]-1)
        ans += lt * rt % mod
    print(ans % mod)
3 Likes

Carefully knit question, an excellent one.

1 Like

Really good question, enjoyed solving it

I am trying to do it with dp but I am doing something wrong pls help!

//सर्वं खल्विदं ब्रह्मम् .....
 
#include "bits/stdc++.h"
using namespace std;
#define int long long
#define pb push_back;
#define lst vector<int>
#define max_heap priority_queue<int>
#define min_heap priority_queue <int, vector<int>, greater<int> >
#define ff first
#define ss second
#define all(v) (v).begin(),(v).end()
#define nln cout<<'\n'
#define FOR(i,n) for(int i=0;i<n;i++)
string no="NO",yes="YES";
 
inline void inArr(int A[],int n){for(int i=0;i<n;i++)cin>>A[i];}
inline void inArr(lst &A){for(int &i:A)cin>>i;}
void noEND(int x){
    
}
int M=1e9+7;
int factorial[100001] = {1};
int inverse_factorial[100001]={1};
int modInverse(int A,int m=M){
    int x0 = m;
    int y = 0, x = 1;
    if (m == 1)return 0;
    while (A > 1){
        int q = A / m;
        int t = m;
        m = A % m, A = t;
        t = y;
        y = x - q * y;
        x = t;
    }
    if (x < 0)x += x0;
    return x;
}
int fc(int n,int k){
    if(k>n)return 0ll;
    if(k==n)return 1ll;
    // for(int i=n-k+1;i<n;i++)r=r*i%M;
    return factorial[n] * inverse_factorial[k] % M * inverse_factorial[n - k] % M;
}

//auto solve(){
void solve(){
    int n;cin>>n;
    lst A(n);
    inArr(A);
    sort(all(A));
    lst dp (n,0);
    if(A[0]==1)dp[0]=1;
    for(int i=1;i<n;i++){
        dp[i]=(2ll*dp[i-1]%M+fc(i,A[i]-1))%M;
    }
    cout<<dp[n-1];nln;
}
signed main(){
    factorial[0] = 1;
    for (int i = 1; i <= 100001; i++) {
        factorial[i] = factorial[i - 1] * i % M;
        inverse_factorial[i]=modInverse( factorial[i] );
    }
    #ifdef localycompiled
        freopen("input.txt","r",stdin);
        freopen("output.txt","w",stdout);
    #endif
    ios_base::sync_with_stdio(0);
    cin.tie(0); cout.tie(0);
    int t=1,x=1;
    cin>>t;
    while(t--)solve();
    // while(t--)cout<<solve()<<'\n';
    // while(cin>>x)noEND(x);        
}
 
//मम् न शरीरमस्मि , न च चित्तास्मि ॥

https://www.codechef.com/viewsolution/96540425

I think the solution is wrong, because

There should be exactly x-1 elements less than x, to be placed at indices 1,2,...,x-1

x-1 is the maximum count of elements that can be before x.
The solution neglects the possibility that we can also take less than x-1 elements.

Due to this, the provided solution fails multiple testcases, the simplest being [2, 2].
The code provided in the editorial spits out 1 as the answer instead of 4 when given this testcase. In fact, it gives 1 for all testcases where all the elements of the array are equal to N.

The fix would be to not just pick x-1 elements from the left, but pick 0 through x-1 elements.
Meaning, instead of computing {L \choose x-1} for elements on the left, we should compute \sum\limits_{i=0}^{x-1} {L \choose i}

I believe you’ve misread (or misunderstood) the problem.

The goodness of subsequence S is the number of indices such that S_i = i after S_i is sorted.
For the example you provided, A = [2, 2]:

  • The subsequence S = [2] has a goodness of zero, because S_1 = 2 \neq 1.
    This subsequence appears twice.
  • The subsequence S = [2, 2] has a goodness of 1, because S_1 = 2 \neq 1, and S_2 = 2.

The sum of goodness across all non-empty subsequences is thus 0+0+1 = 1.
The same analysis should tell you that if all elements of the array are equal to N, the answer is 1 because any subsequence of size \lt N will have a goodness of 0, and the single subsequence of length N will have a goodness of 1 (since S_N = N).