LENGAME - Editorial

ajit123q · October 11, 2021, 8:04pm

PROBLEM LINK:

Practice
Div1
Div2
Div3

Setter: Tejas Pandey
Tester: Samarth Gupta
Editorialist: Ajit Sharma Kasturi

DIFFICULTY:

MEDIUM

PREREQUISITES:

Dynamic programming, Expected value, Modular multiplicative inverse

PROBLEM:

We are given an array of length N and it is shuffled. In each turn Alice picks an index which is not chosen yet and shows it to Bob. If the number in that index is larger than all the numbers taken till now then Bob adds that number to his array. We need to find out the expected length of Bob’s array modulo 998244353.

QUICK EXPLANATION:

Let us first sort the array. Let dp[i] denote the expected length of Bob’s array if the first chosen index is i.
This can be handled in two cases. If A_i = A_i+1, then dp[i] = dp[i+1] else dp[i] = 1 + \frac{dp[i+1] +dp[i+2]+\dots +dp[N]}{N-i} .
Since we can pick every starting index i with probability \frac{1}{N}, the final answer will be \frac{dp[1]+dp[2]+\dots+dp[N]}{N} .

EXPLANATION:

First let us sort the array as it makes our approach simpler and doesn’t change the final solution of the problem. This problem can be solved by using dynamic programming. For this we have to define a dp state. Let dp[i] denote the expected length of Bob’s array if the first chosen index is i.

The base case is dp[N] =1 since all the remaining elements are \leq A_N and thus cannot be further added to Bob’s array.

Now let us iterate from N-1 to 1 and calculate the dp states.

Let us suppose we are at some index i. Now we need to handle the following two cases:

Case 1: A_i = A_{i+1}

In this case, the number of elements greater than A_i is the same as that for A_{i+1} and thus the possible final Bob’s array states must also be same. Therefore, dp_i = dp_{i+1} .

Case 2: A_i \neq A_{i+1}

In this case suppose the first element we pick is A_i. The number of elements greater than A_i is N-i.
Therefore, for the next element, we could pick index {i+1} with probability \frac{1}{N-i}, index {i+2} with probability \frac{1}{N-i}, \dots, index {N} with probability \frac{1}{N-i} .
Thus, dp[i] = 1 + \frac{dp[i+1] +dp[i+2]+\dots +dp[N]}{N-i} .

Once we have our dp values, our final answer will be \frac{dp[1]+dp[2]+\dots+dp[N]}{N} . This is because we can pick the starting index as index 1 with probability \frac{1}{N}, index 2 with probability \frac{1}{N}, \dots, index N with probability \frac{1}{N} .

TIME COMPLEXITY:

O(N\log N) or O(N\log N + N\log MOD) per testcase depending on the implementation for calculating modular multiplicative inverse.

SOLUTION:

Editorialist's solution


#include <bits/stdc++.h>
#define int long long int
using namespace std;

int power(int x, int y, int MOD)
{
     if (y == 0)
          return 1;

     int temp = power(x, y / 2, MOD);
     temp = (temp * temp) % MOD;

     if (y & 1)
          temp = (temp * x) % MOD;
     return temp;
}

int calcInverse(int x, int MOD)
{
     return power(x, MOD - 2, MOD);
}

int32_t main()
{
     int tests;
     cin >> tests;

     while (tests--)
     {
          int MOD = 998244353;

          int n;
          cin >> n;

          vector<int> a(n + 1);
          for (int i = 1; i <= n; i++)
               cin >> a[i];

          sort(a.begin(), a.end());

          vector<int> dp(n + 1);
          int sum = 0;

          for (int i = n; i >= 1; i--)
          {
               if (i < n && a[i] == a[i + 1])
                    dp[i] = dp[i + 1];
               else
                    dp[i] = (1 + sum * calcInverse(n - i, MOD)) % MOD;
               sum += dp[i];
               sum %= MOD;
          }

          int ans = (sum * calcInverse(n, MOD)) % MOD;

          cout << ans << endl;
     }

     return 0;
}

Setter's solution

#include <bits/stdc++.h>
#define mod 998244353
#define maxn 500007
#define ll long long int
using namespace std;

ll inv[maxn];
long long dp[maxn];
int val[maxn];

ll mpow(ll a, ll b){
	ll res = 1;
	while(b){
		if(b&1) res *= a,res %= mod;
		a *= a;
		a %= mod;
		b >>= 1;
	}
	return res;
}

void pre(){
    inv[0] = inv[1] = 1;
	for(int i = 2;i < maxn;i++){
		inv[i] = mpow(i, mod - 2);
	}
}


int main() {
    //freopen("inp8.in", "r", stdin);
    //freopen("out8.out", "w", stdout);
    pre();
    int t;
    cin >> t;
    assert(t > 0 && t <= 100000);
    while(t--) {
        int n;
        cin >> n;
        assert(n > 0 && n <= 500000);
        for(int i = 0; i < n; i++) {
            cin >> val[i];
            assert(val[i] > 0 && val[i] <= 500000);
        }
        sort(val, val + n);
        reverse(val, val + n);
        dp[0] = 1;
        long long sm = 1;
        for(int i = 1; i < n; i++) {
            if(val[i] == val[i - 1])
                dp[i] = dp[i - 1];
            else
                dp[i] = (1 + (sm*inv[i])%mod)%mod;
            sm += dp[i];
            sm %= mod;
        }
        cout << ((sm*inv[n])%mod) << "\n";
    }
}

Tester's solution

#include <bits/stdc++.h>
using namespace std;

long long readInt(long long l, long long r, char endd) {
    long long x=0;
    int cnt=0;
    int fi=-1;
    bool is_neg=false;
    while(true) {
        char g=getchar();
        if(g=='-') {
            assert(fi==-1);
            is_neg=true;
            continue;
        }
        if('0'<=g&&g<='9') {
            x*=10;
            x+=g-'0';
            if(cnt==0) {
                fi=g-'0';
            }
            cnt++;
            assert(fi!=0 || cnt==1);
            assert(fi!=0 || is_neg==false);
 
            assert(!(cnt>19 || ( cnt==19 && fi>1) ));
        } else if(g==endd) {
            if(is_neg) {
                x=-x;
            }
            assert(l<=x&&x<=r);
            return x;
        } else {
            assert(false);
        }
    }
}
string readString(int l, int r, char endd) {
    string ret="";
    int cnt=0;
    while(true) {
        char g=getchar();
        assert(g!=-1);
        if(g==endd) {
            break;
        }
        cnt++;
        ret+=g;
    }
    assert(l<=cnt&&cnt<=r);
    return ret;
}
long long readIntSp(long long l, long long r) {
    return readInt(l,r,' ');
}
long long readIntLn(long long l, long long r) {
    return readInt(l,r,'\n');
}
string readStringLn(int l, int r) {
    return readString(l,r,'\n');
}
string readStringSp(int l, int r) {
    return readString(l,r,' ');
}
 
void readEOF(){
    assert(getchar()==EOF);
}
#define mod 998244353
int ksm(int a, int b){
    int ans = 1;
    while(b){
        if(b%2)
            ans = ans*1ll*a%mod;
        b = b/2;
        a = a*1ll*a%mod;
    }
    return ans;
}
int inv(int a){
    return ksm(a, mod - 2);
}
int main() {
	// your code goes here
	int t = readIntLn(1, 2e4);
	int sum = 0;
	while(t--){
	    int n = readIntLn(1, 5e5);
	    sum += n;
	    assert(sum <= 5e5);
	    map<int, int> mp;
	    for(int i = 1 ; i <= n ; i++){
	        int x;
	        if(i == n)
	            x = readIntLn(1, 5e5);
	        else
	            x = readIntSp(1, 5e5);
	        mp[x]++;
	    }
	    vector<int> vec;
	    for(auto j : mp)
	        vec.push_back(j.second);
	    int m = vec.size();
	    int dp[m];
	    dp[m - 1] = 1;
	    long long sum = dp[m - 1]*vec[m - 1];
	    int len = vec[m - 1];
	    for(int i = m - 2 ; i >= 0 ; i--){
	        dp[i] = (1 + sum*1ll*inv(len)%mod)%mod;
	        len += vec[i];
	        sum += dp[i]*1ll*vec[i]%mod;
	        sum %= mod;
	    }
	    sum = sum*1ll*inv(n)%mod;
	    cout << sum << '\n';
	}
	readEOF();
	return 0;
}

Please comment below if you have any questions, alternate solutions, or suggestions.

sumit_kp · October 12, 2021, 6:42pm

I think for the case when A[i]!=A[i+1], there is a typo it should be dp[i]= 1+ (that_sum)/(N-i)

hackerbhaiya · October 12, 2021, 8:47pm

In case 2, dp[i] = 1+(dp[i+1]+dp[i+2]+…+dp[N])/(N-i), please correct it.

zergtricky · October 12, 2021, 8:50pm

I had another solution:
First of all let’s sort the array. I will use zero indexing.
Due to linearity of expected value we can say that E(A) = \sum_{i=0}^{n-1} E(A_i).
Where E(A_i) is the probability that we will take A_i to our array.
E(A_i) = \frac{\sum_{k=0}^{cnt} C_{cnt}^{k}k!(n-k-1)!}{n!} where cnt is amount of number less than A_i (we can easily find cnt using lower_bound). This is because we iterate position where we can put A_i and for fixed position k we can choose C_{cnt}^{k} indexes to put to the left, for left we have k! permutations and for right we have (n - k - 1)! permutations.
It is possibly to transform our top sum to \frac{n!}{n-cnt} (I honestly have no prove for this, I found it out using Wolfram Alpha, would be glad, if someone could prove this equality). So we need to sum up all E(A_i).

ajit123q · October 13, 2021, 2:34am

Thanks. Corrected!

juang · October 13, 2021, 8:18am

Why time complexity is not O(nlogn) since you have used sort?

vyom47 · October 13, 2021, 4:53pm

Would the solution remain the same if we did not shuffle the array initially?

renauga · October 13, 2021, 4:54pm

if you expand the terms and take the terms independent of k, out of the summation, you will end up with (n-k-1)!/(cnt-k)!, which is a series which is sum of constant number of consecutive integers, where the smallest integer varies. For example, consider 1x2x 3 + 2x3x4 + 3x4x5… x(x+1)(x+2). To solve this, multiple and divide each term by 4. So each term becomes, (x(x+1)(x+2)(4))/4. The 4 in the numerator can be written as (x+3)-(x-1), and lets ignore the 4 in the denominator for now as it can be accounted for later. This is the difference of the term that would come just after the largest term in the product, and the term just before the smallest one. So, each term is (x(x+1)(x+2)((x+3)-(x-1)). Now open the last bracket. x (x+1)(x+2)(x+3)- (x-1)(x)(x+1)(x+2). This is a telescoping sum, so, easy to calculate. This can be done for any number of consecutive integer products, and u will end up with the function in your answer.

ajit123q · October 14, 2021, 5:28am

Yep. Thanks for mentioning. Will fix that.