MEXUM - Editorial

taran_1407 · April 25, 2020, 1:58pm

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2

Setter: Jubayer Nirjhor
Tester: Teja Vardhan Reddy
Editorialist: Taranpreet Singh

DIFFICULTY:

Easy

PREREQUISITES:

Combinatorics and fundamental counting principle

PROBLEM:

Given an array A of N positive integers, we define f(S) as the mex of subsequence S. Find the sum of f(S) over all subsequences of A.

QUICK EXPLANATION

The mex of a subsequence of length N cannot exceed N. So we can replace all elements greater than N with N.
Fixing the value of mex, let’s try to count the number of subsequences with specific mex.
For a given mex, we need each positive value smaller than mex to be present at least once, value mex shouldn’t be present, and values greater than mex do not affect us.
If f_x denote the frequency of value x, the number of non-empty subsets of these f_x elements is 2^{f_x}-1. We need to select one of the subsets for each x less than current mex, giving us \displaystyle\prod_{x = 1}^{mex-1} 2^{f_x} -1
To consider elements greater than mex, we can take any subset of those. If there are y elements greater than mex, it contributes 2^y subsets for each choice of subsets of smaller values.

EXPLANATION

First thoughts tell us that mex of an array of length N cannot exceed N+1. The proof is easy, just try to build an array with mex N+2

Let’s try fixing the value of mex, say M such that 0 \leq M \leq N is the current mex, and trying to count the number of subsequences with mex M. Say the number of subsequences is A, it contributes M*A to the sum of mex of subsequences.

Now, for mex M, we need each value 1 \leq x < M to be present at least once. Let f_x denote the frequency of x in A.

For each x to be present once, we can visualize it as a set with f_x elements and we need to count the number of ways to select a non-empty subset, which we know, is 2^{f_x}-1

Since the choice of subset for each x is independent of other x, by fundamental counting principle, the number of ways of choosing values smaller than M such that each value is present at least once is \displaystyle\prod_{x = 1}^{M-1} 2^{f_i}-1

Now, we know that M cannot appear in subsequence. Suppose g_M denotes the number of elements greater than M, each choice of a subset of g_M elements is valid. We get 2^{g_M} choices.

Hence, the final answer can be written as \displaystyle\sum_{M = 1}^{N+1} M*2^{g_M} *\prod_{x = 1}^{M-1} 2^{f_i}-1

For implementation, sorting the elements would make things easier. Refer to my code for more details.

TIME COMPLEXITY

The time complexity is O(N*log(N)) per test case due to sorting as well as computing powers.

SOLUTIONS:

Setter's Solution

#include <bits/stdc++.h>

using namespace std;

typedef long long ll;

const int N = 200010;
const int MOD = 998244353;

int t, n; 
ll two[N], a[N], freq[N], sf[N];

int main() {
  two[0] = 1;
  for (int i = 1; i < N; ++i) two[i] = 2 * two[i - 1] % MOD;
  cin >> t;
  while (t--) {
	scanf("%d", &n);
	for (int i = 0; i < n; ++i) {
	  scanf("%lld", a + i);
	  ++freq[min(a[i], n + 1LL)];
	}
	sf[n + 69] = 0;
	for (int i = n + 68; i; --i) sf[i] = sf[i + 1] + freq[i];
	ll ans = 0, pf = 1;
	for (ll i = 1; i <= n + 1; ++i) {
	  ans = (ans + i * (pf * two[sf[i + 1]] % MOD)) % MOD;
	  pf = pf * (two[freq[i]] - 1) % MOD;
	}
	ans += MOD, ans %= MOD;
	printf("%lld\n", ans);
	for (int i = 0; i <= n + 69; ++i) freq[i] = 0;
  }
  return 0;
}

Tester's Solution

//teja349
#include <bits/stdc++.h>
#include <vector>
#include <set>
#include <map>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <climits>
#include <utility>
#include <algorithm>
#include <cmath>
#include <queue>
#include <stack>
#include <iomanip>
//setbase - cout << setbase (16); cout << 100 << endl; Prints 64
//setfill -   cout << setfill ('x') << setw (5); cout << 77 << endl; prints xxx77
//setprecision - cout << setprecision (14) << f << endl; Prints x.xxxx
//cout.precision(x)  cout<<fixed<<val;  // prints x digits after decimal in val
 
using namespace std; 
#define f(i,a,b) for(i=a;i<b;i++)
#define rep(i,n) f(i,0,n)
#define fd(i,a,b) for(i=a;i>=b;i--)
#define pb push_back
#define mp make_pair
#define vi vector< int >
#define vl vector< ll >
#define ss second
#define ff first
#define ll long long
#define pii pair< int,int >
#define pll pair< ll,ll >
#define inf (1000*1000*1000+5)
#define all(a) a.begin(),a.end()
#define tri pair<int,pii>
#define vii vector<pii>
#define vll vector<pll>
#define viii vector<tri>
#define mod (998244353)
#define pqueue priority_queue< int >
#define pdqueue priority_queue< int,vi ,greater< int > >
#define flush fflush(stdout) 
#define primeDEN 727999983
#define int ll 
 
int a[123456],two[123456];
map<int,int> mapi; 
signed main(){
	std::ios::sync_with_stdio(false); cin.tie(NULL);
	int i;
	two[0]=1;
	f(i,1,123456){
		two[i]=two[i-1]*2;
		two[i]%=mod;
	}

	int t;
	cin>>t;
	while(t--){
		mapi.clear();
		int n;
		cin>>n;
		int i,maxi=0;
		rep(i,n){
			cin>>a[i];
			mapi[a[i]]++;
			maxi=max(maxi,a[i]);
		}
		mapi[maxi+1]=0;
	
		int previ=0,sofar=0;
		int befor=1,ways,tot=0;
		map<int,int>::iterator it;
		for(it=mapi.begin();it!=mapi.end();it++){
			if(it->ff==previ+1){
				ways=two[n-sofar-(it->ss)];
				ways*=befor;
				ways%=mod;
				ways*=(it->ff);
				ways%=mod;
				tot+=ways;
				tot%=mod;
				previ=it->ff;
			}
			else{
				ways=two[n-sofar];
				ways*=befor;
				ways%=mod;
				ways*=(previ+1);
				ways%=mod;
				tot+=ways;
				tot%=mod;
				break;
			}
			sofar+=it->ss;
			befor*=(two[it->ss]-1);
			befor%=mod;
		}
		cout<<tot<<endl;
	}
	
}

Editorialist's Solution

import java.util.*;
import java.io.*;
import java.text.*;
class MEXUM{
	//SOLUTION BEGIN
	long MOD = 998244353;
	void pre() throws Exception{}
	void solve(int TC) throws Exception{
	    int n = ni();
	    long[] a = new long[n];
	    for(int i = 0; i< n; i++)a[i] = nl();
	    Arrays.sort(a);
	    long ans = 0;
	    long ways = 1;
	    int prev = 0;int idx = 0;
	    for(int mex = 1; mex <= n; mex++){
	        while(prev+1 < mex){
	            int cnt = 0;
	            while(idx < n && a[idx] == prev+1){
	                idx++;
	                cnt++;
	            }
	            ways = mul(ways, add(pow(2, cnt), MOD-1));
	            prev++;
	        }
	        int pos = idx;
	        for(int p = idx; p < n && a[p] == mex; p++)pos++;
	        ans = add(ans, mul(mex, ways, pow(2, n-pos)));
	    }
	    pn(ans);
	}
	long pow(long a, long p){
	    long o = 1;
	    for(; p>0; p>>=1){
	        if((p&1)==1)o = mul(o, a);
	        a = mul(a, a);
	    }
	    return o;
	}
	long add(long... a){
	    long o = 0;
	    for(long x:a)o = (o+MOD+x)%MOD;
	    return o;
	}
	long mul(long... a){
	    long p = 1;
	    for(long x:a)p = (MOD+(p*x)%MOD)%MOD;
	    return p;
	}
	//SOLUTION END
	void hold(boolean b)throws Exception{if(!b)throw new Exception("Hold right there, Sparky!");}
	DecimalFormat df = new DecimalFormat("0.00000000000");
	static boolean multipleTC = true;
	FastReader in;PrintWriter out;
	void run() throws Exception{
	    in = new FastReader();
	    out = new PrintWriter(System.out);
	    //Solution Credits: Taranpreet Singh
	    int T = (multipleTC)?ni():1;
	    pre();for(int t = 1; t<= T; t++)solve(t);
	    out.flush();
	    out.close();
	}
	public static void main(String[] args) throws Exception{
	    new MEXUM().run();
	}
	int bit(long n){return (n==0)?0:(1+bit(n&(n-1)));}
	void p(Object o){out.print(o);}
	void pn(Object o){out.println(o);}
	void pni(Object o){out.println(o);out.flush();}
	String n()throws Exception{return in.next();}
	String nln()throws Exception{return in.nextLine();}
	int ni()throws Exception{return Integer.parseInt(in.next());}
	long nl()throws Exception{return Long.parseLong(in.next());}
	double nd()throws Exception{return Double.parseDouble(in.next());}

	class FastReader{
	    BufferedReader br;
	    StringTokenizer st;
	    public FastReader(){
	        br = new BufferedReader(new InputStreamReader(System.in));
	    }

	    public FastReader(String s) throws Exception{
	        br = new BufferedReader(new FileReader(s));
	    }

	    String next() throws Exception{
	        while (st == null || !st.hasMoreElements()){
	            try{
	                st = new StringTokenizer(br.readLine());
	            }catch (IOException  e){
	                throw new Exception(e.toString());
	            }
	        }
	        return st.nextToken();
	    }

	    String nextLine() throws Exception{
	        String str = "";
	        try{   
	            str = br.readLine();
	        }catch (IOException e){
	            throw new Exception(e.toString());
	        }  
	        return str;
	    }
	}
}

Feel free to share your approach. Suggestions are welcomed as always.

anon52644508 · April 25, 2020, 6:39pm

can some one help me : i have done same thing in my code but giving wrong answer.

#include<bits/stdc++.h>
using namespace std;
#define ll long long
#define MOD 998244353
#define IOS std::ios::sync_with_stdio(false); cin.tie(NULL); cout.tie(NULL)

const ll N = 1e5+10;
ll a[N];
ll coun[N];
ll pref[N],mul[N],p2[N];

int main()
{

    ll t,n,i,j,ans,val,temp,res,sum,v,mx;

    p2[0]=1;
    for(i=1;i<N;++i)
        p2[i] = (p2[i-1]*2)%MOD; //calculation of powers of 2

    cin>>t;

    while(t--)
    {
        cin>>n;

        for(i=0;i<=(n+2);++i)
                coun[i]=pref[i]=mul[i]=0;
        for(i=1;i<=n;++i)
        {
                cin>>a[i];
                if(a[i]<=(n+1))
                    ++coun[a[i]]; //saving the frequency
        }
    


        sum = 0;
        res = 0;
        //cout<<n<<"\n";

        for(i=1;i<=(n+1);++i)
        {
            pref[i] = (pref[i-1] + coun[i])%MOD;
            

            v = ((n - pref[i])%MOD)%MOD;

            


            v = p2[v];


            if(i==1)
            {
                res = (res+v)%MOD;
                mul[i] = (p2[coun[i]]-1)%MOD;
            }
            else
            {
                res =  (res + (v*(mul[i-1]*i)%MOD)%MOD)%MOD ;
                v = (p2[coun[i]]-1)%MOD;
                mul[i] = (v*mul[i-1])%MOD;  
            }


        }

        res = (res+MOD)%MOD;


        cout<<res<<"\n";


    }






    return 0;
}

abhishek_0706 · April 25, 2020, 7:05pm

can someone tell me why this solution is wrong? it is giving ac for one subtask.

https://www.codechef.com/viewsolution/32283888

ferrarixxx · April 25, 2020, 7:15pm

Hi
I have implemented with the same approach

BTW Nice Implementation Question

Can Anyone please give me a test case where my code will fail ??

AM GETTING AC ONE SUBTASK

Links to my code :
https://www.codechef.com/viewsolution/32284716

Try this test case
1
20
1 1 1 2 2 2 3 3 3 6 7 7 7 8 9 9 9 10 10 10

Then this :
1
70
1 1 1 2 2 2 3 3 3 6 7 7 7 8 8 8 9 9 9 10 10 10 1 1 2 2 3 3 4 4 5 5 1 1 2 2 3 3 4 4 30 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6

My code gives correct ans
But WA for this case
1
160
1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 16 16 17 17 18 18 19 19 20 20 21 21 22 22 23 23 24 24 25 25 26 26 27 27 28 28 29 29 30 30 31 31 32 32 33 33 34 34 35 35 36 36 37 37 38 38 39 39 40 40 41 41 42 42 43 43 44 44 45 45 46 46 47 47 48 48 49 49 50 50 51 51 52 52 53 53 54 54 55 55 56 56 57 57 58 58 59 59 60 60 61 61 62 62 63 63 64 64 65 65 66 66 67 67 68 68 69 69 70 70 71 71 72 72 73 73 74 74 75 75 76 76 77 77 78 78 79 79 80 80

abhishek_0706 · April 25, 2020, 7:17pm

I got the error. I just forgot to take mod m in line 45.

ferrarixxx · April 25, 2020, 7:31pm

I am getting WA whenever the answer exceeds the modulo value

its giving correct ans for all values less then mod value

i have put modulo everywhere then why am i not getting correct answer??

Try any test case in which ans exceeds modulo
And then please see whats wrong in my code wrt modulo

Please help guys

tomkasetty · April 26, 2020, 2:34am

@taran_1407 I have doubt with your approach that all the values of M from 1 to n + 1 contribute to the total count every single time. Consider the following sequence - 1,1, 2, 2, 3, 3, 4, 4. Here, n = 8. Under your logic, we would add some positive count to the answer with M = 9 even when there is no scenario where f(S) or M \gt 5.
Also consider the following sequence - 1, 1, 2, 2, 5, 5, 6, 6. Here, under the current logic 2 values of M i.e 3 and 4 would contribute equally to the total count. In which, the contribution of M = 4 is completely invalid because there exists no S s.t f(S) = 4.
I might be wrong. Please clarify my doubts. Thank You.

Edit: Late realisation - when M = 4, 2^{f_{3}} - 1 = 0. Same logic applies to the first scenario. So their contributions are actually 0. Kudos, really good solution.

taran_1407 · April 26, 2020, 2:52am

You are right, no value of M > 5 contribute to answer for first sequence, since for M > 5, the product for values less than M also contains 2^{f_5}-1 = 2^0-1 = 0 implying the whole product becomes zero.

anu_105 · April 26, 2020, 4:43am

@taran_1407
Since the choice of subset for each xx is independent of other xx, by fundamental counting principle, the number of ways of choosing values smaller than MM such that each value is present at least once is \displaystyle\prod_{x = 1}^{M-1} 2^{f_i}-1
x=1
∏ 2fi -1.
M−1

This statement and next to this statement I am not able to understand please take some example so that I can Understand.

blackbeard1 · April 26, 2020, 5:54am

This is my submission during contest - CodeChef: Practical coding for everyone

I am storing the count of each element in cnt map. And i need to break out of loop if i found a element which is not in map.
So if i use the find function to check if an element is in cnt map, it shows TLE even on sample test case.
Can anyone explain , why this is happening?

sparkie · April 26, 2020, 6:41am

res = (res + (v*((mul[i-1]*i)%MOD))%MOD)%MOD ;

use this…

you are getting error because of % and * Operator associativity.

burst101 · April 26, 2020, 6:54am

what are these values 68 and 69 in setter’s solution?

bharath_001 · April 26, 2020, 7:03am

look at these solutions :-
https://www.codechef.com/viewsolution/32279342
https://www.codechef.com/viewsolution/32282055

one got wrong answer and other got ac with one mmajor change which was dividing the main statement sum = sum%m + (prev%m * (po(2,umap[j])%m*j%m)%m)%m into smaller parts in second submission

ferrarixxx · April 26, 2020, 7:32am

I have done that still no change in verdict
https://www.codechef.com/viewsolution/32296438

This code works fine for ans’s that go beyond modulo so i dont know what the problem is.

PS. CODECHEF SHOULD START RELEASING TESTCASES JUST LIKE CODEFORCES ,
IT IS REALLY FRUSTRATING WHEN THE CODE BEHAVES LIKE THIS AND THERES NOT MUCH WE CAN DO

everule1 · April 26, 2020, 7:34am

When you do

(a*(b*c)%m)%m

letd=b*c

(a*d % m)%m

let e=a*d

(e%m)%m;

ferrarixxx · April 26, 2020, 7:36am

can u help me find mistake in my code above ??
my logic is correct
It passed on subtask but is failing others
Please ??

sarodesparsh · April 26, 2020, 8:00am

On line 314,

add this extra mod as follows :
l[i]=( l[i-1]*( powm(2,cont.ss,mod) - 1 + mod )%mod )%mod;

instead of :
l[i]=( l[i-1]*( powm(2,cont.ss,mod) - 1 )%mod )%mod;

That might be a problem

axelblaze0510 · April 26, 2020, 8:23am

Hey!I am getting right for most of the test cases.dont know where im going wrong…can someone suggest what i have to rectify in this code .
I have initialised mul to compute the set containing all previous elements and count is my frequency array

please check this link for the code and tell me what is wrong

manoj9april · April 26, 2020, 8:58am

why my solution(link) is giving WA.

I see no diff in my sol and this(link)

zeus0106 · April 26, 2020, 9:12am

This piece of code is running on my system. But is giving SIGABRT error on CodeChef IDE. Can someone explain why?

Here’s a link to my solution:

https://www.codechef.com/viewsolution/32278975