# MEXUM - Editorial

Setter: Jubayer Nirjhor
Tester: Teja Vardhan Reddy
Editorialist: Taranpreet Singh

Easy

# PREREQUISITES:

Combinatorics and fundamental counting principle

# PROBLEM:

Given an array A of N positive integers, we define f(S) as the mex of subsequence S. Find the sum of f(S) over all subsequences of A.

# QUICK EXPLANATION

• The mex of a subsequence of length N cannot exceed N. So we can replace all elements greater than N with N.
• Fixing the value of mex, let’s try to count the number of subsequences with specific mex.
• For a given mex, we need each positive value smaller than mex to be present at least once, value mex shouldn’t be present, and values greater than mex do not affect us.
• If f_x denote the frequency of value x, the number of non-empty subsets of these f_x elements is 2^{f_x}-1. We need to select one of the subsets for each x less than current mex, giving us \displaystyle\prod_{x = 1}^{mex-1} 2^{f_x} -1
• To consider elements greater than mex, we can take any subset of those. If there are y elements greater than mex, it contributes 2^y subsets for each choice of subsets of smaller values.

# EXPLANATION

First thoughts tell us that mex of an array of length N cannot exceed N+1. The proof is easy, just try to build an array with mex N+2

Let’s try fixing the value of mex, say M such that 0 \leq M \leq N is the current mex, and trying to count the number of subsequences with mex M. Say the number of subsequences is A, it contributes M*A to the sum of mex of subsequences.

Now, for mex M, we need each value 1 \leq x < M to be present at least once. Let f_x denote the frequency of x in A.

For each x to be present once, we can visualize it as a set with f_x elements and we need to count the number of ways to select a non-empty subset, which we know, is 2^{f_x}-1

Since the choice of subset for each x is independent of other x, by fundamental counting principle, the number of ways of choosing values smaller than M such that each value is present at least once is \displaystyle\prod_{x = 1}^{M-1} 2^{f_i}-1

Now, we know that M cannot appear in subsequence. Suppose g_M denotes the number of elements greater than M, each choice of a subset of g_M elements is valid. We get 2^{g_M} choices.

Hence, the final answer can be written as \displaystyle\sum_{M = 1}^{N+1} M*2^{g_M} *\prod_{x = 1}^{M-1} 2^{f_i}-1

For implementation, sorting the elements would make things easier. Refer to my code for more details.

# TIME COMPLEXITY

The time complexity is O(N*log(N)) per test case due to sorting as well as computing powers.

# SOLUTIONS:

Setter's Solution
#include <bits/stdc++.h>

using namespace std;

typedef long long ll;

const int N = 200010;
const int MOD = 998244353;

int t, n;
ll two[N], a[N], freq[N], sf[N];

int main() {
two[0] = 1;
for (int i = 1; i < N; ++i) two[i] = 2 * two[i - 1] % MOD;
cin >> t;
while (t--) {
scanf("%d", &n);
for (int i = 0; i < n; ++i) {
scanf("%lld", a + i);
++freq[min(a[i], n + 1LL)];
}
sf[n + 69] = 0;
for (int i = n + 68; i; --i) sf[i] = sf[i + 1] + freq[i];
ll ans = 0, pf = 1;
for (ll i = 1; i <= n + 1; ++i) {
ans = (ans + i * (pf * two[sf[i + 1]] % MOD)) % MOD;
pf = pf * (two[freq[i]] - 1) % MOD;
}
ans += MOD, ans %= MOD;
printf("%lld\n", ans);
for (int i = 0; i <= n + 69; ++i) freq[i] = 0;
}
return 0;
}

Tester's Solution
//teja349
#include <bits/stdc++.h>
#include <vector>
#include <set>
#include <map>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <climits>
#include <utility>
#include <algorithm>
#include <cmath>
#include <queue>
#include <stack>
#include <iomanip>
//setbase - cout << setbase (16); cout << 100 << endl; Prints 64
//setfill -   cout << setfill ('x') << setw (5); cout << 77 << endl; prints xxx77
//setprecision - cout << setprecision (14) << f << endl; Prints x.xxxx
//cout.precision(x)  cout<<fixed<<val;  // prints x digits after decimal in val

using namespace std;
#define f(i,a,b) for(i=a;i<b;i++)
#define rep(i,n) f(i,0,n)
#define fd(i,a,b) for(i=a;i>=b;i--)
#define pb push_back
#define mp make_pair
#define vi vector< int >
#define vl vector< ll >
#define ss second
#define ff first
#define ll long long
#define pii pair< int,int >
#define pll pair< ll,ll >
#define inf (1000*1000*1000+5)
#define all(a) a.begin(),a.end()
#define tri pair<int,pii>
#define vii vector<pii>
#define vll vector<pll>
#define viii vector<tri>
#define mod (998244353)
#define pqueue priority_queue< int >
#define pdqueue priority_queue< int,vi ,greater< int > >
#define flush fflush(stdout)
#define primeDEN 727999983
#define int ll

int a[123456],two[123456];
map<int,int> mapi;
signed main(){
std::ios::sync_with_stdio(false); cin.tie(NULL);
int i;
two[0]=1;
f(i,1,123456){
two[i]=two[i-1]*2;
two[i]%=mod;
}

int t;
cin>>t;
while(t--){
mapi.clear();
int n;
cin>>n;
int i,maxi=0;
rep(i,n){
cin>>a[i];
mapi[a[i]]++;
maxi=max(maxi,a[i]);
}
mapi[maxi+1]=0;

int previ=0,sofar=0;
int befor=1,ways,tot=0;
map<int,int>::iterator it;
for(it=mapi.begin();it!=mapi.end();it++){
if(it->ff==previ+1){
ways=two[n-sofar-(it->ss)];
ways*=befor;
ways%=mod;
ways*=(it->ff);
ways%=mod;
tot+=ways;
tot%=mod;
previ=it->ff;
}
else{
ways=two[n-sofar];
ways*=befor;
ways%=mod;
ways*=(previ+1);
ways%=mod;
tot+=ways;
tot%=mod;
break;
}
sofar+=it->ss;
befor*=(two[it->ss]-1);
befor%=mod;
}
cout<<tot<<endl;
}

}

Editorialist's Solution
import java.util.*;
import java.io.*;
import java.text.*;
class MEXUM{
//SOLUTION BEGIN
long MOD = 998244353;
void pre() throws Exception{}
void solve(int TC) throws Exception{
int n = ni();
long[] a = new long[n];
for(int i = 0; i< n; i++)a[i] = nl();
Arrays.sort(a);
long ans = 0;
long ways = 1;
int prev = 0;int idx = 0;
for(int mex = 1; mex <= n; mex++){
while(prev+1 < mex){
int cnt = 0;
while(idx < n && a[idx] == prev+1){
idx++;
cnt++;
}
ways = mul(ways, add(pow(2, cnt), MOD-1));
prev++;
}
int pos = idx;
for(int p = idx; p < n && a[p] == mex; p++)pos++;
ans = add(ans, mul(mex, ways, pow(2, n-pos)));
}
pn(ans);
}
long pow(long a, long p){
long o = 1;
for(; p>0; p>>=1){
if((p&1)==1)o = mul(o, a);
a = mul(a, a);
}
return o;
}
long o = 0;
for(long x:a)o = (o+MOD+x)%MOD;
return o;
}
long mul(long... a){
long p = 1;
for(long x:a)p = (MOD+(p*x)%MOD)%MOD;
return p;
}
//SOLUTION END
void hold(boolean b)throws Exception{if(!b)throw new Exception("Hold right there, Sparky!");}
DecimalFormat df = new DecimalFormat("0.00000000000");
static boolean multipleTC = true;
void run() throws Exception{
out = new PrintWriter(System.out);
//Solution Credits: Taranpreet Singh
int T = (multipleTC)?ni():1;
pre();for(int t = 1; t<= T; t++)solve(t);
out.flush();
out.close();
}
public static void main(String[] args) throws Exception{
new MEXUM().run();
}
int bit(long n){return (n==0)?0:(1+bit(n&(n-1)));}
void p(Object o){out.print(o);}
void pn(Object o){out.println(o);}
void pni(Object o){out.println(o);out.flush();}
String n()throws Exception{return in.next();}
String nln()throws Exception{return in.nextLine();}
int ni()throws Exception{return Integer.parseInt(in.next());}
long nl()throws Exception{return Long.parseLong(in.next());}
double nd()throws Exception{return Double.parseDouble(in.next());}

StringTokenizer st;
}

}

String next() throws Exception{
while (st == null || !st.hasMoreElements()){
try{
}catch (IOException  e){
throw new Exception(e.toString());
}
}
return st.nextToken();
}

String nextLine() throws Exception{
String str = "";
try{
}catch (IOException e){
throw new Exception(e.toString());
}
return str;
}
}
}


Feel free to share your approach. Suggestions are welcomed as always.

17 Likes

can some one help me : i have done same thing in my code but giving wrong answer.

#include<bits/stdc++.h>
using namespace std;
#define ll long long
#define MOD 998244353
#define IOS std::ios::sync_with_stdio(false); cin.tie(NULL); cout.tie(NULL)

const ll N = 1e5+10;
ll a[N];
ll coun[N];
ll pref[N],mul[N],p2[N];

int main()
{

ll t,n,i,j,ans,val,temp,res,sum,v,mx;

p2[0]=1;
for(i=1;i<N;++i)
p2[i] = (p2[i-1]*2)%MOD; //calculation of powers of 2

cin>>t;

while(t--)
{
cin>>n;

for(i=0;i<=(n+2);++i)
coun[i]=pref[i]=mul[i]=0;
for(i=1;i<=n;++i)
{
cin>>a[i];
if(a[i]<=(n+1))
++coun[a[i]]; //saving the frequency
}

sum = 0;
res = 0;
//cout<<n<<"\n";

for(i=1;i<=(n+1);++i)
{
pref[i] = (pref[i-1] + coun[i])%MOD;

v = ((n - pref[i])%MOD)%MOD;

v = p2[v];

if(i==1)
{
res = (res+v)%MOD;
mul[i] = (p2[coun[i]]-1)%MOD;
}
else
{
res =  (res + (v*(mul[i-1]*i)%MOD)%MOD)%MOD ;
v = (p2[coun[i]]-1)%MOD;
mul[i] = (v*mul[i-1])%MOD;
}

}

res = (res+MOD)%MOD;

cout<<res<<"\n";

}

return 0;
}


can someone tell me why this solution is wrong? it is giving ac for one subtask.

https://www.codechef.com/viewsolution/32283888

1 Like

Hi
I have implemented with the same approach

BTW Nice Implementation Question

Can Anyone please give me a test case where my code will fail ??

Try this test case
1
20
1 1 1 2 2 2 3 3 3 6 7 7 7 8 9 9 9 10 10 10

Then this :
1
70
1 1 1 2 2 2 3 3 3 6 7 7 7 8 8 8 9 9 9 10 10 10 1 1 2 2 3 3 4 4 5 5 1 1 2 2 3 3 4 4 30 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6

My code gives correct ans
But WA for this case
1
160
1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 16 16 17 17 18 18 19 19 20 20 21 21 22 22 23 23 24 24 25 25 26 26 27 27 28 28 29 29 30 30 31 31 32 32 33 33 34 34 35 35 36 36 37 37 38 38 39 39 40 40 41 41 42 42 43 43 44 44 45 45 46 46 47 47 48 48 49 49 50 50 51 51 52 52 53 53 54 54 55 55 56 56 57 57 58 58 59 59 60 60 61 61 62 62 63 63 64 64 65 65 66 66 67 67 68 68 69 69 70 70 71 71 72 72 73 73 74 74 75 75 76 76 77 77 78 78 79 79 80 80

I got the error. I just forgot to take mod m in line 45.

I am getting WA whenever the answer exceeds the modulo value

its giving correct ans for all values less then mod value

i have put modulo everywhere then why am i not getting correct answer??

Try any test case in which ans exceeds modulo
And then please see whats wrong in my code wrt modulo

@taran_1407 I have doubt with your approach that all the values of M from 1 to n + 1 contribute to the total count every single time. Consider the following sequence - 1,1, 2, 2, 3, 3, 4, 4. Here, n = 8. Under your logic, we would add some positive count to the answer with M = 9 even when there is no scenario where f(S) or M \gt 5.
Also consider the following sequence - 1, 1, 2, 2, 5, 5, 6, 6. Here, under the current logic 2 values of M i.e 3 and 4 would contribute equally to the total count. In which, the contribution of M = 4 is completely invalid because there exists no S s.t f(S) = 4.
I might be wrong. Please clarify my doubts. Thank You.

Edit: Late realisation - when M = 4, 2^{f_{3}} - 1 = 0. Same logic applies to the first scenario. So their contributions are actually 0. Kudos, really good solution.

1 Like

You are right, no value of M > 5 contribute to answer for first sequence, since for M > 5, the product for values less than M also contains 2^{f_5}-1 = 2^0-1 = 0 implying the whole product becomes zero.

@taran_1407
Since the choice of subset for each xx is independent of other xx, by fundamental counting principle, the number of ways of choosing values smaller than MM such that each value is present at least once is \displaystyle\prod_{x = 1}^{M-1} 2^{f_i}-1
x=1
∏ 2fi -1.
M−1

This statement and next to this statement I am not able to understand please take some example so that I can Understand.

This is my submission during contest - https://www.codechef.com/viewsolution/32264554

I am storing the count of each element in cnt map. And i need to break out of loop if i found a element which is not in map.
So if i use the find function to check if an element is in cnt map, it shows TLE even on sample test case.
Can anyone explain , why this is happening?

res = (res + (v*((mul[i-1]*i)%MOD))%MOD)%MOD ;

use this…

you are getting error because of % and * Operator associativity.

1 Like

what are these values 68 and 69 in setter’s solution?

look at these solutions :-
https://www.codechef.com/viewsolution/32279342
https://www.codechef.com/viewsolution/32282055

one got wrong answer and other got ac with one mmajor change which was dividing the main statement sum = sum%m + (prev%m * (po(2,umap[j])%m*j%m)%m)%m into smaller parts in second submission

2 Likes

I have done that still no change in verdict
https://www.codechef.com/viewsolution/32296438

This code works fine for ans’s that go beyond modulo so i dont know what the problem is.

PS. CODECHEF SHOULD START RELEASING TESTCASES JUST LIKE CODEFORCES ,
IT IS REALLY FRUSTRATING WHEN THE CODE BEHAVES LIKE THIS AND THERES NOT MUCH WE CAN DO

5 Likes

When you do

(a*(b*c)%m)%m


letd=b*c

(a*d % m)%m


let e=a*d

(e%m)%m;


can u help me find mistake in my code above ??
my logic is correct
It passed on subtask but is failing others

On line 314,

add this extra mod as follows :
l[i]=( l[i-1]*( powm(2,cont.ss,mod) - 1 + mod )%mod )%mod;

l[i]=( l[i-1]*( powm(2,cont.ss,mod) - 1 )%mod )%mod;

That might be a problem

Hey!I am getting right for most of the test cases.dont know where im going wrong…can someone suggest what i have to rectify in this code .
I have initialised mul to compute the set containing all previous elements and count is my frequency array

https://ide.geeksforgeeks.org/Ovuz4zVLyK
please check this link for the code and tell me what is wrong

why my solution(link) is giving WA.

I see no diff in my sol and this(link)

This piece of code is running on my system. But is giving SIGABRT error on CodeChef IDE. Can someone explain why?

Here’s a link to my solution:

https://www.codechef.com/viewsolution/32278975