MEXSTR-Editorial

PROBLEM LINK:

Practice
Div-1 Contest
Div-2 Contest
Div-3 Contest

Author: Avijeet Tiwari
Tester: Shubham Anand Jain
Editorialist: Avijeet Tiwari

DIFFICULTY:

EASY-MEDIUM

PREREQUISITES:

Binary String, Subsequence, dynamic programming, binary search

PROBLEM:

MEX of a binary string is defined as a number x such that all no. from 0 to x-1 are present as subsequence in its binary representation (without leading zeroes) in the string. Given a binary string s, Output MEX(s) (without leading zeroes) in binary format.

QUICK EXPLANATION:

Iterating over input string(S) from reverse side , we try to find dp[i] as maximum length of binary string such that all binary strings of length less than or equal to dp[i] can be obtained as subsequence in S. Considering the maximum length that can be obtained from dp[x] ,where x is position of first one in S, we intialise our answer with x+1. Now fixing 0 at current position in answer string we try to verify whether all required strings starting with 0 can be obtained or not and then restrict our solution domain accordingly.

EXPLANATION:

FOR SUBTASK 1(BRUTE FORCE):

Since size of input string is so small, we can simply iterate over all candidate binary strings, starting from st=β€œ0”,
and check one by one whether our candidate string st is present as subsequence in input string S or not. And when we reach a string st which is not present in string S, this will be our answer.
Clearly for N=10, we will have to check atmost 63 binary strings for input string(Ex 1010101010)

FOR SUBTASK 2:

Here, size of input string is of order 60, that means we can represent it via long long int.
Lets consider a vectors dp, and iterating over input string in reverse direction, we will try to calculate total no. of binary strings with most significant bit as S[i] in dp[i+1] for the substring from S[i+1:n-1].
Now for the ith position, we will find nearest position one (as n1) and nearest position zero (as n0) to right side of ith position. Considering values calculated at these positions we will try to update dp[i].
Clearly ,
if dp[n0]==dp[n1], then considering ith position as MSB, we can get all values as substring from 0 to (dp[n0]+dp[n1]-1), so dp[i]=dp[n0]+dp[n1].
otherwise dp[i]=min(dp[n0],dp[n1])*2;
NOTE: When a binary subsequence can be obtained from substring S[j:n-1] , it can also be obtained from S[i:n-1], where i<=j;

Now, considering position most significant one in input string( let’s say n1_0), we know that all no. from 0 to dp[n1_0]-1 can be obtained from string S. So we will initialise our answer (let’s say calc=dp[n1_0]) that means our answer lies above or equal to dp[n1_0].
Now from current position(let’s say cp initialised with n1_0) , we will find nearest zero (as nz) and nearest one (as no),
If dp[nz]==dp[i], then it means all strings starting with 0 from now on can be obtained as subsequence (i. e. all no. from calc to calc+dp[nz]-1 can be obtained now)
so add dp[nz] to calc and move cp to nearest one fixing current bit of our answer string as β€˜1’,
otherwise
all no. from calc to calc+dp[nz]-1 cannot be obtained in the string, so move cp to nearest zero.
Finally after coming out of the loop , we will have the value calc as the first number whose binary respresentation is not present as subsequence in input string

Original Constraints:

Here size of string is of order 1e5, which cannot be accomodated in integer format.
In continuation with approach used above for subtask 2, for this constraint we will try to store in dp[i], the maximum length of binary string such that all binary strings starting with S[i] and of size less than or equal to dp[i] can be obtained from string S[i:n-1].
Here transition will be
if dp[nz] ==dp[n1], then dp[i]=dp[nz]+1.
else dp[i]=min(dp[nz],dp[n1])+1.
Now we will try to build our answer string directly.
Considering position of most significant one as n1_0, we will create our answer string as calc of size dp[n1_0]+1.
Fixing β€˜1’ for calc[0] and cp to n1_0,
we find nearest one and nearest zero,
First we will try to fix β€˜0’ at current position and try to check if all variation with current prefix can be found or not(i.e. dp[nz]==no. of vacant spaces in calc( including currently fixed β€˜0’).
If yes, we fix β€˜1’ at current position and update cp to nearest one(n1).
else update our cp to nearest zero(n0).
Note: If cp goes out of array length then all remaining values of calc will be filled with 0.

string calc will be output for our input string.

Time Complexity: O(n).
Space Complexity: O(n).

SOLUTIONS:

Setter's Solution
#include<bits/stdc++.h>
//#include<tr1/unordered_map>
#define ll long long
#define mod 1000000007
#define mp make_pair
#define pi pair<ll,ll>
#define pb push_back
#define endl '\n'
#define ff first
#define ss second
#define print cout<<"Case #"<<ii<<": "
#define sync ios_base::sync_with_stdio(false); cin.tie(NULL);cout.tie(NULL)
using namespace std;
//using namespace std::tr1;
//011001110011
int main()
{
sync;
#ifndef ONLINE_JUDGE
		freopen("Tests2/5.in","rb",stdin);
		//freopen("Tests2/7.ou","wb",stdout);
#endif
ll t=1,i,j,k;
cin>>t;
for(ll ii=1;ii<=t;ii++)
{
	string s;
	cin>>s;
	ll n=s.size();
	vector<pi> nr(n,mp(n,n));
	vector<ll> dp(n+1,0);
	//preparing array for nearest zero and nearest
	//one after ith position
	for(i=n-2;i>=0;i--)
	{
		if(s[i+1]=='0')
			nr[i]=mp(i+1,nr[i+1].ss);
		else
			nr[i]=mp(nr[i+1].ff,i+1);
	}
	/*vector<ll> ans(1,0);   // solution for subtask 1
	while(1)
	{
		ll sp=0,f=0;
		if((s[0]-'0')!=ans.back())
		{
			if(ans.back()==0)
				sp=nr[sp].ff;
			else
				sp=nr[sp].ss;
		}
		//cout<<sp<<endl;
		if(sp>=n)
		{
			f=1;
			break;
		}

		for(i=(ll)ans.size()-2;i>=0;i--)
		{
			if(ans[i]==0)
				sp=nr[sp].ff;
			else
				sp=nr[sp].ss;
			//cout<<i<<" "<<sp<<endl;
			if(sp==n)
			{
				f=1;
				break;
			}
		}
		//cout<<f<<endl;
		if(f==1)
			break;
		ans[0]++;
		i=0;
		while(i<ans.size() && ans[i]>1)
		{
			ans[i]-=2;
			if((i+1)==ans.size())
				ans.pb(1);
			else
				ans[i+1]++;
			i++;
		}

	}
	for(i=ans.size()-1;i>=0;i--)
		cout<<ans[i];

	cout<<endl;*/
	//solution for subtask 2
	//if string contain all ones
	if(nr[0].ff>=n && s[0]!='0')
	{
		cout<<"0"<<endl;
		continue;
	}
	//if string contain all zeroes
	if(nr[0].ss>=n && s[0]!='1')
	{
		cout<<"1"<<endl;
		continue;
	}
	// preparing a dp array where dp[i] is equal to maximum x
	// such that every no. starting with s[i] from 0 to 2^x-1 can be formed
	// from substring s[i:]
	for(i=n-1;i>=0;i--)
	{
		dp[i]=min(dp[nr[i].ff],dp[nr[i].ss])+1;			
	}

	i=0;
	if(s[i]!='1') //pointing iterator i to most significant 1 in string s.
		i=nr[i].ss;
	//nb stands for number of bits in our answer sequence
	ll nb=dp[i]+1;
	vector<ll> ans(nb,0);
	ans[0]=1;
	ll nz; // iterator for nearest zero
	for(j=1;j<ans.size();j++)
	{
		nz=nr[i].ff;// fixing ans[j]=0
		if(nz>=n)
		{
			ans[j]=0;
			i=nz;
			continue;
		}
		k=min(dp[nr[nz].ff],dp[nr[nz].ss]);
		if(k==(nb-j-1)) //checking if all required no. starting with 0 can be formed or not
		{
			ans[j]=1;
			i=nr[i].ss;
		}
		else
		{
			ans[j]=0;
			i=nr[i].ff;
		}

	}
	for(i=0;i<ans.size();i++)
		cout<<ans[i];
	cout<<endl;
}
return 0;
}
Tester's Solution
    //By TheOneYouWant
#pragma GCC optimize ("-O2")
#include <bits/stdc++.h>
using namespace std;
#define fastio ios_base::sync_with_stdio(0);cin.tie(0)
#define pb push_back
#define mp make_pair
#define fi first
#define se second
#define all(x) x.begin(),x.end()
#define memreset(a) memset(a,0,sizeof(a))
#define testcase(t) int t;cin>>t;while(t--)
#define forstl(i,v) for(auto &i: v)
#define forn(i,e) for(int i=0;i<e;++i)
#define forsn(i,s,e) for(int i=s;i<e;++i)
#define rforn(i,s) for(int i=s;i>=0;--i)
#define rforsn(i,s,e) for(int i=s;i>=e;--i)
#define bitcount(a) __builtin_popcount(a) // set bits  (add ll)
#define ln '\n'
typedef long long ll;
typedef unsigned long long ull;
typedef long double ld;
typedef pair<ll,ll> p64;
typedef pair<int,int> p32;
typedef pair<int,p32> p96;
typedef vector<ll> v64;
typedef vector<int> v32; 
typedef vector<v32> vv32;
typedef vector<v64> vv64;
typedef vector<p32> vp32;
typedef vector<p64> vp64;
typedef vector<vp32> vvp32;
typedef map<int,int> m32;
const int LIM=2e5+5,MOD=1e9+7;
const ld EPS = 1e-9;

int read()
{
	int xx=0,ff=1;char ch=getchar();
	while(ch>'9'||ch<'0'){if(ch=='-')ff=-1;ch=getchar();}
	while(ch>='0'&&ch<='9'){xx=(xx<<3)+(xx<<1)+ch-'0';ch=getchar();}
	return xx*ff;
}

mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());

long long readInt(long long l, long long r, char endd) 
{
	long long x=0;
	int cnt=0;
	int fi=-1;
	bool is_neg=false;
	while(true) 
	{
		char g=getchar();
		if(g=='-') 
		{
			assert(fi==-1);
			is_neg=true;
			continue;
		}
		if('0'<=g&&g<='9')
	 	{
			x*=10;
			x+=g-'0';
			if(cnt==0) 
			{
				fi=g-'0';
			}
			cnt++;
			assert(fi!=0 || cnt==1);
			assert(fi!=0 || is_neg==false);

			assert(!(cnt>19 || ( cnt==19 && fi>1) ));
		} 
		else if(g==endd) 
		{
			if(is_neg) 
			{
				x=-x;
			}
			assert(l<=x&&x<=r);
			return x;
		} 
		else 
		{
			assert(false);
		}
	}
}
string readString(int l, int r, char endd) 
{
	string ret="";
	int cnt=0;
	while(true)
	{
		char g=getchar();
		assert(g!=-1);
		if(g==endd) 
		{
			break;
		}
		cnt++;
		ret+=g;
	}
	assert(l<=cnt&&cnt<=r);
	return ret;
}
long long readIntSp(long long l, long long r) 
{
	return readInt(l,r,' ');
}
long long readIntLn(long long l, long long r) 
{
	return readInt(l,r,'\n');
}
string readStringLn(int l, int r) 
{
	return readString(l,r,'\n');
}
string readStringSp(int l, int r) 
{
	return readString(l,r,' ');
}

int main()
{
	fastio;

	int t = readIntLn(1, 100'000);
	// int t;
	// cin>>t;
	ll sum_s = 0;

	while(t--)
	{
		// string s;
		// cin>>s;
		string s = readStringLn(1, 1'000'000);
		int n = s.length();
		sum_s += n;
		int len[n] = {0};
		// at position i, number of numbers following with j length
		ll nxt[n+1][2];
		forn(i,n+1)
		{
			forn(j,2)
			{
				nxt[i][j] = MOD;
			}
		}
		ll n_z = MOD;
		ll n_o = MOD;
		rforn(i,n-1)
		{
			nxt[i][0] = n_z;
			nxt[i][1] = n_o;
			if(s[i] == '0') n_z = i;
			else n_o = i;
		}

		if(n_z == MOD)
		{
			cout<<0<<ln;
			continue;
		}
		if(n_o == MOD)
		{
			cout<<1<<ln;
			continue;
		}
		// stored next closest 0 and 1
		// now calculate the dp

		rforn(i,n-1)
		{
			len[i] = 0;
			int no = nxt[i][1];
			int nz = nxt[i][0];
			if(nz < MOD && no < MOD)
			{
				len[i] = 1 + min(len[nz], len[no]);
			}
		}
		// start with n_o
		int curr = n_o;
		string ans = "";
		while(true)
		{
			ans += s[curr];

			int no = nxt[curr][1];
			int nz = nxt[curr][0];

			if(nz == MOD)
			{
				ans += '0';
				break;
			}
			if(no == MOD)
			{
				ans += '1';
				break;
			}
			if(len[nz] <= len[no])
			{
				curr = nz;
			}
			else
			{
				curr = no;
			}
		}
		cout<<ans<<ln;
	}
	assert(sum_s <= 2'000'000);
	assert(getchar()==EOF);

	return 0;
}
3 Likes

Why is the solution so badly formatted ?

18 Likes

Can you explain where vacant spaces are coming from in calc?

Can there be some easy video explanation for this?

Firstly We found the length of our answer string. Then While calculating answer string , we try to fix some initial part of string and check whether we can obtain all strings with our predefined prefix can be obtained or not.
No. of vacant spaces in calc denotes total remaining bits in our answer string that has not yet been fixed including the current bit which we are trying to fix.
While building our solution we try to fix β€˜0’ at current position and check if we can obtain all binary strings with this prefix or not.
If yes, the our answer string has β€˜1’ at current prefix.

2 Likes

Can anyone tell me what is wrong with my code?

Code

I really enjoyed the problem, Thanks for setting it. It was not easy to come up with the construction and it thoroughly amazed me when I was able to do it.

1 Like

The binary representation of 6 is 110 which is also not a subsequence of the given input. Then why is the answer bin(12)? it should have been bin(6). Please explain.

in given sample testcase
1001011
here β€œ110” is present thats why ans is β€œ1100” i.e 12
remember q is asking for subsequence not substring.

Any easy and understandable solutions using DP approach to this problem ?

I wrote the same logic in Python but it just wont accept the answer!!

Here’s the code:

T=int(input())

while T:

T-=1

S=list(input())

length=[0]*len(S)

c=len(S)

i=c-1

cnt=0

if '1' not in set(S):

    print('1')

    continue

if '0' not in set(S):

    print('0')

    continue

zero=False

ones=False

while i>= 0:

    length[i]=cnt

    if S[i]=='1':

        ones=True

    if S[i]=='0':

        zero=True

    if ones and zero:

        cnt+=1

        ones=zero=False

    i-=1

# print(length)

n1=n0=10**9

next0=[0]*c

next1=[0]*c

i=c-1

while i>=0:

    next0[i]=n0

    next1[i]=n1

    if S[i]=='1':

        n1=i

    else:

        n0=i

    i-=1

# print(length,next1,next0)

ans=[]

c=len(S)

cur_pos=0

while True:

    ans.append(S[cur_pos])

    p1=next1[cur_pos]

    p0=next0[cur_pos]

    if p0==10**9:

        ans.append('0')

        break

    if p1==10**9:

        ans.append('1')

        break

    if length[p0]<=length[p1]:

        cur_pos=p0

    else:

        cur_pos=p1

for i in ans:

    print(i,end='')

print()