RICHSTR - EDITORIAL

PROBLEM LINK:

Practice

Contest: Division 1

Contest: Division 2

Setter: Erfan Alimohammadi, Rami

Tester: Teja Vardhan Reddy

Editorialist: Taranpreet Singh

DIFFICULTY:

Easy

PREREQUISITES:

Observation, RMQ or Segment Tree.

PROBLEM:

Given a string S of length N, you have to answer Q queries of the following form.

For each query, given two integers l and r, determine whether there exists any substring of String S_l, S_{l+1}, \ldots S_r, which is dominated by any character c. A string is dominated by a character c if c occur strictly more than half the length of string.

EXPLANATION

This editorial would be mostly observation, with little to code.

First of all, let’s assume S[l, r] denote substring S_l, S_{l+1}, \ldots S_r.

Let’s assume substring T with length x is dominated by some character c. WLOG assume x is even, this can be extended to odd x too. Let’s split this substring into two equal parts of length x/2 each, let’s call them left and right part.

Assume both left and right part are not dominated. Then, the maximum number of times c can occur in both strings is at most x/4 each, hence, we can have at most x/4+x/4 = x/2 occurrences of c at most, which implies that string T is not dominated by any character.

But this contradicts our assumption that T is dominated by character c.

This leads us to the conclusion that if a substring S is dominated, either left or the right half of the string is also dominated.

Hence, if we consider all substrings of S[l, r] with length >= 6, we can always have a string, it just suffices to check whether left or right half of string is dominated or not.

For strings of length 4 and 5, we can check whether they contain any substring of length 3 which is dominated or not as there cannot exist any substring of length 4 or 5 which is dominated, but no substring of such string of length 3 is dominated. (Similar proof as above). This cases arose because problem imposes the restriction on the minimum length of the dominated string.

Hence, to conclude from above, if a dominated substring T of length x> 3 exists, there also exists a substring with length 3 as a substring of T which is dominated.

So, all we need to do is consider each start point p and check whether substring S[p, p+2] is dominated or not. Let’s precompute this.

Let’s answer queries now. For each query, we are concerned for substring S[l, r]. If the length of this substring is less than 3, the answer is NO directly. Otherwise, we can consider all start points in the range [l, r-2] and if any string is dominated, substring S contains a dominated string, otherwise, no dominated string exists in this substring.

Noticing carefully, we need a range data structure, which can determine the range OR of range [l, r]. Any RMQ data structure such as sparse Table and Segment Tree would work.

Bonus What if there are updates, each update changing a single character at a specified position to c.

TIME COMPLEXITY

Time complexity is O((N+Q)*log(N)) per test case.

SOLUTIONS:

Setter 1 Solution
#include <bits/stdc++.h>
using namespace std;
 
 
const int max_n = 100100;
 
int ans[max_n];
int ln[max_n];
 
 
 
int main()
{
	ios_base::sync_with_stdio(false);
	cin.tie(0);
	int t;
	cin >> t;
	int cnt = 0;
	while(cnt < t)
	{
		int n,q;
		cin >> n >> q;
		string s;
		cin >> s;
		ans[s.length()-1] = s.length();
		for(int i=s.length()-2;i>=0;i--)
		{
			if(s[i] == s[i+1])
				ans[i] = i , ln[i] = 1;
			else if(i != s.length() - 2 && s[i] == s[i+2])
				ans[i] = i , ln[i] = 2;
			else ans[i] = ans[i+1] , ln[i] = ln[i+1]; 
 
		}
		for(int i=0;i<q;i++)
		{
			int l , r;
			cin >> l >> r;
			l-- , r--;
			if(r - l + 1 >= 3 && ans[l] + ln[l] <= r)
				cout <<"YES" << "\n";
			else 
				cout <<"NO" << "\n";
		}
		cnt++;
	}
	return 0;
}
Setter 2 Solution
#include <bits/stdc++.h>
#define ll long long
using namespace std;

char s[1001000];
int ok1[1001000];
int ok2[1001000];

int main() {
	int t;
	cin>>t;
	int n,q;
	while(t--){
	    scanf("%d%d",&n,&q);
	    scanf("%s",s+1);
	    for(int i=0 ;i <=n+2; i++)ok1[i] = ok2[i] = 0;
	    for(int i=1 ;i <=n ;i ++){
	        ok1[i] = ok1[i-1];
	        if(i+1 <= n && s[i] == s[i+1])ok1[i]++;
	    }

	    for(int i=1 ;i <=n ;i ++){
	        ok2[i] = ok2[i-1];
	        if(i+2 <= n && s[i] == s[i+2])ok2[i]++;
	    }
	    int l,r;
	    while(q--){
	        scanf("%d%d",&l,&r);
	        if(r-l+1 < 3){
	            printf("NO\n");
	            continue;
	        }
	        if(ok1[r-1]-ok1[l-1] > 0)printf("YES\n");
	        else if(ok2[r-2]-ok2[l-1] > 0)printf("YES\n");
	        else printf("NO\n");
	    }
	}
	return 0;
}
Tester's Solution
//teja349
#include <bits/stdc++.h>
#include <vector>
#include <set>
#include <map>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <climits>
#include <utility>
#include <algorithm>
#include <cmath>
#include <queue>
#include <stack>
#include <iomanip>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp> 
//setbase - cout << setbase (16); cout << 100 << endl; Prints 64
//setfill -   cout << setfill ('x') << setw (5); cout << 77 << endl; prints xxx77
//setprecision - cout << setprecision (14) << f << endl; Prints x.xxxx
//cout.precision(x)  cout<<fixed<<val;  // prints x digits after decimal in val

using namespace std;
using namespace __gnu_pbds;

#define f(i,a,b) for(i=a;i<b;i++)
#define rep(i,n) f(i,0,n)
#define fd(i,a,b) for(i=a;i>=b;i--)
#define pb push_back
#define mp make_pair
#define vi vector< int >
#define vl vector< ll >
#define ss second
#define ff first
#define ll long long
#define pii pair< int,int >
#define pll pair< ll,ll >
#define sz(a) a.size()
#define inf (1000*1000*1000+5)
#define all(a) a.begin(),a.end()
#define tri pair<int,pii>
#define vii vector<pii>
#define vll vector<pll>
#define viii vector<tri>
#define mod (1000*1000*1000+7)
#define pqueue priority_queue< int >
#define pdqueue priority_queue< int,vi ,greater< int > >
#define flush fflush(stdout) 
#define primeDEN 727999983
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
#define endl '\n'

// find_by_order()  // order_of_key
typedef tree<
int,
null_type,
less<int>,
rb_tree_tag,
tree_order_statistics_node_update>
ordered_set;

int good[123456];

int main(){
	std::ios::sync_with_stdio(false); cin.tie(NULL);
	int t;
	cin>>t;
	while(t--){
		int n,q;
		cin>>n>>q;
		int i;
		string s;
		cin>>s;
		rep(i,n+10){
			good[i]=0;
		}
		rep(i,n-1){
			if(s[i]==s[i+1])
				good[i+1]=1;
		}
		rep(i,n-2){
			if(s[i]==s[i+2])
				good[i+1]=1;
		}
		f(i,1,n+1){
			good[i]+=good[i-1];
		}
		int l,r;
		rep(i,q){
			cin>>l>>r;
			if(l+1>=r){
				cout<<"NO"<<endl;
				continue;
			}
			if(s[r-1]==s[r-2]){
				cout<<"YES"<<endl;
				continue;
			}
			if(good[r-2]>good[l-1]){
				cout<<"YES"<<endl;
				continue;
			}
			cout<<"NO"<<endl;
		}
	}
	return 0;   
}
Editorialist's Solution
import java.util.*;
import java.io.*;
import java.text.*;
class RICHSTR{
	//SOLUTION BEGIN
	void pre() throws Exception{}
	void solve(int TC) throws Exception{
	    int n = ni(), q = ni();
	    String s = n();
	    int m = 1;
	    while(m<=n)m<<=1;
	    boolean[] b = new boolean[m<<1];
	    for(int i = 0; i< n-2; i++)
	        if(s.charAt(i) == s.charAt(i+1) || s.charAt(i+1) == s.charAt(i+2) || s.charAt(i) == s.charAt(i+2))b[i+m] = true;
	    
	    for(int i = m-1; i> 0; i--)b[i] = b[i<<1]|b[i<<1|1];
	    while(q-->0){
	        int l = ni()-1, r = ni()-3;
	        pn(query(b, l, r, 0, m-1, 1)?"YES":"NO");
	    }
	}
	//Returns OR of all booleans in range[l, r]
	boolean query(boolean[] b, int l, int r, int ll, int rr, int i){
	    if(l > r)return false;
	    if(l == ll && r == rr)return b[i];
	    int mid = (ll+rr)/2;
	    if(r <= mid)return query(b, l, r, ll, mid, i<<1);
	    else if(l > mid)return query(b, l, r, mid+1, rr, i<<1|1);
	    else return query(b, l, mid, ll, mid, i<<1) || query(b, mid+1, r, mid+1, rr, i<<1|1);
	}
	//SOLUTION END
	void hold(boolean b)throws Exception{if(!b)throw new Exception("Hold right there, Sparky!");}
	DecimalFormat df = new DecimalFormat("0.00000000000");
	static boolean multipleTC = true;
	static double eps = 1e-8;
	FastReader in;PrintWriter out;
	void run() throws Exception{
	    in = new FastReader();
	    out = new PrintWriter(System.out);
	    //Solution Credits: Taranpreet Singh
	    int T = (multipleTC)?ni():1;
	    pre();for(int t = 1; t<= T; t++)solve(t);
	    out.flush();
	    out.close();
	}
	public static void main(String[] args) throws Exception{
	    new RICHSTR().run();
	}
	int bit(long n){return (n==0)?0:(1+bit(n&(n-1)));}
	void p(Object o){out.print(o);}
	void pn(Object o){out.println(o);}
	void pni(Object o){out.println(o);out.flush();}
	String n()throws Exception{return in.next();}
	String nln()throws Exception{return in.nextLine();}
	int ni()throws Exception{return Integer.parseInt(in.next());}
	long nl()throws Exception{return Long.parseLong(in.next());}
	double nd()throws Exception{return Double.parseDouble(in.next());}

	class FastReader{
	    BufferedReader br;
	    StringTokenizer st;
	    public FastReader(){
	        br = new BufferedReader(new InputStreamReader(System.in));
	    }

	    public FastReader(String s) throws Exception{
	        br = new BufferedReader(new FileReader(s));
	    }

	    String next() throws Exception{
	        while (st == null || !st.hasMoreElements()){
	            try{
	                st = new StringTokenizer(br.readLine());
	            }catch (IOException  e){
	                throw new Exception(e.toString());
	            }
	        }
	        return st.nextToken();
	    }

	    String nextLine() throws Exception{
	        String str = "";
	        try{   
	            str = br.readLine();
	        }catch (IOException e){
	            throw new Exception(e.toString());
	        }  
	        return str;
	    }
	}
}

Feel free to Share your approach, if you want to. (even if its same :stuck_out_tongue: ) . Suggestions are welcomed as always had been. :slight_smile:

6 Likes

If we precompute a boolean array having 1 if the substring of length 3 starting at position i is dominated by a character and 0 otherwise, we can compute prefix sums on this array and answer queries in O(1) time, eliminating the \mathbb{log}N factor.

14 Likes

Though I understood the solution, I think using a RMQ DS is overkill. you can solve the question by simply maintaining a Prefix Sum Array.

11 Likes

can you tell me why i am getting runtime error??
#include <bits/stdc++.h>
using namespace std;

struct range{
int start;
int end;
};

bool check(range *arr,int l,int r,int size_range){

for(int i=0;i<size_range;i++){
	if(l<=arr[i].start && arr[i].end<=r){
		return true;
	}
}
return false;

}
int main()
{
int t;
cin>>t;
while(t–){
int n,q,i,j,t;
string s;
cin>>n>>q;
cin>>s;
range arr[n];
for(i=0;i<n;i++){
arr[i].start = -1;
arr[i].end = -1;
}
t=0;

    for(i=0;i<s.size()-2;i++){
        if(s[i]==s[i+1] || s[i+1]==s[i+2] || s[i]==s[i+2]){
          arr[t].start = i;
          arr[t].end = i+2;
          t++;
		}
		
    }
    int size_range = t+1;
    while(q--){
        int l,r;
        cin>>l>>r;
        if(check(arr,l-1,r-1,size_range)==true)
        cout<<"YES"<<endl;
        else
        cout<<"NO"<<endl;
        
    }
    
}

}

heres my failed approach, i hope someone points out where i am wrong.

so i observed that if in the range l…r if there is even 1 length 3 rich string then l…r answer is YES,okay

so what i do is

1.) store the starting index of every 3 length rich string.
2.) for every query of length l…r i binary search l in the array formed in step 1 , then if i find it, i check if the index (say x) lies completely between l…r i.e. x>=l and x+2<=r if so, yes l…r is rich, else no l…r is not rich,

but this approach gave me wrong answer… can someone point out what’s wrong in this approach

my code : https://www.codechef.com/viewsolution/26007939

2 Likes

if a string is rich then it’s every three consecutive characters will form a rich substring.
by prefix sum it can be calculated in O(n) time complexity.

3 Likes

I used the same approach and got accepted. You must have made some implementation mistake. My submission: https://www.codechef.com/viewsolution/25992891

https://www.codechef.com/viewsolution/26017104

Can anyone tell why my dynamic programming approach is wrong ?
I have created a 2D array of dimension = string.length. I have started filling out the 2D array upwards of the principal diagonal. Each cell in the upper triangle represents a particular substring. For substrings of size 1 and 2 the cell holds value false. For substrings of size 3 I have checked if a particular character is repeated twice in the substring. For substrings of all other sizes I check if table[i][j-1] == true or table[i+1][j] == true then table[i][j]==true.

For once keeping aside the complexity of my solution plz help why it is not corect?
if we make an occurence array v[26][N] where v[j][i]=1 if s[i]=jth character otherwise 0. Now for every l,r the objective is to find in any v[j] Sum of any of its subarray>L i.e. for any j \sum_{i=m}^{i=n} v[j][i]>L(length of subarray i.e n-m+1)/2 where l<=m; m+1<n;n<=r. We can write the summation as \sum_{i=m}^{i=n} (2*v[j][i]-1)>0. Thus in converted array v[j]–> (2v[j]-1) we have to find a positive subarray. It can be found through a linear pass like in Kadane’s algorithm we’ll discard the negative sum and take only postive. If we found sum>0 at any point then “YES” otherwise “NO”.
Complexity wise I know it won’t work coz O(n
q) but I get WA instead of TLE in it.

There’s a more rigorous proof showing that a string is rich iff it contains a rich substring of length 3.

To prove it in one direction, assume a string s of length n is rich. Then there exists some character c which occurs in s strictly more than n/2 times. If n is even, dissect s into n/2 substrings of length 2 (so a string abaacada would be dissected into ab, aa, ca and da.) By the pigeonhole principle, one of the substrings would have both characters as c. Take this substring along with an immediately preceding or succeeding character ( which always exists because n>=3 ) to get a substring of length 3 which is rich. If n is odd, dissect the first n-1 characters into substrings of length 2, as explained before, and keep the last element aside( abaad becomes ab, aa and d ). Apply the pigeonhole principle to arrive at the conclusion that either one of the two letter substrings contains both characters as c ( and proceed as before to get a rich substring of length 3), or d and the last two letter substring will form a rich substring of length 3.

It is proved in the other direction by using the definition of a rich substring.

3 Likes

Thanks, i found my mistake… i was applying upper bound for search key L whereas we had to apply upper bound for search key L-1.

Here a similar but more superior question: Superior Substring

finding a substring of length 3 with dominating character works only when length(substr_query)<=5 . for substrings >=6 you have to check for left and right part of that string . As mentioned in the editorial

SegmentTree? :joy:
Tried to implement but came up with some logic

1 Like

Can someone tell me what is wrong in my code?
https://www.codechef.com/viewsolution/26018568

My approach:
for every index, check if it is rich string (index[i]=2 if s[i]==s[i+1], else index[i]=3 (s[i]==s[i+2])
Then for every other ‘i’ which is not rich, find the first index which is rich and change it accordingly. (what I mean is lets say index[i]=3, then if j<i and j is not rich, then index[j] = i-j + index[i]).
Then for each query, I check whether (l+index[l]<=r)

https://www.codechef.com/viewsolution/26013758

can someone help me figure out, what is going wrong here?

https://www.codechef.com/COOK109B/problems/RICHSTR/

I wrote code for the same logic but I precomputed the minimum “r” for each “l” in an array. But strangely I am getting time limit exceeded error.
Can some0ne see the solution and explain where I am going wrong?

My solution only using an array.link

Can anyone explain the segment tree approach?

2 Likes

can you explain your approach by using an example …please