You are not logged in. Please login at www.codechef.com to post your questions!

×

Most frequent substring problem.

Given a string, we want to know the maximum no. of occurrences of any substring that satisfies following two conditions:

  1. The substring's lengths is within in inclusive range of minLength to maxLength.
  2. The total no. of unique characters in the string doesn't exceed maxUnique.

For example, given a string s=abcde, minLength=2, maxLength=5, maxUnique=3, the substrings matching the criteria are (ab, bc, cd, de, abc, bcd, cde). Any shorter string fails he minLength>=2 any longer will fail maxUnique <= 3. Each of the substring occurs only one time.

INPUT : First line contains a string, second line contains minLength, third line contains maxLength, and the last line contains maxUnique.

Constraints :

  • 2<=n<=105
  • 2<=minLength<= maxLength <=26
  • maxLength<n
  • 2<=maxUnique<=26

SAMPLE INPUT :

ababab
2
3
4

SAMPLE OUTPUT :

3

EXPLANATIONS :

We want to find the no. of occurrences of the most frequently occurring substring of s= "ababab" that has the length in the inclusive range from minLength = 2 and maxLength=3 and contains maximum of maxUnique = 4 unique characters. The substring ab occurs three times aba, bab and ba occurs twice. Because we want maximum of this frequencies we return 3 as our answer.

asked 01 Sep '18, 10:43

arpit728's gravatar image

1★arpit728
6831562
accept rate: 10%

edited 01 Sep '18, 11:04


(Part 1)

Obviously we shouldn't care for maxLength,since we wanna maximize ocuurrence(as lets say if a substring of length x,occurs y times,then there must be a substring of length (x-1) that occurs>=y times)

So we just have to find max no.of occurrences of a substring of length minLength

(Part 2)

unique thing can easily be done using frequency arrays

(Part 3)

One simple way is to use rolling hashing see this: https://threads-iiith.quora.com/String-Hashing-for-competitive-programming

for counting max occurence of a substring of given length


(Part 1)

Proof: Lets say a substring s of length x occurs b times,then there will be a substring s' of length < x occuring >=b times(1 such substring would be a substring of s)

like ababa

ab occurs 2 times,so obviously a occurs>=2,b>=2

So better to find minimum Length String,as it will have less unique characters and also occurs greater number of times


(Part 2)

how to check if a substring s[i:j] is valid (no.of unique characters<=maxUnique)

lets construct a prefix sum freq array

where:

freq[i]['a']=freq[i-1]['a']+(s[i]=='a')

freq[i]['b']=freq[i-1]['b']+(s[i]=='b')

......

So for calculating no.of unique characters in s[i:j]

uniqueCount=0 for(c='a';c <= 'z';c++) if((freq[j][c]-freq[i-1][c])>0) uniqueCount++


(Part 3)

we can now iterate for all substring of length minLength

so :

for(i=0;i< s.length()-minLength;i++)
{
//so i to i+minLength-1 is substring starting at i
//first we'll check if its valid(unique characters ...),i have explained it in part 2
//if its unique,find hash value of substring s[i:i+minLength-1],which can be found quickly 
//if u use rolling hashing(read the link i provided)
int hashValue=hash(i,i+minLength-1)
map[hashValue]++;
}
Now iterate through the map to find highest frequency hash value
link

answered 01 Sep '18, 11:38

vivek_1998299's gravatar image

6★vivek_1998299
1.6k29
accept rate: 23%

edited 01 Sep '18, 11:39

@vivek_1998299

Can you please provide more detailed explanation, I am finding it hard to understand.

(01 Sep '18, 22:20) arpit7281★

OOh sorry,for the late reply!!

These bugs...

(03 Sep '18, 14:10) vivek_19982996★

@vivek_1998299

I have got the idea from the explanation and the link, can you explain me about implementing hashing. I have encountered this kind of problem for the first time that's why I am facing difficulty.

(30 Sep '18, 16:44) arpit7281★

If u had sincerely read the link ,u wouldn't have asked this..:(.

Plz read from link,and do a bit of searching too!!,i don't support spoon feeding...

(04 Oct '18, 10:21) vivek_19982996★

@vivek_1998299 In the worst case, Time complexity of your approach is O(N^2). Is there any O(N log N) algorithm for the same thing?

(20 Dec '18, 19:25) nik_845★

@vivek_1998299 In the worst case, Time complexity of your approach is O(N^2). Is there any O(N log N) algorithm for the same thing?

(20 Dec '18, 19:26) nik_845★
showing 5 of 6 show all
toggle preview
Preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported
  • mathemetical formulas in Latex between $ symbol

Question tags:

×630

question asked: 01 Sep '18, 10:43

question was seen: 4,671 times

last updated: 20 Dec '18, 19:26