×

# SBSTR - Editorial

Tester- Jakub Safin
Editorialist- Abhishek Pandey

SIMPLE

# PRE-REQUISITES:

Frequency Array concept's application on string, looping, logical reasoning

# PROBLEM:

A string is said interesting if, it has at least $K$ distinct characters and frequency of all those characters is same. Given a string $S$, find number of interesting sub-strings in it.

# QUICK EXPLANATION:

If frequency of all characters is same, it goes without saying that the value of maximum frequency is also same as value of any other frequency. Hence, we can simply generate all sub-strings in $O({N}^{2})$ and check the conditions in $O(1)$ by maintaining a $cnt$ variable to check number of distinct characters in the sub-string, along with using $MaxFreq*cnt==length$ $of$ $string$ to check if frequencies of all characters are same. (Note that $cnt$ is storing the number of distinct characters in the string).

# EXPLANATION:

This is a fairly straightforward and simple problem. The editorial will have a single section, which will describe the two approaches used by setter and tester.

1. Full Solution-

By the constraints, it is clear that an $O({S}^{2})$ can pass, provided we dont indulge in very expensive operations.

Let me first put up the main code for reference and then refer to it for explanation. Also, let me use $S(i,j)$ to denote "Substring of $S$ starting at $i$ and ending at $j$"

for(int i = 0; i < N; i++) {
int occ[26] = {};
int mxocc = 0, cnt = 0;
for(int j = i; j >= 0; j--) {
if(occ[S[j]-'a'] == 0) cnt++;
mxocc = max(mxocc, ++occ[S[j]-'a']);
if(cnt >= K && mxocc*cnt == i-j+1) ans++;
}
}


What we are doing is, we are fixing the value of $i$, i.e. end point of the sub-string. From there, we traverse backwards, cumulatively storing information and checking conditions for the $S(i,j)$ and updating the answer.

To those who were not able to solve the question, a word on cumulativeness is given below. (Please note, we will be discussing a general example, which has no relation to code above)

View Content

The above solution works in $O({N}^{2})$. There is another alternative solution, which will work in $O(26*{N}^{2})$. However, the constant factor of $26$ is a bit high for constraints of $|S|=7000$, hence certain constant optimizations will be needed.

Let me denote "Frequency of characters in $S(i,j)$" as $Freq(S(i,j))$ .This approach will use the fact that,-

$Freq(S(i,j+1))=Freq(S(i,j))+Freq(S(j,j))$

We already have $Freq(S(i,j))$. Again , instead of re-computing things for entire sub-string again, use the we simply update the previous values of frequency array. Then we check if there are at least $K$ distinct characters in string or not. IF AND ONLY IF there are, then we check the frequencies of characters using frequency array. Note that skipping this optimization can lead to TLE, as we would waste $26$ iterations checking the next condition for nothing.

We will loop through all characters of alphabet, and check for their frequencies. The allowed frequencies are $0$ i.e. character not resent in string (recall we have already checked that at least $K$ distinct characters are present), or if the character is present should be some value $P$ which should also be the frequency of all other characters in the string.

This solution actually runs in $O(A*{N}^{2})$ where $A$ is size of your alphabet/number of characters allowed in alphabet. As given in question, size of alphabet can be up to $26$, hence this approach works.

As an homework, try to-

• The first approach (whose code I posted) traverses substring from $S(j,i)$ to $S(j-1,i)$...$S(0,i)$. Make a solution traversing it from $S(i,j)$ to $S(i,j+1)$ . In other words, the code traverses substring in a right to left fashion, you should do in opposite direction.
• Write/Implement the code for Second approach. Chef Vijju's corner has the solution if you're stuck.

# SOLUTION:

For immediate availability of setter and tester's solution, they are also pasted in the tabs below. This is for your reference, and you can copy code from there to wherever you are comfortable reading them. You now dont have to wait for @admin to link solutions :)

View Content
View Content

Editorialist's Solution will be put up on demand.

$Time$ $Complexity=$ $O({N}^{2})$ or $O(\alpha* {N}^{2})$ where $\alpha$ is number of characters allowed in alphabet/language.

# CHEF VIJJU'S CORNER :D

1. Solution for 2nd Approach-

View Content

2. Can we not do better than $O({N}^{2})$ for this problem? Why/Why not?

3. Upto what value of $S$ (length of string), will $O({N}^{2})$ pass? Is $2*{10}^{4}$ a good estimate? Try to explore the mathematics behind it.

4. Tester's Notes-

View Content

5. A very good problem on topic of frequency array is this one. Also, its editorial (unfortunately for you, again by me xD) has more problems on same topic.

This question is marked "community wiki".

15.4k12066
accept rate: 18%

19.8k350498541

(26 May '18, 23:53)

 1 CHEF VIJJU'S CORNER:- ans:-2 There are O(n^2) substrings in a string and we have to check every substring. So I think we cannot do better than O(n^2) ans:-3 A computer does 10^8 computation/second on an average. So 2*10^4 is seems to good estimation, Because of n^2=10^8 ==> n=10^4 @vijju123 am I right? answered 27 May '18, 12:37 21●3 accept rate: 0% Ans2 - The reasoning is not correct. I can use the same reasoning to claim that "For sorting the numbers, we know that there are $N!$ permutations. We need to try each and every one of them to see if that permutation is sorted or not. Hence, sorting takes $O(N!)$ time." Ans 3- Yes, but $2*{10}^{4}$ isnt a very good estimate for TL of 1 second, as depending on implementations, it can take 0.4 - 4 seconds. (27 May '18, 15:16) Ans2 - I mean we have to check every substring if it is interesting or not. There are O(n^2) substrings, so checking every substring will take at least O(n^2) time. Is there any better way to do this in less than O(n^2)? Ans3 - In what situation string size 2∗10^4 will take 4 seconds? (28 May '18, 13:15) For Ans2- No, you may not need to check all substrings. Perhaps its possible from data of sub-strings processed till now that you can say for sure that "We dont need to check next $K$ sub-strings as as they can-never-be/will-always--be a part of answer". Like, read about centroid decomposition. It answers queries in $O(NLogN)$, and not $O({N}^{2})$ as it does not have to check all paths. Ans3- If operations involved are very cheap (eg-bitwise operations), then ${10}^{9}\approx 1sec$. Other end is, if very expensive operations are used very frequently. $4sec$ is a loose upper-bound. (28 May '18, 21:22) Thank you @vijju123 (30 May '18, 10:37)
 0 In this problem some solution with complexity O(nn26) also passed. link https://www.codechef.com/viewsolution/18669296 and this https://www.codechef.com/viewsolution/18660196. answered 26 May '18, 23:10 76●5 accept rate: 0% 1 Yes, I know. I described it in editorial, however, some other implementations (which used more expensive computations, or didnt optimize) failed as well. (26 May '18, 23:12) @vijju123 can u provide quick brief explanation for this:- http://www.spoj.com/problems/MMASS/ plzz (26 May '18, 23:13) code_man3★ Well that should not have happened. Why there is always some problem in codechef contest with weak test cases? (26 May '18, 23:16) 1 I wil @code_man , but first check the answer by @l_returns on it. I think it satisfies what you want. @akash_321 - I disagree there. I feel since setter and tester's solution themselves are $O({N}^{2})$, they shouldnt have even tried to stop the $O(26*{N}^{2})$ ones. When $2$ solutions are asymtotically similar, then even constant and random optimizations matter. (26 May '18, 23:29) @vijju123 then what is point of n<=7000 why not n<=1000 , just an extra break statement? (26 May '18, 23:42) The setter and tester did not intend that either, but they cannot indefinitely raise constraints or reduce TL, else some correct solutions will not pass. Ultimately they decided that, for second easiest problem of ltime, its ok if a few brute forces (optimized ones) pass. The tester himself made us aware by making such a solution. Its not that they were unaware of this, but its just that no correct solution should get TLE (26 May '18, 23:51) showing 5 of 6 show all
 0 Why is this code leading to tle....(my code is easy to be read).... https://ideone.com/e.js/DugpK4 answered 26 May '18, 23:17 11●2 accept rate: 0% Your solution is $O({N}^{2} LogN)$ in my opinion, which is worse than $O(26*{N}^{2})$. Map data structure has a $LogN$ factor involved. (26 May '18, 23:37) i used unordered map...so it should be O(1) for insertion in my opinion....and for checking the second condition, I was iterating through this unordered map, which shall be less than 26 for many cases... @vijju123 (26 May '18, 23:49) Hmm, true, but only provided that the test cases arent anti hash :p . Still, I read somewhere that the constant for unordered map is a little high. The TL for problem is tight, so try to solve it without unordered map. (27 May '18, 15:08)
 0 @vijju123 can you please tell why my O(N^2) getting TLE? See this : https://www.codechef.com/viewsolution/18661895 answered 26 May '18, 23:48 4★sorb1997 150●9 accept rate: 10% 1 for(int i = 0; i < l; ++i) { for(int j = i + k - 1; j < l; ++j) { int count_t[26] = {0}, pr = 0; $O(26*{N}^{2})$ with expensive operations. Remember that accessing elements of $2-D$ array is slower than that of $1-D$ array. Everything matters when constraints are too tight. Refer to code under point 1. in my corner. (27 May '18, 15:22)
 0 Any Python Solution for 100 Points , i apply above both method but no one gives 100 Points . @vijju123 . answered 27 May '18, 12:11 21●3 accept rate: 0% The constraints were too tight. It seems Python needed more than 3x multiplier. But in general, we expect user to know which language is best to tackle a particular problem. (27 May '18, 15:27) @vijju123 , Atleast there should be 2 question in LunchTime or Cookoff which does not depends upon programming Language . Generally , for faster submission , user use python for it . we get more confused when there are tle in easy problem and start thinking less complexity solution (27 May '18, 22:36) True, I understand. I had discussions with them regarding this as well. You can give that feedback at the contest announcement thread for them to see. (28 May '18, 00:38)
 0 Time limit is very strict. Dont know this solution is giving TLE answered 27 May '18, 12:21 4★sonu_628 345●8 accept rate: 8% Invalid link. Cant see your solution. (27 May '18, 15:28) sorry this is the original link https://www.codechef.com/viewsolution/18674302 (29 May '18, 19:59) sonu_6284★
 0 Any cues on optimising it further in Python? It would be of great help if someone could share their working code in Python. Here is my implementation (Stuck at 40 points!) -- https://www.codechef.com/viewsolution/18676662 answered 27 May '18, 15:40 25●2 accept rate: 14%
 0 Can you tell where my code is going wrong? Code- https://www.codechef.com/viewsolution/18677799 answered 27 May '18, 17:26 2 accept rate: 0%
 0 For those asking, I solved this with Python (PYPY) for 100 points: https://www.codechef.com/viewsolution/18662068 Code could have been much more elegant (& more optimised) but a contest is always speed-coding. answered 28 May '18, 01:13 1 accept rate: 0%
 toggle preview community wiki:
Preview

By Email:

Markdown Basics

• *italic* or _italic_
• **bold** or __bold__
• image?![alt text](/path/img.jpg "title")
• numbered list: 1. Foo 2. Bar
• to add a line break simply add two spaces to where you would like the new line to be.
• basic HTML tags are also supported
• mathemetical formulas in Latex between \$ symbol

Question tags:

×1,173
×847
×644
×191
×11

question asked: 26 May '18, 15:01

question was seen: 1,368 times

last updated: 12 Jun '18, 17:50