You are not logged in. Please login at www.codechef.com to post your questions!

×

SBSTR - Editorial

PROBLEM LINK:

Div1
Div2
Contest

Setter- Hasan Jaddouh
Tester- Jakub Safin
Editorialist- Abhishek Pandey

DIFFICULTY:

SIMPLE

PRE-REQUISITES:

Frequency Array concept's application on string, looping, logical reasoning

PROBLEM:

A string is said interesting if, it has at least $K$ distinct characters and frequency of all those characters is same. Given a string $S$, find number of interesting sub-strings in it.

QUICK EXPLANATION:

If frequency of all characters is same, it goes without saying that the value of maximum frequency is also same as value of any other frequency. Hence, we can simply generate all sub-strings in $O({N}^{2})$ and check the conditions in $O(1)$ by maintaining a $cnt$ variable to check number of distinct characters in the sub-string, along with using $MaxFreq*cnt==length$ $of$ $string$ to check if frequencies of all characters are same. (Note that $cnt$ is storing the number of distinct characters in the string).

EXPLANATION:

This is a fairly straightforward and simple problem. The editorial will have a single section, which will describe the two approaches used by setter and tester.

1. Full Solution-

By the constraints, it is clear that an $O({S}^{2})$ can pass, provided we dont indulge in very expensive operations.

Let me first put up the main code for reference and then refer to it for explanation. Also, let me use $S(i,j)$ to denote "Substring of $S$ starting at $i$ and ending at $j$"

for(int i = 0; i < N; i++) {
            int occ[26] = {};
            int mxocc = 0, cnt = 0;
            for(int j = i; j >= 0; j--) {
                if(occ[S[j]-'a'] == 0) cnt++;
                mxocc = max(mxocc, ++occ[S[j]-'a']);
                if(cnt >= K && mxocc*cnt == i-j+1) ans++;
            }
        }

What we are doing is, we are fixing the value of $i$, i.e. end point of the sub-string. From there, we traverse backwards, cumulatively storing information and checking conditions for the $S(i,j)$ and updating the answer.

To those who were not able to solve the question, a word on cumulativeness is given below. (Please note, we will be discussing a general example, which has no relation to code above)

View Content

The above solution works in $O({N}^{2})$. There is another alternative solution, which will work in $O(26*{N}^{2})$. However, the constant factor of $26$ is a bit high for constraints of $|S|=7000$, hence certain constant optimizations will be needed.

Let me denote "Frequency of characters in $S(i,j)$" as $Freq(S(i,j))$ .This approach will use the fact that,-

$Freq(S(i,j+1))=Freq(S(i,j))+Freq(S(j,j))$

We already have $Freq(S(i,j))$. Again , instead of re-computing things for entire sub-string again, use the we simply update the previous values of frequency array. Then we check if there are at least $K$ distinct characters in string or not. IF AND ONLY IF there are, then we check the frequencies of characters using frequency array. Note that skipping this optimization can lead to TLE, as we would waste $26$ iterations checking the next condition for nothing.

We will loop through all characters of alphabet, and check for their frequencies. The allowed frequencies are $0$ i.e. character not resent in string (recall we have already checked that at least $K$ distinct characters are present), or if the character is present should be some value $P$ which should also be the frequency of all other characters in the string.

This solution actually runs in $O(A*{N}^{2})$ where $A$ is size of your alphabet/number of characters allowed in alphabet. As given in question, size of alphabet can be up to $26$, hence this approach works.

As an homework, try to-

  • The first approach (whose code I posted) traverses substring from $S(j,i)$ to $S(j-1,i)$...$S(0,i)$. Make a solution traversing it from $S(i,j)$ to $S(i,j+1)$ . In other words, the code traverses substring in a right to left fashion, you should do in opposite direction.
  • Write/Implement the code for Second approach. Chef Vijju's corner has the solution if you're stuck.

SOLUTION:

For immediate availability of setter and tester's solution, they are also pasted in the tabs below. This is for your reference, and you can copy code from there to wherever you are comfortable reading them. You now dont have to wait for @admin to link solutions :)

Setter

View Content

Tester

View Content

Editorialist's Solution will be put up on demand.

$Time$ $Complexity=$ $O({N}^{2})$ or $O(\alpha* {N}^{2})$ where $\alpha$ is number of characters allowed in alphabet/language.

CHEF VIJJU'S CORNER :D

1. Solution for 2nd Approach-

View Content

2. Can we not do better than $O({N}^{2})$ for this problem? Why/Why not?

3. Upto what value of $S$ (length of string), will $O({N}^{2})$ pass? Is $2*{10}^{4}$ a good estimate? Try to explore the mathematics behind it.

4. Tester's Notes-

View Content

5. A very good problem on topic of frequency array is this one. Also, its editorial (unfortunately for you, again by me xD) has more problems on same topic.

This question is marked "community wiki".

asked 26 May '18, 15:01

vijju123's gravatar image

5★vijju123 ♦♦
15.4k12066
accept rate: 18%

edited 12 Jun '18, 17:50

admin's gravatar image

0★admin ♦♦
19.8k350498541

link or didnt happen...xD

(26 May '18, 23:53) vijju123 ♦♦5★

CHEF VIJJU'S CORNER:-

ans:-2 There are O(n^2) substrings in a string and we have to check every substring. So I think we cannot do better than O(n^2)

ans:-3 A computer does 10^8 computation/second on an average. So 2*10^4 is seems to good estimation, Because of n^2=10^8 ==> n=10^4

@vijju123 am I right?

link

answered 27 May '18, 12:37

vikram_91's gravatar image

3★vikram_91
213
accept rate: 0%

edited 27 May '18, 15:06

Ans2 - The reasoning is not correct. I can use the same reasoning to claim that "For sorting the numbers, we know that there are $N!$ permutations. We need to try each and every one of them to see if that permutation is sorted or not. Hence, sorting takes $O(N!)$ time."

Ans 3- Yes, but $2*{10}^{4}$ isnt a very good estimate for TL of 1 second, as depending on implementations, it can take 0.4 - 4 seconds.

(27 May '18, 15:16) vijju123 ♦♦5★

Ans2 - I mean we have to check every substring if it is interesting or not. There are O(n^2) substrings, so checking every substring will take at least O(n^2) time. Is there any better way to do this in less than O(n^2)?

Ans3 - In what situation string size 2∗10^4 will take 4 seconds?

(28 May '18, 13:15) vikram_913★

For Ans2- No, you may not need to check all substrings. Perhaps its possible from data of sub-strings processed till now that you can say for sure that "We dont need to check next $K$ sub-strings as as they can-never-be/will-always--be a part of answer". Like, read about centroid decomposition. It answers queries in $O(NLogN)$, and not $O({N}^{2})$ as it does not have to check all paths.

Ans3- If operations involved are very cheap (eg-bitwise operations), then ${10}^{9}\approx 1sec$. Other end is, if very expensive operations are used very frequently. $4sec$ is a loose upper-bound.

(28 May '18, 21:22) vijju123 ♦♦5★

Thank you @vijju123

(30 May '18, 10:37) vikram_913★

In this problem some solution with complexity O(nn26) also passed. link https://www.codechef.com/viewsolution/18669296 and this https://www.codechef.com/viewsolution/18660196.

link

answered 26 May '18, 23:10

akash_321's gravatar image

4★akash_321
765
accept rate: 0%

edited 26 May '18, 23:11

1

Yes, I know. I described it in editorial, however, some other implementations (which used more expensive computations, or didnt optimize) failed as well.

(26 May '18, 23:12) vijju123 ♦♦5★

@vijju123 can u provide quick brief explanation for this:- http://www.spoj.com/problems/MMASS/

plzz

(26 May '18, 23:13) code_man3★

Well that should not have happened. Why there is always some problem in codechef contest with weak test cases?

(26 May '18, 23:16) akash_3214★
1

I wil @code_man , but first check the answer by @l_returns on it. I think it satisfies what you want.

@akash_321 - I disagree there. I feel since setter and tester's solution themselves are $O({N}^{2})$, they shouldnt have even tried to stop the $O(26*{N}^{2})$ ones. When $2$ solutions are asymtotically similar, then even constant and random optimizations matter.

(26 May '18, 23:29) vijju123 ♦♦5★

@vijju123 then what is point of n<=7000 why not n<=1000 , just an extra break statement?

(26 May '18, 23:42) akash_3214★

The setter and tester did not intend that either, but they cannot indefinitely raise constraints or reduce TL, else some correct solutions will not pass. Ultimately they decided that, for second easiest problem of ltime, its ok if a few brute forces (optimized ones) pass. The tester himself made us aware by making such a solution. Its not that they were unaware of this, but its just that no correct solution should get TLE

(26 May '18, 23:51) vijju123 ♦♦5★
showing 5 of 6 show all

Why is this code leading to tle....(my code is easy to be read).... https://ideone.com/e.js/DugpK4

link

answered 26 May '18, 23:17

sharpowl_06's gravatar image

4★sharpowl_06
112
accept rate: 0%

Your solution is $O({N}^{2} LogN)$ in my opinion, which is worse than $O(26*{N}^{2})$. Map data structure has a $LogN$ factor involved.

(26 May '18, 23:37) vijju123 ♦♦5★

i used unordered map...so it should be O(1) for insertion in my opinion....and for checking the second condition, I was iterating through this unordered map, which shall be less than 26 for many cases... @vijju123

(26 May '18, 23:49) sharpowl_064★

Hmm, true, but only provided that the test cases arent anti hash :p . Still, I read somewhere that the constant for unordered map is a little high. The TL for problem is tight, so try to solve it without unordered map.

(27 May '18, 15:08) vijju123 ♦♦5★

@vijju123 can you please tell why my O(N^2) getting TLE? See this : https://www.codechef.com/viewsolution/18661895

link

answered 26 May '18, 23:48

sorb1997's gravatar image

4★sorb1997
1509
accept rate: 10%

1

for(int i = 0; i < l; ++i) { for(int j = i + k - 1; j < l; ++j) { int count_t[26] = {0}, pr = 0;

$O(26*{N}^{2})$ with expensive operations. Remember that accessing elements of $2-D$ array is slower than that of $1-D$ array. Everything matters when constraints are too tight. Refer to code under point 1. in my corner.

(27 May '18, 15:22) vijju123 ♦♦5★

Any Python Solution for 100 Points , i apply above both method but no one gives 100 Points . @vijju123 .

link

answered 27 May '18, 12:11

blackerror's gravatar image

3★blackerror
213
accept rate: 0%

The constraints were too tight. It seems Python needed more than 3x multiplier. But in general, we expect user to know which language is best to tackle a particular problem.

(27 May '18, 15:27) vijju123 ♦♦5★

@vijju123 , Atleast there should be 2 question in LunchTime or Cookoff which does not depends upon programming Language . Generally , for faster submission , user use python for it . we get more confused when there are tle in easy problem and start thinking less complexity solution

(27 May '18, 22:36) blackerror3★

True, I understand. I had discussions with them regarding this as well.

You can give that feedback at the contest announcement thread for them to see.

(28 May '18, 00:38) vijju123 ♦♦5★

Time limit is very strict. Dont know this solution is giving TLE

link

answered 27 May '18, 12:21

sonu_628's gravatar image

4★sonu_628
3458
accept rate: 8%

Invalid link. Cant see your solution.

(27 May '18, 15:28) vijju123 ♦♦5★

sorry this is the original link https://www.codechef.com/viewsolution/18674302

(29 May '18, 19:59) sonu_6284★

Any cues on optimising it further in Python? It would be of great help if someone could share their working code in Python.

Here is my implementation (Stuck at 40 points!) -- https://www.codechef.com/viewsolution/18676662

link

answered 27 May '18, 15:40

thecodearrow's gravatar image

3★thecodearrow
252
accept rate: 14%

Can you tell where my code is going wrong? Code- https://www.codechef.com/viewsolution/18677799

link

answered 27 May '18, 17:26

himanshu_hg's gravatar image

3★himanshu_hg
2
accept rate: 0%

For those asking, I solved this with Python (PYPY) for 100 points: https://www.codechef.com/viewsolution/18662068

Code could have been much more elegant (& more optimised) but a contest is always speed-coding.

link

answered 28 May '18, 01:13

pickleweasel2's gravatar image

4★pickleweasel2
1
accept rate: 0%

toggle preview
Preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported
  • mathemetical formulas in Latex between $ symbol

Question tags:

×1,173
×847
×644
×191
×11

question asked: 26 May '18, 15:01

question was seen: 1,368 times

last updated: 12 Jun '18, 17:50