GOODPREF -Editorial

vijju123 · April 8, 2018, 7:27pm

PROBLEM LINK:

Div1
Div2
Practice

Setter- Praveen Dhinwa
Tester- Misha Chorniy
Editorialist- Abhishek Pandey

DIFFICULTY:

EASY

PRE-REQUISITES:

Strings, Basic Math, Simulation.

PROBLEM:

Given a string S , with only characters being 'a' and 'b ', find number of good prefixes in the string T which is made by concatenating N copies of S, i.e.-

T = \underbrace{S + S + \dots + S}_{N\text{ times}}

A prefix is a good prefix if and only if number of occurrences of 'a' is strictly greater than the number of occurrences of 'b'.

QUICK EXPLANATION:

We make cases for this problem. If frequency of 'a' is equal to frequency of 'b' in string, then calculating answer is simple- for we can simply use Ans=G(S)*N where G(S) denotes number of Good Prefixes in S.

If frequency of 'a' is more than to frequency of 'b' , we notice that after at most 1000 appendations, each new appendation will contribute maximum possible value only, which is K (K=Length of string) .

Similarly we can see that if frequency of 'a' is less than to frequency of 'b', after 1000 appendations, each new appendation will contribute 0 to answer.

So if frequency of character 'a' is not equal to that of 'b', we make a string T by appending S to it 1000 times, and then use mathematical formulas to find contribution of every other appendation.

EXPLANATION:

As usual, we will be discussing Editorialist’s and tester’s approach. Editorialist’s approach is based on same lines as that of setter- and hence, will form main part of editorial. We will discuss better optimized solutions (tester’s approach) and common mistakes of contestants at the bonus part of editorial

Before that, please first get acquainted with the notations I will be using.

G(S)= Number of Good Prefix in string S
freq(a,S)= Frequency of character ‘a’ in string S.
freq(b,S)= Frequency of character ‘b’ in string S.
{S}^{p}= String formed by concatenation S p times.

1. Editorialist’s/Setter’s Solution-

If you want, you can scroll down to refer to my solution side by side while reading the editorial.

The first thing to note are the constraints. 1\le N \le {10}^{9} while 1\le |S| \le {10}^{3}.

We start by calculating number of Good Prefixes in S , i.e. G(S). Along with that, we also find freq(a,S) and freq(b,S). Now we make cases-

a. When freq(a,S) == freq(b,S)-

In this case, there is no affect of previous appendations of S on incoming/new appendations of S. In other words, when freq(a,S) \neq freq(b,S), then there is some change in value of (freq(a,S)-freq(b,S)) after each appendation of S to T, which affect the contribution of any future appendation of S towards G(T). In most basic words, it means that since freq(a,S) \neq freq(b,S), then number of 'b' in T will not increase at same rate as number of 'a' at each appendation, which will affect contribution of next appendation to the final answer.

This is not the case when freq(a,S) == freq(b,S). Hence, each appendation of S will contribute G(S) towards the final answer. There are N appendations of S and we already found G(S) by simple looping earlier. Hence, for this case, Answer=G(S)*N

b. When freq(a,S) < freq(b,S)-

In this case, because freq(a,S) < freq(b,S), each new appendation of S to T will decrease (freq(a,S)-freq(b,S)). In other words, the number of 'b' in T will increase more quickly than number of 'a', which will reduce the contribution of every future appendation of S towards final answer.

We see that, after every appendation, the contribution of next appendation must reduce by at least 1 , (\because freq(a,S) < freq(b,S)). Remember that length of S is only upto 1000. This simply means, that after at most 1000 appendations, the contribution of incoming appendation reduces to 0. (In fact, 1000 is a loose upper bound, we can go closer to the real value, but for sake of explaining lets take it 1000 right now )

So, we form the string T by appending string S min(N,1000) times as-

T = \underbrace{S + S + \dots + S}_{min(N,1000)\text{ times}}

and our answer is equal to G(T) as any future appendation has 0 contribution towards final answer.

Answer=G(T)

b. When freq(a,S) > freq(b,S)-

Clearly, each new appendation of S to T will increase (freq(a,S)-freq(b,S)). Thus, the number of 'a' in T will increase more quickly than number of 'b', which will increase the contribution of every future appendation of S towards final answer.

The maximum possible contribution of an appendation to our answer can be |S|, i.e. the length of string S. Again, as seen above, after at most 1000 appendations, the next appendation will reach its maximum contribution. Hence, we do same as above, we form T by appending string S min(N,1000) times as-

T = \underbrace{S + S + \dots + S}_{min(N,1000)\text{ times}}

Now each future appendation will have maximum contribution, equal to |S|. Hence, our answer will be-

Answer=G(T)+|S|*max(0,N-1000)

Now, what is the time complexity? How many of you would agree if I said-

Time Complexity= O(min(N,1000)*|S|) \equiv O(1000*|S|) \equiv O(|S|)

Is it correct?
.
.
.
(PS: The above time complexity is a lie )

SOLUTION:

Setter
Tester - O({|S|}^{2})
Tester - O(|S|)
Editorialist - O(???)

CHEF VIJJU’S CORNER:

**1.**Most of the contestants did a good job in framing the simulation part correctly, checking for overflows, and coming at right expressions. But they still got a TLE. And let me reveal the culprit here! It was-

T=S+T

Yes, thats right! This expression should had been T+=S. The + operator takes time comparable to O(|S|+|T|) while the += symbol takes time comparable to O(|S|). Treat it like “Creating a new array with characters of T and S v/s Adding |S| characters to back of char-array T .” For reference, have a look at this solution. Replace the + operator with += and see the time diffference!

2. Tester’s O(|S|log(|S|)) solution is discussed here. When freq(a,S)> freq(b,S), what he does is, keep a track of freq(a,S)- freq(b,S) at various points of the string, and then sort them. Now, we keep a variable curr which stores the count excessive 'a ' we get after each appendation. He uses this to check at most how many prefixes more he can get by the condition have[ptr - 1] + cur > 0. Depending on that, he adds ans += (int)have.size() - ptr; to the answer. In other words, tries to go checks the least value in of freq(a,S)- freq(b,S) which curr, number of excessive 'a' s can afford, and adds to answer accordingly. Same approach used for freq(a,S)< freq(b,S) case, which is left for the readers to work out :).

3. Testers O(|S|) solution is a bit similar. Firstly, Misha eliminated sorting, and replaced it with a cumulative summation’d prefix array. Meaning, he stores the frequency of each value of freq(a,S)- freq(b,S) and does a cumulative sum of that array. Again, curr holds freq(a,S)- freq(b,S). Now, his expression is ans += cnt[2 * SHIFT - 1] - cnt[SHIFT - cur];, meaning “Max possible contribution - Contributions which cannot be afforded due to low curr”. The solution is same in essence to the first one, but he cleverly got rid of sorting, leading to a much simpler O(|S|) approach.

4. Now you have seen two optimized approaches, I will have a question for you. Have a detailed look at the time complexity I mentioned. No doubt, my solution does ,plain and simple, have a complexity of at least O(min(N,1000)*|S|) , but dont you get something fishy? There is a, very common, and intentional error involved here :). Note that, there is no sense in the "1000" I kept in O(min(N,1000)*|S|). Ask yourself, what is this 1000? It is nothing but to denote |S|!! Hence, the real complexity would be O(min(N,|S|)*|S|) \equiv O({|S|}^{2}). Dont be like this editorialist when computing time complexity xD.

5. The string in this question was made up of only 2 characters. Replace 'a' and 'b' with 1 and 0 and you have a binary string. Some questions on similar topics-

shivam_g1470 · April 17, 2018, 8:38pm

I guess we could save space by not actually appending the s to T rather running a loop till (n or 1000) * len(s). and taking a mod (n or 1000) each time we want to know that character.

ankit1573 · April 18, 2018, 1:59pm

Can anyone tell me what is wrong with my solution?

https://www.codechef.com/viewsolution/18299503

souradeep1999 · April 19, 2018, 1:21am

I use binary search to find that position after which the numbers of good prefixes becomes constant if possible.I think this may be a unique approach for this particular problem.If anyone already did this approach please share

link of my solution: CodeChef: Practical coding for everyone

If any one like my approach please upvote :).

if there is any query related to my approach please comment

abhidoeslinux · April 20, 2018, 4:26pm

Hello, I have been trying to find out the intuition behind “at most 1000 appendations” Could anybody give a proof?

l_returns · April 22, 2018, 10:10am

My solution was O(|S|) have a look…
https://www.codechef.com/viewsolution/18070826
… let’s say string is abbaabb so
I stored 1 0 -1 0 1 0 -1 in an array and computed the ans with ceiling function…
basically I calculated that for how many repetition each character will contribute…

skipper39 · April 27, 2018, 3:16pm

In tester’s O(SlogS) solution, why vector “have” is sorted?

topcoder31 · May 20, 2018, 4:03pm

I also used binary search with prefix sum array. and then simple loop.
I guess less and simple code

https://www.codechef.com/viewsolution/18606690

vijju123 · April 17, 2018, 9:08pm

Perhaps .But I preferred a solution which is easy to explain and for others to understand

vijju123 · April 18, 2018, 4:38pm

So many people got WA on final testcase of subtask 2 during the contest, and even now…And… that TC was suggested by me…xD xD

#Evil_Evil_Editorialist hahahaha

vijju123 · April 18, 2018, 7:18pm

Nice job @aryanc403 . My TC was based on one of the common mistakes I predicted . Looks like you did a good job here

vijju123 · April 18, 2018, 7:19pm

Solutions are now linked. You can have a look

ankit1573 · April 18, 2018, 11:13pm

Thank you so much @aryanc403

vijju123 · April 19, 2018, 2:06am

For you escaping them. Its definitely allowed to help and give failing cases. But not during the contest!! xD

vijju123 · April 19, 2018, 4:04pm

Commented codes are always appreciated. . Upvote gives 10 karma. Will give 15 more if you properly document it

vijju123 · April 20, 2018, 6:09pm

Ask yourself 3 questions.

If freq(‘a’)\neq freq(‘b’) , what is the minimum change in contribution of next appendation w.r.t. first one?
What is the largest possible length of original string S?
How many appendations will it take for all future appendations to reach |S| or 0, provided that the contribution behaves monotonically (eg if increasing, always increase and vice cersa).

divik544 · April 22, 2018, 7:47am

i also did it with binary search CodeChef: Practical coding for everyone

vijju123 · April 22, 2018, 4:39pm

Good job

vijju123 · April 27, 2018, 4:35pm

Refer to O(|S|) solution for now. While trying to explain, I noticed @admin linked the wrong solution there. She linked the O({S}^{2}) solution there, I will get that fixed. Sorry for the inconvenience

adist98 · October 24, 2019, 5:29pm

Really interesting question and a really interesting and thought provoking approach.