PROBLEM LINK:Author: Lalit Kundu DIFFICULTY:SIMPLE PREREQUISITES:Dynamic programming, preprocessing, binary search, cumulative sums PROBLEM:You are given a string $S$ of length $N$ consisting only of $0$s and $1$s. You are also given an integer $K$. You have to answer $Q$ queries. In the $i$th query, two integers $L$ and $R$ are given. Then you should print the number of substrings of $S[L, R]$ which contain at most $K$ $0$s and at most $K$ $1$s where $S[L, R]$ denotes the substring from $L$th to $R$th characters of the string $S$. In other words, you have to count number of pairs $(i, j)$ of integers such that $L \le i \le j \le R$ such that no character in substring $S[i, j]$ occurs more than $K$ times. QUICK EXPLANATION:Let $\text{far}[i]$ be the last index $j$ such that $S[i, j]$ has at most $K$ $0$s and at most $K$ $1$s. Then for a given query $(L, R)$, the number of valid strings starting at index $i$ is $\min(R,\text{far}[i])i+1$ (for $L \le i \le R$). Therefore, the answer for the query $(L, R)$ is the following: $$\sum_{i=L}^R \left(\min\left(R,\text{far}[i]\right)i+1\right)$$ Note that $\text{far}[i]$ never decreases, so we can use binary search to find the maximum index $k$ such that $\text{far}[i] \le R$. The sum then becomes: $$\sum_{i=L}^k \left(\text{far}[i]i+1\right) + \sum_{i=k+1}^R \left(Ri+1\right)$$ Each of these is solvable in closed form, except possibly for range sums of $\text{far}[i]$. But for that, we can simply compute cumulative sums of $\text{far}[i]$. The algorithm runs in $O(N + Q \log N)$, but can be sped up to $O(N + Q)$ by getting rid of binary search. We'll describe this below. EXPLANATION:We will begin by describing a slow solution, and then work on improving it incrementally. We will call a string valid if there are at most $K$ $0$s and at most $K$ $1$s. Slow solutionFirst, we can simply test each substring and count all those that are valid. This can be accomplished, for example, with the following:
How fast does this go? This runs proportionally to the sum of the length of all substrings of $S[L, R]$, and one can easily compute it to be approximately $\frac{(RL+1)^3}{6}$. Since $RL+1$ is at most $N$, in the worst case the running time is approximately $\frac{N^3}{6}$ steps. This won't pass any of the subtasks! Breaking earlyThe above bruteforce solution can be optimized to pass subtask 2, by the observation that valid strings are at most $2K$ in length. Therefore, we can simply check all substrings that are at most $2K$ in length, and there are $O(NK)$ of them:
Notice that aside from reducing the limit of $j$ to $\min(2K+1,R)$, we have also added a Thus the algorithm now runs in $O(NK\min(N,K))$ time per query and $O(QNK\min(N,K))$ time overall, and passes subtask 2! Dynamic programmingIn fact, we can extend the argument we used about the However, we can exploit even more properties of the substrings we are inspecting! Notice that to check whether a substring $S[i, j]$ is valid or not, we only need to count the $0$s and $1$s in it. However, this can be computed easily once we know these counts for the substring $S[i, j1]$! Specifically, as we process the substrings $S[i, j]$ for a fixed $i$ and increasing $j$, we only need to increment
Now, this runs in $O(N\min(N,K))$ time per query, and $O(QN\min(N,K))$ time overall! However, using this might still not pass subtask 1, because there are up to $10^5$ queries. Thankfully, in subtask 1, $N$ is at most $100$, so there are only at most $N(N1)/2$ substrings $S[L, R]$. Thus, we can simply precompute the answer for all those substrings, and then answer the $Q$ queries with a simple lookup. This approach should run in $O(N^3\min(N,K)+Q)$ and should be able to pass subtask 1! Walking algorithmThere is still a way to optimize the above! Remember what we said above, that the substring of a valid string is also valid, and a superstring of any invalid string is also invalid. Therefore, if $S[i, j]$ is valid, then $S[i+1, j]$ is also valid!. Therefore, when we process the next $i$, we don't have to start $j$ from $i$ any more, because we already know many strings are valid from the previous $i$. Specifically, if $j'$ is the last $j$ such that $S[i, j]$ is valid, then $S[i+1, j']$ is also valid, so we can simply start iterating $j$ from $j'+1$ onwards. What's more, we can compute
Now, how fast does this go? Notice that there are still nested loops. However, every time the inner loop iterates, $j$ increases by $1$. Therefore, the inner loop runs in at most $RL$ steps, or $O(N)$. Therefore, the whole algorithm runs in $O(N)$ time per query, and $O(QN)$ time overall! This should be able to pass subtask 1, 2 and 3. Preprocessing and binary searchThe previous algorithm is too slow for subtask 4, because it still takes $O(N)$ time per query. In fact, this is probably the fastest we can do without some sort of preprocessing, because at the very least we have to read the string $S[L, R]$ to compute the answer, and this already takes $O(N)$ time. Thus, we will try to speed up the algorithm by preprocessing. First, note that the crucial part of the previous solution is finding, for each $i$, the first $j$ such that $S[i, j]$ is invalid, or $j > R$. However, by ignoring first the "or $j > R$" part, we see that the first such $j$ for each $i$ only depends on the string $S$! For a given $i$, let's denote that $j$ by $\text{far}[i]$ (if you read the quick explanation above, note that this $\text{far}$ is different from the $\text{far}$ there. Specifically, this one is larger by exactly $1$.). Thus, we can try to precompute $\text{far}$ at the beginning:
and use that for our queries:
The precomputation works similarly to the previous code and runs in $O(N)$, but the queries still takes $O(N)$ time each. But now, the However, we still got this nasty $\min$ term which we need to take care of. Thankfully, we can use the fact that $\text{far}[i]$ never decreases, to know that $\text{far}[i]$ will be $\le R$ at the beginning, and then as $i$ increases it will eventually exceed $R$, and once it does, it stays greater than $R$. Thus, it makes sense to find the last $k$ such that $\text{far}[k] \le R$, so the above expression becomes $$\begin{align*} &= \sum_{i=L}^k \text{far}[i] + \sum_{i=k+1}^R (R+1)  \left(\frac{R(R+1)}{2}  \frac{L(L1)}{2}\right) \\\ &= \sum_{i=L}^k \text{far}[i] + (Rk)(R+1)  \left(\frac{R(R+1)}{2}  \frac{L(L1)}{2}\right) \end{align*}$$ Now, we are almost down to $O(1)$ computation, aside from two things: finding the $k$ and a range sum for $\text{far}[i]$. But these are simple. First, $k$ can be computed with binary search, as the index such that $\text{far}[k] \le R < \text{far}[k+1]$, because $\text{far}[i]$ is monotonically nondecreasing. Also, to compute range sums for $\text{far}[i]$, one can simply use cumulative sums or prefix sums: Let $\text{sumfar}[i]$ be the sum of the $\text{far}$s until the $i$th index. Then the sum $\text{far}[i] + \text{far}[i+1] + \cdots + \text{far}[j]$ is simply $\text{sumfar}[j]  \text{sumfar}[i1]$! These are illustrated in the following code:
Using this, one can see that precomputation still runs in $O(N)$ time, and queries now run in $O(\log N)$ time each, due to the binary search. Thus, the overall algorithm runs in $O(N + Q \log N)$ time, which comfortably passes all the subtasks! Be careful with overflows! Use the right data type for this. Bonus: $\text{far}[i]$ and $\text{raf}[i]$The above solution already works, but we will introduce a final optimization here. Specifically, we will try to improve our algorithm to compute the $k$ in each query. Note that $k$ is the largest index such that $\text{far}[k] \le R$ or $k < L$. The key idea is that, by ignoring first the "or $k < L$" part, we see that $k$ is only dependent on $R$! Thus it would be nice if we are able to precompute the $k$s for all possible $R$s, and in fact it is easy to do so. Let's define a similar array $\text{raf}$, where $\text{raf}[R]$ is the smallest index $i$ such that $\text{far}[i] > R$. Using this array, one can compute $k$ simply as $\max(L,\text{raf}[R])1$ (we'll leave this to the reader to see why). Now, how do we compute all the $\text{raf}$s? The idea is that $\text{raf}$ is essentially the reverse of $\text{far}$, only that the direction is to the left rather than to the right. Thus, a similar $O(N)$ time walking algorithm can be used to compute it. We will leave this as an exercise for the reader, however, because we will show here a different way to compute it in $O(N)$ time, by exploiting the relationships between $\text{far}$ and $\text{raf}$. See the following pseudocode for details:
Finally, one can now see that it runs in $O(1)$ time per query, and $O(N + Q)$ time overall! Time Complexity:$O(N + Q)$ AUTHOR'S AND TESTER'S SOLUTIONS:
This question is marked "community wiki".
asked 16 Mar '15, 16:10

Here is my approach using Segment Trees: Let $A[i]$ (0based index) denote the number of valid strings in $[0,i]$. Let $B[i]$ denote the number of valid strings that start at an index <= i and end at an index > i Let $C[i]$ denote the minimum index $k(<=i)$, such that substring $[k,i]$ is valid. Then, the number of valid strings in $[i,j]$ is $A[j]A[i1]B[i1]$ Note that, $A[i]$ and $C[i]$ for every $i$ can be computed in $O(n)$. On the other hand, $B[i]$ can be computed as follows: I will explain with an example. Let the input string be $010010101$ and $k=3$. For each character at index $i$, lets count how much it contributes to $B[j]$ for $j < i$ and find a pattern. Lets start with $i=1$, since first character has no effect. Contribution : 1 String : 0 1 0 0 1 0 1 0 1 ^The character at $i=1$ contributes $1$ to $B[0]$ and the corresponding string being $01$. Moving on further, Contribution : 1 2 String : 0 1 0 0 1 0 1 0 1 ^Valid strings contributed to $B[1]$ are $010,10$. Valid strings contributed to $B[0]$ are $010$. Going on, Contribution : 1 2 3 String : 0 1 0 0 1 0 1 0 1 ^and on, Contribution : 1 2 3 4 String : 0 1 0 0 1 0 1 0 1 ^ Contribution : 1 2 3 String : 0 1 0 0 1 0 1 0 1 ^ Contribution : 1 2 3 4 String : 0 1 0 0 1 0 1 0 1 ^ Contribution : 1 2 3 4 String : 0 1 0 0 1 0 1 0 1 ^ Contribution : 1 2 3 4 5 String : 0 1 0 0 1 0 1 0 1 ^What we observe? For each $i$, we add the sequence $1,2,3..$ to $B$ for all the indexes in the range $[C[i],i1]$. We can use segmenttree with lazy propagation for this purpose. The whole precomputation takes $O(NlogN)$ time. Each query can be answered in $O(logN)$. Final timecomplexity being $O(NlogN + QlogN)$. Also, note that for a query $(L,R)$, if $C[R] <= L$, then answer is $(RL)*(RL1)/2$ answered 17 Mar '15, 13:11
Link to solution http://www.codechef.com/viewsolution/6443579
(17 Mar '15, 13:17)

here is my solution which precomputes the number of invalid strings for a particular index. And then applies some basic operations ignoring the invalid strings(which are not included in the given range), through BINARY SEARCH. And then from total substrings possible , those invalid are subtracted . answered 16 Mar '15, 16:39

Simpler Say : For every query (L,R) : Answer will be
where V is max index till which substring will be valid if started from index i. PreCompute Summation of V for every index. { O(n) } PreCompute Max index for which 'V' will be used in query For used L from min( V, L ), is just summation of n numbers (n*n+1/2) answered 16 Mar '15, 16:53

could any one plz tell me why my approach is given tle ...it just failing 2 subtask of 60 points answered 18 Mar '15, 01:03

@shubham54 .. the reason is that for every test case you are using multiple arrays and setting initially all of them to 0. This is what most probably is the cause of TLE in harder tasks of subtask4 . Have a look at these 2 solutions of mine. http://www.codechef.com/viewsolution/6432132 the above solution gives TLE in one of the subtask.The reason was i used 2 arrays "x" and "u" and was repeatedlty setting all of them 0 initial value which takes extra O(N) time now have a look at solution 2 http://www.codechef.com/viewsolution/6514748 I didn't do much change in it except that i used just one array and computed values through it. As you can see it got AC. It took me over 30 submissions to realise this. answered 18 Mar '15, 01:17

Hi All, Can anyone provide some tricky test cases that you must have tested on your own. With my approach, I am able to pass : 1 out of 2 for subtask 1 2 out of 2 for subtask 2 1 out of 2 for subtask 3 4 out of 9 for subtask 4 but getting WA otherwise. I couldn't seem to prove my algo incorrect. Any help is highly appreciated. answered 18 Mar '15, 20:13

Can someone tell me what is wrong with my algorithm? Exceeding time limit is fine but for some cases it is giving wrong answer. My algo involves adding answer to previous answer and sum of count of '0' and count of '1'. It is similar to dynamic algo mentioned above, where in each iteration either count1 or count0 is incremented and answer is incremented, which means answer is simply sum of count0 and count1. http://www.codechef.com/viewsolution/6774546 answered 14 Apr '15, 01:55
At first glance  you are using int, while answer can be quite large (assume that N=10^5 and K=N).
(14 Apr '15, 01:58)
But i have this python code also using same algo http://www.codechef.com/viewsolution/6774494
(14 Apr '15, 10:07)

Can some one, please explain the algorithm, I understood the implementation! answered 08 Sep '17, 23:27

Wow... Such a simple question and yet there's so much that can be done for optimisation! Beautiful!