Problem Link
Author: Bhuvnesh Jain
Tester: Mark Mikhno
Editorialist: Bhuvnesh Jain
Difficulty
MEDIUM
Prerequisites
Binary Search, Greedy, Segment trees.
Problem
You are given a sequence of integers $A_1, A_2, \dots, A_N$. For an integer $K$, let's define a $K$-compressed sequence $B_1, B_2, \dots, B_N$ as follows:
For a given integer $S$, find the number of values of $K$ ($0 \le K \le N$) such that the sum of elements of the $K$-compressed sequence does not exceed $S$.
Explanation
Let us first understand how to construct a $K$-compressed array $B$ for a given array $A$ and value $K$. For this, we iterate over the numbers in increasing order and assign to each one the smallest value that has not yet been assigned to any number smaller than it. This is optimal because every number gets the smallest value it can take, and iterating in increasing order ensures that, if the value assigned to a given number can't be decreased further, the sum can't be reduced further either. This proves our greedy strategy.
Let me explain it through an example, where $A = [4, 2, 8, 1, 4, 3, 8, 1]$ and $K = 3$. Let the current array $B$ be $[\_, \_, \_, \_, \_, \_, \_, \_]$. We now group the numbers having the same value and iterate in increasing order. The following are the values of the array $B$ after each step:
Thus, the minimum possible sum is $(3+2+4+1+3+2+4+1) = 20$. Make sure you are clear with the idea before you proceed further.
Now, there are some issues which might occur while implementing the above approach. The first is that all the numbers having the same value should be dealt with together, instead of simply iterating in increasing order. One simple counterexample is the array $A = [10, 30, 30, 20, 10]$ with $K = 1$. The compressed array should be $B = [1, 3, 3, 2, 1]$, but if we simply iterate over the numbers in increasing order, we might end up with $B$ as $[1, 2, 3, 2, 1]$ or $[1, 2, 2, 2, 1]$, depending on the implementation.
The correct logic is to first group the numbers by their values and find the value each of them might end up with, based on its range. Then, we need to check whether that tentative value is correct or not. For this, we again iterate over the numbers and group them if their ranges coincide with each other. We assign all the numbers in a group the largest value we thought of assigning to any number in that group. For the example $A = [10, 30, 30, 20, 10]$ and $K = 1$, the process is as follows:
Thus, we can easily build the $K$-compressed array using the above ideas. But how fast can we do it? Doing it naively takes $O(N^2)$, as finding the smallest number not assigned yet in a range takes $O(N)$ for that step alone. But if you restate this problem, it is similar to the following $2$ operations:
This is a very familiar problem which can be solved using segment trees in $O(\log{N})$ for each operation. You can read about it here. Thus, we can build a $K$-compressed array for a given array $A$ and $K$ in $O(N * \log{N})$ complexity.
To find which values of $K$ lead to a compressed array whose sum does not exceed $S$, we can simply iterate over all possible values of $K$ and update the answer. This approach has a complexity of $O(N * N * \log{N})$, which is enough to pass the first $2$ subtasks.
The next thing to note is that we can binary search on the answer. To prove this, we need to prove that the minimum sum of the $K$-compressed array does not decrease with increasing $K$. This again relies on the way we build our $K$-compressed array using the greedy algorithm. Since each number is compressed to the smallest possible value, the sum can't decrease with increasing $K$, as the smallest value which might get assigned to a number can only increase if its range (or window) increases. Thus, the overall complexity of the approach reduces to $O(N * {\log}^2{N})$. This is enough to pass all the subtasks as well.
Once you are clear with the above idea, you can see the author's implementation below for help. Feel free to share your approach as well, if it was somewhat different.
Note from Author:
The test data was a bit weak for the small subtasks, where some wrong greedy approaches passed as well. The large test cases ensured that wrong solutions could not pass for the full score, but I still failed to generate stronger test data for the smaller ones. This is not completely our fault, as there are many ways the greedy solution for this compression could be written, and even index errors could happen while implementing it. I would like to thank all the people who tested their solutions and helped me strengthen the tests (special thanks to Stepan).
But still, we alone cannot come up with every possible greedy solution that one might write, so stronger tests could not be prepared by me. By the time I came to know about this, it was already too late for a rejudge, but I will take care of it in future problems.
Also, the problem statement seemed quite tough to comprehend for most of the contestants. User acmonster even pointed out a flaw in the English statement. Below were his comments:
"I guess that the problem (and test data) actually requires that sequence B should preserve all the relative sizes for each index pair (i, j) such that |i - j| <= K. In other words, if A[i] < A[j], we should have B[i] < B[j]; if A[i] = A[j], we should have B[i] = B[j], etc. If this is correct, the fix does not seem to be sufficient. Consider A = (1, 1, 1, 2, 3), B = (2, 1, 1, 1, 2), and K = 2: both sequences has a "signature" of (2, 2, 2, 2, 2), but, again, the optimal sequence B does not preserve A[1] = A[2] = A[3] < A[4] < A[5]."
Though most of the contestants got what the problem statement meant, I will make sure to write better statements in the future.
Time Complexity
$O(N * {\log}^2{N})$
Space Complexity
$O(N)$
SOLUTIONS:
Author's solution can be found here.
Tester's solution can be found here.
Editorialist's solution can be found here.
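The construction described in the editorial can be sketched roughly as follows. This is only an illustration, not the author's code: it assumes the intended statement quoted in the author's note (relative order must be preserved for every pair with |i - j| <= K), it assumes that the two segment tree operations are a point update and a range maximum, and the names `MaxSegmentTree` and `k_compress` are made up for this sketch. Equal values are merged into one group when consecutive occurrences are at most K apart, and each group receives one more than the largest value already assigned inside the union of its windows.

```python
class MaxSegmentTree:
    """Iterative segment tree: point update, range maximum (0 = unassigned)."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (2 * n)

    def update(self, pos, value):
        pos += self.n
        self.tree[pos] = value
        pos //= 2
        while pos >= 1:
            self.tree[pos] = max(self.tree[2 * pos], self.tree[2 * pos + 1])
            pos //= 2

    def query(self, lo, hi):
        """Maximum over the inclusive index range [lo, hi]."""
        best = 0
        lo += self.n
        hi += self.n + 1
        while lo < hi:
            if lo & 1:
                best = max(best, self.tree[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                best = max(best, self.tree[hi])
            lo //= 2
            hi //= 2
        return best


def k_compress(a, k):
    """Greedy K-compression under the assumed 'preserve relative order' rule."""
    n = len(a)
    b = [0] * n
    tree = MaxSegmentTree(n)
    # Process indices by increasing value; ties are ordered by position.
    order = sorted(range(n), key=lambda i: (a[i], i))
    pos = 0
    while pos < n:
        # Extend the group while the next equal value is at most k away.
        end = pos
        while (end + 1 < n
               and a[order[end + 1]] == a[order[pos]]
               and order[end + 1] - order[end] <= k):
            end += 1
        first, last = order[pos], order[end]
        # Only strictly smaller values are assigned inside this window so far.
        value = 1 + tree.query(max(0, first - k), min(n - 1, last + k))
        for t in range(pos, end + 1):
            b[order[t]] = value
            tree.update(order[t], value)
        pos = end + 1
    return b
```

On the two arrays used in the editorial, `k_compress([4, 2, 8, 1, 4, 3, 8, 1], 3)` gives `[3, 2, 4, 1, 3, 2, 4, 1]` (sum 20) and `k_compress([10, 30, 30, 20, 10], 1)` gives `[1, 3, 3, 2, 1]`. Each call costs O(N log N) because every index is updated once and every group issues one range query.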
This question is marked "community wiki".
asked 13 Aug '18, 15:01

I think that the solution is incorrect, since the cost of the K-compressed sequence is not monotonic in K. Here is a simple counterexample. Consider sequence A = (2, 1, 2, 3, 4). One can verify that one of its 1-compressed sequences is (2, 1, 2, 3, 4) itself, which has a cost of 12. However, one of its 2-compressed sequences is (3, 1, 2, 2, 3), which, in fact, has a lower cost of 11. answered 16 Aug '18, 13:44
Also note that in the K = 2 case, though we have A[1] = A[3] in the original sequence, we can achieve a smaller sum only if we choose to assign different values to B[1] and B[3]. This shows that the greedy algorithm mentioned in the solution is not optimal.
(16 Aug '18, 13:50)
Hey @acmonster, if we assign $B_1$ and $B_3$ different values, then doesn't point $2$ of the question get violated? It says that if there are $X$ elements smaller than $A_i$ in that range, then this must be the same for $B_i$ as well. For your sequence, there is $1$ element smaller than $A_1$ in range $[1,3]$, but $2$ elements smaller than $B_1$ in range $[1,3]$, which I feel makes your sequence invalid. Can you please have a look? :)
(16 Aug '18, 18:13)
Hi @vijju123, the problem statement was updated and it now counts the number of numbers "smaller than or equal to" A[i] and B[i].
(16 Aug '18, 18:40)
@vijju123 Moreover, under the "smaller than" version of the problem that you referred to, we can still construct a counterexample of A = (3, 4, 3, 2, 1), whose 1-compressed sequence is (1, 4, 3, 2, 1) (with sum 11), and whose 2-compressed sequence is (1, 3, 2, 2, 1) (with a smaller sum of 9).
(16 Aug '18, 18:44)
Oh! I missed the statement update. Sorry, and thanks for telling that :). I will forward your concern to @mgch to have a look.
(16 Aug '18, 19:10)
Why?? If the statement is updated, there should be an announcement, but there was none. A rule for problem setters should be made: even if he wants to add just a ".", he should be allowed to do so only after making an announcement on the contest page. By the way, when was this update made??
(17 Aug '18, 13:37)
My comment on Magic Set July'18, which nobody bothered to reply to: "Test Case 1: Subsequences of [1,2] are [1],[2],[1,2], and the sum of all these is 6 (divisible by 3). So the answer should be 1? Is answer 0 correct?" "When was "each" added in the problem statement? And why was no announcement made after this change?" I remember reading the statement multiple times because I got the intuition that this is what the statement is asking. But "each" was not present there.
(17 Aug '18, 13:41)

Managed to optimize my brute-force approach and solve it without a segment tree. Time and space complexity remain the same as in this editorial. Binary search was the key to solving this problem. My solution: https://www.codechef.com/viewsolution/19673140 answered 15 Aug '18, 00:18

It doesn't look to me like the described greedy solution works. I also had the intuition that something like that would work (e.g. "all equal values in A that are within each other's range will be the same in B") but I managed to disprove all of these intuitions that I had. For example, for A = [1, 1, 2, 2] and K = 1, the setter's solution computes B = [1, 1, 2, 2], but in fact B = [2, 1, 1, 1] is valid too, with a smaller sum of 5. So I wonder if someone can explain a correct algorithm to compute the K-compressed sequence in O(n lg n) time. answered 18 Aug '18, 13:43
@buda It is already stated in the editorial that the English version of the statement is wrong, and thus the editorial approach is too. My intended task was exactly the same as @acmonster's version of the statement mentioned in the editorial, but it seems there was a huge fault here from my side.
(19 Aug '18, 19:45)

https://www.codechef.com/viewsolution/19654812 Can anyone please help me? My solution passed all the test cases except the last one, which is giving WA. @include_sajal my solution was also giving TLE and WA at first. I got rid of the TLE but there is still one WA. Could you please tell me your approach for handling those test cases? answered 13 Aug '18, 16:44
Please tell me your B array for the input
(14 Aug '18, 17:27)
at what value of k
(15 Aug '18, 03:38)
Oh yeah, sorry, for k=9.
(16 Aug '18, 01:04)

hrishabh0901, instead of iterating over increasing values of K, you can do a binary search, since with increasing K your sum would also be non-decreasing. answered 14 Aug '18, 11:24
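A minimal sketch of that binary search, assuming the minimum sum really is non-decreasing in K (other answers in this thread dispute this for the literal statement, so treat it as valid only under the intended one). `count_valid_k` and the callback `min_sum_for_k` are hypothetical names; the callback stands in for whatever routine computes the sum of the K-compressed sequence.

```python
def count_valid_k(n, s, min_sum_for_k):
    """Count K in [0, n] with min_sum_for_k(K) <= s.

    Assumes min_sum_for_k is non-decreasing in K, so the valid values
    of K form a prefix 0..best and the answer is best + 1.
    """
    if min_sum_for_k(0) > s:
        return 0  # even K = 0 is too expensive
    lo, hi = 0, n  # invariant: min_sum_for_k(lo) <= s
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so that lo = mid makes progress
        if min_sum_for_k(mid) <= s:
            lo = mid
        else:
            hi = mid - 1
    return lo + 1


# Toy monotone cost, used only to exercise the search.
def toy_cost(k):
    return 5 + 2 * k
```

With the toy cost, `count_valid_k(5, 9, toy_cost)` returns 3, because K = 0, 1, 2 give sums 5, 7, 9. The search evaluates the cost O(log N) times instead of N + 1 times.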

Can anyone please explain the fault in my code. I coded as per the editorial, using only the greedy approach (No Segment Tree) and getting AC on 6 subtasks out of 10 but no complete AC in any task. @likecs it would be great if you could kindly help out. answered 14 Aug '18, 16:39
There is a flaw in calculating colliding interval. You have considered that it can collide only once but there can be multiple places where it might collide. Eg : [1, 1, 1, 1, 1] and k = 2. All ranges are colliding but you don't consider such cases. (Note, your solution will be correct for the case I provided, it is just to give you intuition where it might fail).
(19 Aug '18, 19:50)

Please help me find a counter test case for my code for the problem KCOMPRES. Only two tasks are remaining. My solution uses the segment tree approach. My Solution answered 14 Aug '18, 17:37
You might want to change your custom comparator function. Particularly, Google more to find out about it.
(15 Aug '18, 00:06)
I changed it, but it didn't work :( Also, thank you for the time you gave to it.
(15 Aug '18, 16:04)

can someone explain the K-compress part of the question...thanks in advance "for each valid $i$, if there are exactly $X$ numbers smaller than or equal to $A_i$ in the subsequence $A_{\max(1,i-K)}, \dots, A_{\min(N,i+K)}$, then there must be exactly $X$ numbers smaller than or equal to $B_i$ in the subsequence $B_{\max(1,i-K)}, \dots, B_{\min(N,i+K)}$." answered 14 Aug '18, 22:52

@acmonster I didn't get what you are trying to say...we want to find the K-compressed sequence with the lowest sum, so the 1-compressed sequence with minimum sum for {2 1 2 3 4} is {1 1 1 1 1} with a minimum sum of 5, while the 2-compressed sequence with minimum sum is {3 1 2 2 3} with sum 11. Why are you comparing any random K-compressed sequence? answered 16 Aug '18, 20:18
@acmonster ohh sorry!... {1 1 1 1 1} is the 0-compressed sequence...yes, you are right, then we cannot use binary search.
(17 Aug '18, 07:20)

Hi @byomkeshbakshy, you might want to check that the 1-compressed sequence with minimum sum for (2, 1, 2, 3, 4) is (2, 1, 2, 3, 4) itself, not (1, 1, 1, 1, 1). This shows that the minimum sum of the K-compressed sequence is not necessarily non-decreasing in K. answered 17 Aug '18, 06:28

@likecs Please help me with this solution... Which test cases might it be failing on? https://www.codechef.com/viewsolution/19759987 answered 17 Aug '18, 11:27

FWIW, I also thought binary search would be useful here upon seeing the problem, but then wrote a brute-force solver and found counterexamples. One's intuition can be misleading. answered 17 Aug '18, 22:21
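A brute-force checker of this kind could look roughly like the sketch below. It is not the poster's solver: it assumes the literal "smaller than or equal to" condition quoted elsewhere in this thread, assumes that B consists of positive integers, and only tries values in 1..N (which is enough for the minimum, since replacing the values of B by their ranks never increases the sum). The function names are invented for the illustration, and the search is exponential, so it is only usable for tiny N.

```python
from itertools import product


def count_leq(seq, lo, hi, x):
    """Number of elements <= x in seq[lo..hi] (inclusive, 0-based)."""
    return sum(1 for j in range(lo, hi + 1) if seq[j] <= x)


def is_k_compressed(a, b, k):
    """Check the literal window-count condition for every index."""
    n = len(a)
    for i in range(n):
        lo, hi = max(0, i - k), min(n - 1, i + k)
        if count_leq(a, lo, hi, a[i]) != count_leq(b, lo, hi, b[i]):
            return False
    return True


def brute_force_min_sum(a, k):
    """Smallest sum over all valid B with entries in 1..N (exponential time)."""
    n = len(a)
    best = None
    for b in product(range(1, n + 1), repeat=n):
        if is_k_compressed(a, b, k):
            total = sum(b)
            if best is None or total < best:
                best = total
    return best
```

For A = (2, 1, 2, 3, 4), this gives a minimum of 12 for K = 1 and 11 for K = 2, which matches the counterexample posted above: under the literal statement the minimum sum is not monotone in K.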

In the editorial, it is written to check whether the ranges of the identical elements coincide with each other, but it does not give the criteria for coinciding ranges. Example: For a = [2, 3, 4, 3] and k = 1, the range of the 3 at index 2 (assuming 1-based indexing) is [1, 3] and that for the 3 at index 4 is [3, 4]. But both of these can be mapped to different values. So, to check whether the ranges for 2 indices i and j (i < j) for a given value of k overlap or not, we need to check j - i <= k, and not i + k <= j - k. In the example above, the optimal compression for k = 1 is [1, 2, 3, 1], while using the wrong condition it comes out to be [1, 2, 3, 2]. My solution with the wrong condition passed all but the last test file. answered 18 Aug '18, 12:09
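The difference between the two criteria can be shown with a tiny sketch. Both helper names are hypothetical: `within_distance` is the test this answer says is needed (each index lies inside the other's window), while `windows_overlap` is the looser test that merely asks whether the two windows share an index.

```python
def within_distance(i, j, k):
    """True when each index lies inside the other's window: |i - j| <= k."""
    return abs(i - j) <= k


def windows_overlap(i, j, k):
    """True when [i-k, i+k] and [j-k, j+k] share at least one index."""
    lo, hi = min(i, j), max(i, j)
    return hi - k <= lo + k


# The two equal values from the example sit at 1-based indices 2 and 4.
print(within_distance(2, 4, 1))  # False: they may get different values
print(windows_overlap(2, 4, 1))  # True: windows [1, 3] and [3, 5] share index 3
```

For the example with k = 1, the windows touch at index 3, so the looser test would merge the two 3s even though neither one sees the other.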

@admin, @likecs, is anything going to happen here? I do not see how this:
can be derived from the problem statement... Moreover, I bet that during AUG18 most people who submitted the greedy algorithm and got AC didn't question their own solution (they got AC, you know) and just moved on to the next problems. I picked and tested some of the AC solutions from the top 30 AUG18 finishers; they all look wrong and fail on these 'counterexamples'. You can easily check it yourselves using the following input:
and if you get following response from any solution:
it is a wrong answer. I don't see how this whole situation is OK. Besides, @acmonster and @buda have already pointed out flaws in the editorial solution. I also developed the greedy algorithm at first during the contest, but on the first WA (a bug in the implementation), after more thorough thinking, I discovered similar 'counterexamples' myself. That obviously made the problem harder, so I 'postponed' it (and it turned out I never returned to it during the rest of the contest). IMHO, either this problem should be removed from every contestant's score or the entire AUG18 Long should be unrated, because it feels like a bad precedent. Am I missing any specific rules for such cases? PS: how many people usually verify problem statements and solutions? Looks like this was missed even by the 'tester'. answered 19 Aug '18, 18:55
@koyaaniqatsi, you are correct. I have already mentioned in the editorial that the test data for the small subtasks was weak and the English statement was horrible. About 4 different people tested their solutions during the preparation, but everyone had the same logic and everything looked OK then. Regarding the question being removed or the contest being unrated, please contact @admin or @mgch.
(19 Aug '18, 19:42)

@acmonster you are correct that the current statement doesn't admit the proof for binary search. Even the method for finding the K-compressed array will be wrong. I couldn't frame the English statement correctly based on my idea for the question. As per your comments (which are posted in the editorial), the correct statement should have been: "B should preserve all the relative sizes for each index pair (i, j) such that |i - j| <= K. In other words, if A[i] < A[j], we should have B[i] < B[j]; if A[i] = A[j], we should have B[i] = B[j], etc" If this were the statement, the editorial is exactly in line with what you said. I apologise if someone faced a similar situation during the contest. answered 19 Aug '18, 19:37

I implemented a brute-force solution during the contest which was almost the same as the basic idea mentioned in the editorial, i.e. updating the number at a given index and finding the min in a given range. I was targeting the first two subtasks at that moment. However, even after spending quite a bit of time on it, it gave me wrong answers on some test files. I am posting my last solution here. Any help will be appreciated. Thanks. Here's my submission: https://www.codechef.com/viewsolution/19654286 answered 23 Aug '18, 01:20

can anyone explain what's wrong with the following code? I have checked the code many times but I could not figure out what causes the SIGSEGV error. link to my solution: https://www.codechef.com/viewsolution/19877450 answered 26 Aug '18, 08:26

Can anyone please explain the fault in my code... I have given this question a lot of time, still no improvement. answered 26 Aug '18, 22:12

@likecs could you help?? Can anybody plz tell why I am getting wrong answer in the very last test case ??? I am stuck at it for too long !! Any help will be appreciated .... Thanks in advance ! answered 02 Oct '18, 03:35

That thing with balancing the similar numbers took too much time for me to solve finally and come above all those TLEs and WAs, because of some strong test cases in the last subtask. But, really a great question in the first place. Worth it!