Problem Link
Author: Bhuvnesh Jain
Tester: Mark Mikhno
Editorialist: Bhuvnesh Jain
Difficulty
MEDIUM
Prerequisites
Binary Search, Greedy, Segment trees.
Problem
You are given a sequence of integers A_1, A_2, \dots, A_N. For an integer K, let’s define a K-compressed sequence B_1, B_2, \dots, B_N as follows:
- for each valid i, B_i is a positive integer.
- for each valid i, if there are exactly X numbers smaller than or equal to A_i in the subsequence A_{\mathop{max}(1, i-K)}, \dots, A_{\mathop{min}(N, i+K)}, then there must be exactly X numbers smaller than or equal to B_i in the subsequence B_{\mathop{max}(1, i-K)}, \dots, B_{\mathop{min}(N, i+K)}.
- B_1 + B_2 + \dots + B_N is minimum possible.
For a given integer S, find the number of values of K (0 \le K \le N) such that the sum of elements of the K-compressed sequence does not exceed S.
Explanation
Let us first understand how to construct a K-compressed array B for given array A and value K. For this, we will iterate over the numbers in increasing order and try to assign the smallest number not yet assigned to any number smaller than it until now. This will be optimal as we try to assign the smallest possible number to each number and iterating in increasing order ensures that if the number assigned to a given number can’t be decreased further, the sum can’t be minimised further. This proves our greedy strategy. Let me explain it through an example, where A = [4 2 8 1 4 3 8 1] and K = 3.
Let the current array B be [-, -, -, -, -, -, -, -]. We now group the numbers having the same value and iterate in increasing order. The following are the values of the array B after each step:
-
[-, -, -, 1, -, -, -, 1]. Since both 1 are the first number to be considered, we assign them the smallest possible number in B, i.e. 1.
-
[-, 2, -, 1, -, -, -, 1]. Since, index 2 has number 1 less than it in range [max(1, 2-3), min(8, 2+3)] = min[1, 5]. We assign it the next biggest number.
-
[-, 2, -, 1, -, 2, -, 1]. Since, index 6 has number 1 less than it in range [3, 8], we assign it next highest number not present in this range i.e. 2. Note that though A[2] < A[6], we can still have B[2] = B[6] as there ranges do not coincide to give a conflict.
-
[3, 2, -, 1, 3, 2, -, 1]. For both index 1 and 5, we have it as the third largest number in their respective range. So, we assign them the next biggest number which was not used.
-
[3, 2, 4, 1, 3, 2, 4, 1]. This is similar to the above process.
Thus, the minimum possible sum is (3+2+4+1+3+2+4+1) = 20. Make sure you are clear with the idea before you proceed further.
Now there are some issues which might occur while implementing the above approach. The first thing is that we should deal with all the numbers having the same value together should be dealt together instead of simply iterating in increasing order. One simple counterexample for this is the array A = [10, 30, 30, 20, 10] and K = 1. The compressed array should be B = [1, 3, 3, 2, 1] itself but if we simply iterate over numbers in increasing order, we might end up getting array B as [1, 2, 3, 2, 1] or [1, 2, 2, 2, 1] which is again highly dependent on your implementation.
The correct logic is to first group the numbers by their values. Find what value you might end up giving them based on their ranges. Then, we need to be sure that the value we might give is correct or now. For this, we again iterate over the numbers and group them if their ranges coincide with each other. We assign it the all the numbers in the group the largest number we thought of assigning to any number in the range. For the example A = [10, 30, 30, 20, 10] and K = 1, the process is as follows:
-
The initial array is [-, -, -, -, -].
-
[1, -, -, -, 1]. We group 10 first. The ranges for them are [1, 2] and [4, 5]. Since they do not conflict, we assign them separately the minimum number to them. Both of them end up getting 1.
-
[1, -, -, 2, 1]. Since there is only one 20, we simply assign it the smallest number in the range [3, 5] which is not assigned yet, i.e. 2.
-
[1, 3, 3, 2, 1]. The ranges for 30 are [1, 3] and [2, 4]. Since they both coincide, we will assign them value together. The value we might end up giving A[2] is 2 since it is the smallest which is not used till now in the range [1, 3] in B. The value we might end up giving A[3] is 3, which is the smallest not used till now in the range [2, 4] for B. Since the ranges coincide, we should give each of them 3, the maximum of the number we might think of assigning them.
Thus, we can easily build the K-compressed algorithm using the above ideas. But how fast can we do it?
Doing it naively will take O(N^2) as it will require you to find the smallest number not assigned yet in a range, taking O(N) for this step alone. But if you restate this problem, it is similar to the following 2 operations:
- Update the number at given index.
- Find the largest number in a range.
This is a very familiar problem which can be solved using segment trees in O(\log{N}) for each operation. You can read about it here.
Thus, we can built a K-compressed array for given array A and K in O(N * \log{N}) complexity.
To find what possible values of K will lead to compressed array having sum less than S, we can simply iterate over all possible values of K and update the answer. This approach has a complexity of O(N * N * \log{N}) which is enough to pass the first 2 subtasks.
The next thing to note that we can binary search on the answer. To prove this, we need to prove that the minimum sum for K-compressed array does not decrease with increasing K. This again relies on the way we build our K-compressed array using a greeedy algorithm. Since we make sure that each number is compressed to the smallest possible value and the sum can’t decrease with increasing K as the smallest number which might get assigned to a number can only increase if it’s range (or window) increases.
Thus, the overall complexity of the approach reduces to O(N * {\log}^2{N}). This is enough to pass all the subtasks as well.
Once, you are clear with the above idea, you can see the author implementation below for help.
Feel free to share your approach as well, if it was somewhat different.
Note from Author:
The test case data was bit weak for the small subtasks where some wrong greedy approaches passed as well. Though the large test case ensured, wrong solutions could not pass for the full solution, but still, I failed in generating stronger test data for smaller ones. This is not completely our fault as there are many ways the greedy solution for this compression could be written where even index errors could happen which implementing. I would like to thank all the people who testing their solutions as well and helped me strengthen the test as well (Special thanks to Stepan). But still, we alone cannot come up with all possible greedy solution which one might write, so stronger test could not be prepared by me. By the time, I came to know about this it was already to late for a rejudge, but I will take care of it in the future problem.
Also, the problem statement seemed quite tough to comprehend for most of the contestants as well. User acmonster even pointed out a flaw in the English statement as well. Below were his comments:
“I guess that the problem (and test data) actually requires that sequence B should preserve all the relative sizes for each index pair (i, j) such that |i - j| <= K. In other words, if A[i] < A[j], we should have B[i] < B[j]; if A[i] = A[j], we should have B[i] = B[j], etc.
If this is correct, the fix does not seem to be sufficient. Consider A = (1, 1, 1, 2, 3), B = (2, 1, 1, 1, 2), and K = 2: both sequences has a “signature” of (2, 2, 2, 2, 2), but, again, the optimal sequence B does not preserve A[1] = A[2] = A[3] < A[4] < A[5].”
Though most of the contestants got what the problem statement meant, I will make sure to make better statements in future as well.
Time Complexity
O(N * {\log}^2{N})
Space Complexity
O(N)
SOLUTIONS:
Author’s solution can be found here.
Tester’s solution can be found here.
Editorialist’s solution can be found here.