ARRAYSUM - Editorial

pushkarmishra · March 24, 2016, 4:03pm

PROBLEM LINK:

Practice
Contest

Author: Sergey Kulik
Tester: Vlad Mosko
Editorialist: Pushkar Mishra

DIFFICULTY:

Medium-Hard

PREREQUISITES:

DP, Data Structures

PROBLEM:

Given two arrays A and B of lengths N and M (M \leq N) respectively, output the number of subsequences C of A which are of length M such that C[i]+B[i] for i = 1..M forms a non-decreasing subsequence. Note that all arrays are 1-indexed.

EXPLANATION:

Subtask 1
For this case, we are given that N = 50 and M = 5, we can iterate over all possible subsequences of A of length 5 and check which ones form a non-decreasing sequence.

The complexity of this approach is definitely exponential. We can iterate over all the possible subsequences of A of length 5 using recursion. Since, the number of 5 length subsequences can be {N choose M} (counting the number of combinations, the maximum number of operations required would be {50 \choose 5} = 2118760. This will definitely run under the given time constraints.

To avoid recursion, we can simply take an array of length N, fill it up with M 1s and rest 0s, and cycle through its unique permutations using either an inbuilt function like next\_permutation in C++, or a custom function. Wherever 1s appear in a permutation, those elements can be taken as C. We can then check for each permutation whether C[i]+B[i] forms a non-decreasing sequence or not, while keeping a count.

Complexity of this approach is \mathcal{O}(N!).

Subtask 2
For this subtask, we have that N can be as large as 500 and M can be as large as 50. Clearly, the previous approach will not work.

Most of the counting problems can be solved using Dynamic Programming. Below, we formulate the DP.

Let us say that DP[i][j] denotes the number of possible non-decreasing subsequences such that A[i] is taken as the j^{th} number of C. Then we can formulate the recurrence in the following manner:

The base case of the recurrence is clearly DP[0][0] = 1.

DP[i][j] = summation of DP[k][j-1] for all k = 0 to i-1 such that A[k] + B[j-1] \leq A[i] + B[j]. Note that A[0] and B[0] are set to arbitrary negative values so that they are definitely less than anything else in the recurrence.

The final answer is summation of DP[k][M] for k = 1 to N.

The pseudocode for the DP is as follows:

//Setting base conditions.
DP[0][0] = 1;
A[0] = B[0] = -inf;

for i = 1 to N
{   
    for j = 1 to M
    {
    	DP[i][j] = 0;

    	for k = 0 to i-1
    	{
    		//Checking for non-decreasing criteria
    		//and accumulating.
    		if(A[k] + B[j-1] <= A[i] + B[j])
    			DP[i][j] = DP[i][j] + DP[k][j-1];
    	}
    }
}

final_answer = 0;
for i = 1 to N
{
	final_answer = final_answer + DP[i][M];
}

output final_answer;

The complexity of this program is clearly \mathcal{O}(N^2M). For the given constraints, this works well.

Subtask 3
In this subtask, N increases to 2000 and M to 1000. We can’t do the same dynamic programming as before. We need to somehow optimise the \mathcal{O}(N^2M). The step that we can optimise is summing the DP[k][j-1] for all k = 0 to i-1 such that A[k] + B[j-1] \leq A[i] + B[j]. How can we optimise this?

Segment trees! We can keep M segment trees. These segment trees will be used for adding values at particular points in the range 1..2*10^9 and performing range sum queries. The j^{th} segment tree adds the value DP[i][j] to the leaf for the point (A[i] + B[j]). But since (A[i]+B[j]) can be as large as 2*10^9, we can either do a coordinate compression or construct dynamic segment trees. Editorialist’s program uses dynamic segment trees.

Here is a pseudocode for the same:

//Setting base conditions.
A[0] = B[0] = -inf;
DP[0][0] = 1;

//We update the 0th segment tree too.
//The first argument states the tree;
//the second argument gives the point at which
//we need to update. Third argument gives the value
//to be added.
update(0th tree, 0, DP[0][0]);

for i = 1 to N
{   
    for j = M down to 1
    {
    	//first argument gives the tree,
    	//second and third specify the range.
		DP[i][j] = query((j-1)th tree, 0, A[i] + B[j]);

		//Now, we update the jth tree with the value.
		update(jth, A[i]+B[j], DP[i][j]);
	}
}

final_answer = 0;
for i = 1 to N
{
	final_answer = final_answer + DP[i][M];
}

output final_answer;

We have reduced one factor of N down to \log (A_{max}+B_{max}). Therefore, the complexity for this algorithm is \mathcal{O}(NM \log A_{max}).

Editorialist’s program follows the editorial. Please see for implementation details.

Setter uses a different approach to this problem. Setter’s solution uses DP, but a slightly different recurrence. That recurrence allows for the use of Binary Indexed Trees to be employed, thus allowing for a lesser constant of complexity.

OPTIMAL COMPLEXITY:

\mathcal{O}(NM \log A_{max})

SAMPLE SOLUTIONS:

Author
Tester
Editorialist

sanket407 · March 27, 2016, 11:14am

""The final answer is summation of DP[k][M] for k=1 to M “”
There is typo It should be 1 to N instead of 1 to M

prudvinit · March 27, 2016, 12:03pm

Can someone tell me in detail how to optimize the innermost loop, i.e sigma(dp[k][i-1]) satisfying the condition . How does segment tree helps in the optimization of this?

sanket407 · March 27, 2016, 5:28pm

@pushkarmishra How are we supposed to do coordinate compression ? I am RE NZEC if i create M seg trees each of 4*n*m nodes(for n*m a[i]+b[j]'s) .I am guessing out of memory error is occuring . .
My soln : CodeChef: Practical coding for everyone
Note that i have only stored distinct a[i]+b[j] values in seg trees

nibnalin · March 26, 2016, 10:39pm

It is full of "Math processing error"s, please correct Latex

nibnalin · March 26, 2016, 10:43pm

In the pseudocode of subtask 3, we make m segment trees, and as after coordinate compression, unique values can be nm, wouldn’t this lead to memory complexity of O(n*m^2)(nm^2)? That would be 2.10^9, how can that work?

pushkarmishra · March 26, 2016, 10:59pm

Author has used coordinate compression. requires some thinking. its just implementation detail by which you can cut on memory. please see author’s solution if/when available.

pushkarmishra · March 27, 2016, 7:02pm

Corrected! Thanks

nibnalin · March 27, 2016, 7:05pm

Yes, this is what I was talking about in the comment I made, this is because of memory overflow, and to fix it is a neat way of processing the values offline. The basic idea is to process them in decreasing value of m, and then sort the array A by pair of (A[i], i) and find lower bound in the previous BIT, to get memory complexity of O(n\cdot m)

pushkarmishra · March 28, 2016, 11:39am

The author’s solution uses coordinate compression. He doesn’t simply implement coordinate compression in the above algorithm since it will lead to memory shortage. Instead for each element B[i] in B, he does a coordinate compression of B[i] + A[k] for k = 1 to N. This way, he only compresses N elements each time. The recurrence has to be slightly modified for this method. The new recurrence is clearly stated in the author’s solution. Please see for details.

sanket407 · March 30, 2016, 6:49pm

Thanks.
Got it from authors soln