Past kickstart problem

I saw a few submissions at the top of the leaderboard and I don’t think they are using divide and conquer. My solution is quite similar to these and the only things used are prefix sums and binary search.

Let us call the input array A and let S_{i,j} = \sum_{k=i}^j A_k. Now consider all the subarray sums:
S_{1,1}, S_{1,2}, S_{1,3}, \cdots , S_{1,N} \\ S_{2,2}, S_{2,3}, \cdots, S_{2,N} \\ \vdots \\ S_{N-1,N-1}, S_{N-1,N} \\ S_{N,N}

When arranged in this particular order, observe that each row is a sorted sequence (there are no negative numbers in A).

How does this help? If you construct the prefix sum array you can access all these sums in constant time, so you effectively have these N sorted sequences ready for use.
Now given a value x you can binary search on these N sequences to count how many elements in each sequence are \le x. If you add these counts up, you can get the total number of sums S_{i,j} which are \le x.

Next you can binary search on x to find the element at index k in the sorted sequence of all S_{i,j}.
And if you know this element at k, you can also find the sum of all S_{i,j} which are \le this element using a prefix sum array of the initial prefix sum array.

Now the final step is just to compute this sum for k=R and k=L-1 separately and print the difference. The total complexity is \mathcal{O}(Q \cdot N \cdot \log{N} \cdot \log{100N}), where the 100 is the limit on the maximum value of an element of A.

3 Likes

Since the sum of every subarray is just the difference of two prefix sums, we can create two polynomials corresponding to the prefix sums and their negative. Let the prefix sums of the array be s1,s2,…sn.

P1 = x^s1 + x^s2 + … + x^sn

P2 = 1 + x^(-s1) + x^(-s2) + … + x^(-sn) (here,the term 1 is for the empty prefix)

The coefficient of x^i in P1*P2 will be the number of subarrays having sum ‘i’. However, since we can’t deal with negative powers, we will multiply p2 with x^S (S>=sn). Now, the coefficient of x^i will represent the number of subarrays having sum i-S. We will only take the coefficients corresponding to positive values ie. i-S >0 or i>S. We can store these counts in an array and binary search for each query.

The complexity of this solution will be O(AlogA + QLogA) where A = 100*N.

5 Likes

Can you provide a link to any solution that uses divide and conquer? I know a way to solve this but in a different way.

@meoooow If possible can u explain your method?

@meooow u can see the solution in the rank list !

Thanks @vivek_1998299 for the explanation i dont understand the fft part “Do their multiplication,the coefficient represents the occurrence.(just keep a global frequecy map and update accordingly)”…

See the possible sums should be (5+3),(5+9),(4+3),(4+9)

Now (x^5 + x^4)*(x^3 + x^9)= x^(5+3) + x^(5+9) + x^(4+3) + x^(4+9)

As u can see coefficents of multiplication is 1 for all terms,ie (5+3),(5+9),(4+3),(4+9) occured once.
(The coefficient of x^y represents the number of occurence of y)

Now this multiplication can done faster using fft.

@vivek_1998299 but after creating N*(N+1) its sorted will this work for sorted array(N*(N+1)) too?

Yeah ,the whole reason for doing this divide and conquer and fft was to create a frequency map of sums of n*(n+1)/2 subarrays.If u see then only n*100 values can be possible.After constructing frequency map We can then easily answer any query.

I mean lets say frequncy map is:

1 - 2 times
3 - 1 time

I want [1,3] which is 1 * 2 + 3 * 1 ,(i mean O(n * 100))

(Or rather preprocess to answer in O(logn) )

@vivek_1998299 can you explain your approach a bit more? Also if you can share some working code that would be great.

Ohk,so put forward simply,first we calculate sum of all subarrays[l,r] whose l lies in left ,r lies in right part there are about n^2/4 such subarrays.Now the only subaarays that remain are the one whose [l,r] both lies in left part ,or both lies in right part.So we could do this recursively over left and right part.And for a particular level of recursion tree the complexity is O(nlogn) (for fft merging),there are logN levels so total complexity O(nlogn^2)

The recurrence relation is T(n)=2T(n/2)+O(nlogn)

And if you know this element at k, you can also find the sum of all Si,j which are ≤ this element using a prefix sum array of the initial prefix sum array

Can you explain this further?

That is a neat solution, probably the intended one.

Let the prefix sum array of A be P, and the prefix sum array of P be Q. And you want to find sum of all S_{i,j} \le x.
For the i^{th} of the N sequences above, binary search to find the last index j such that S_{i,j} = x. Then \sum_{k=i}^j S_{i,k} = Q_j - Q_{i-1} - P_{i-1} \cdot (j - i + 1). It should make sense somewhat intuitively, but if it doesn’t you can write each term as a summation of elements of P and the math will work out :smiley:

Or maybe not… you can handle much larger number of queries with this approach.

Probably a much better solution than intended :slight_smile:

Yeah that makes sense. I initially thought you meant that we can do it using a linear scan or something. silly me -_-;

@meooow’s solution with two pointers instead of binary search (for counting subarrays with sum <=x) will be faster, for Q=20. That might have been the intended solution.