FARASA - Editorial

PROBLEM LINK

Practice
Contest

DIFFICULTY

HARD

PREREQUISITES

Fast Fourier Transform

PROBLEM

You are given an array of N positive integers. There are N+1C2 sub-arrays of this array. Find the number of unique sums possible among all the possible sub-arrays.

QUICK EXPLANATION

Although the problem statement isn’t as direct as the problem description used above, it is very easy to come to the conclusion that this is indeed what the problem to be solved is.

The limits to the problem were given rather innovatively as

1 ≤ N * S ≤ 4 * 1010

Where S is the sum of all the numbers.

Considering all the numbers are positive, the real limit on N is hence

1 ≤ N ≤ 200,000

We note that S can vary greatly as N varies and hence we cannot solve this problem for all values of N under the same algorithm. We break our solution into 3 different methods meant to solve the problem for different values of N.

EXPLANATION

1 ≤ N ≤ 2,000

For this limit of N

  • The number of sub-arrays to be considered are at most 2,001,000
  • The value of the sums can be at most 40,000,000,000 (for N = 1)

We can generate all the sums and eliminate the duplicates. We cannot eliminate them using a lookup table because of the possibly large value for the sums.

The complexity of this procedure would be O(N2 log N). The complexity is driven by the procedure to sort before we eliminate the duplicates.

sums = []

for i = 1 to N
    value = 0
    for j = i to N
        value += A[j]
        sums.push(value)

sort(sums)
eliminate-duplicates(sums)

2,000 < N ≤ 20,000

For this limit of N

  • The number of sub-arrays to be considered are at most 200,010,000
  • The value of the sums can be at most 20,000,000 (for N = 2000)

We can still generate all the sums and eliminate the duplicates. Note that sorting to eliminate the duplicates can be quite expensive since we are potentially considering over hundred million values. Fortunately, the value of sums has come down within the range to maintain a lookup table!

sums[] = { 0 }

for i = 1 to N
    value = 0
    for j = i to N
        value += A[j]
        sums[value] = 1

answer = 0
for i = 1 to MAX_SUM
    answer += sums[i]

The lookup table helped us avoid the cost of the sorting step to eliminate duplicates. We still relied upon our ability to iterate over every possible sub-array’s sum. The complexity of this procedure is O(N2 + S).

20,000 < N ≤ 200,000

For this value of N

  • The number of sub-arrays to be considered are at most 20,000,100,000
  • The value of the sums can be at most 2,000,000 (for N = 20000)

We can no longer iterate over each sub-array because there could be too many of them. But we note that the value of the maximum possible sum is now very manageable. We can use this to our advantage by trying to base our approach completely over the possible values for sums.

Let us consider an array of sums of prefixes

S[i] = sum( A[x] : 1 ≤ x ≤ i )

We wish to find all the possible values generated by

S[j] - S[i]

for each 1 ≤ j ≤ N AND 1 ≤ i < j. This convolution can be generated by the product of the following two polynomials

X = x0 + xS[1] + xS[2] + ... xS[N]
Y = x0 + x-S[1] + x-S[2] + ... x-S[N]

The number of terms in the product of X and Y, which have non-zero coefficient and positive power, is the answer we are looking for. To only deal with positive powers in our problem, we can multiply Y with xS[N] and check the coefficients of all the terms xS[N]+1 and above in th resultant polynomial.

The fastest way to multiply two polynomials is using FFT. We can multiply the two polynomials as follows.

fX = FFT(X)
fY = FFT(Y)

fT = fX . fY (dot product)

T = iFFT(fT)

Thus, we can solve this case in O(N + S log S) time.

SETTER’S SOLUTION

Can be found here.

TESTER’S SOLUTION

Can be found here.

6 Likes

Nice FFT-based solution. After reading the editorial I googled a bit and it seems that the main idea is also presented in this paper http://www.sciencedirect.com/science/article/pii/S0166218X06003672 Thus, there was a chance that someone could have found the solution idea on the Internet, which I don’t think is desirable, particularly for a problem which seems to have been intended to be a HARD problem. But, anyway, my complaint would be that the test cases were not strong enough (i.e. they did not fail all the inefficient solutions). I will just give as an example my own solution (and I’ve seen a few others based on similar ideas). Let’s say that we are in case 3 presented in the editorial, where S is at most 2000000. In my solution I considered every sum X from 1 to S and searched it in the array (starting from the first prefix sum which is greater than or equal to X) in O(N) time (until I either find X or until I reach the end of the array - I used the “two pointers” idea). This solution works very well when most of the S sums can be found as the sum of some subarray. However, this solution would be very slow on the following test case: N=20000 and all the values in the array are equal to 100. We get S=2000000, but only N out of the S sums appear as subarray sums (i.e. a very small fraction of the total number of possible sums). Of course, this particular test case (and other similar ones) could have been verified separately, but the point is that I think that test cases in which the answer is small compared to S and S and N are both sufficiently large are missing. Anyway, I’m not complaining for getting AC :slight_smile: (and once I got the AC there was no incentive to look for more efficient solutions) but I thought that this point was worth mentioning. I am wondering if anyone has other solutions with a guaranteed theoretical time complexity which is good enough to fit the time limit.

10 Likes

I think that you can skip FFT(Y) step since its coefficients are essentially conjugates of FFT(X)

3 Likes

Isnt the complexity O(N + N log N) because each polynomial has N terms.

The link to the paper is very interesting. I didn’t expect that.

The problem’s author initially intended an O(N*S / 32) solution for this problem. I came up with a much more complicated solution dividing the string right at the middle, and using FFT to add up the sub-arrays that extend through the middle.

I noticed this simple and elegant solution (described in the editorial) in some submissions and hence chose to describe this solution instead :slight_smile:

As for the test cases, we kept trying to build cases where my solution would perform badly. If I knew about the approach with single polynomial multiplication earlier, I would have concentrated on other ways to solve :frowning: I will sneak in some test data for practice section though!

2 Likes

As mugurelionut I also haven’t figured out the FFT thing, even if I thought about multiplication similarity.
For me the second case described in the tutorial worked up to n <= 50000. For the larger testcases I noticed, that if there is a field of x ones near the beginning and after that there is no number greater than x + 1, than all numbers from 1 to the sum till the end of the array can be made. In all large testcases, it was the case that this field appeared very soon - we only need the check the rest of the field in quadratic time - AC!
testcases with repeated 1, 3 would brake it.

I implemented the described FFT algorithm for N >= 8000 in O(S log S), and for N < 8000 I used bucket sort to sort the subarray sums. Unfortunately though, my implementation wasn’t good enough (time ~11 sec). But it was fast enough to pass this problem. CodeChef: Practical coding for everyone

1 Like

Greetings.

I implemented the solution using FFT, but for some testcases that I’ve randomly generated, my code is running way below expectations (up to 10secs!!). The one thing that is bugging me is that my implementation should run in O(n log n) and is based on the implementation on https://cp-algorithms.com/, and I’ve no idea what’s going wrong. Can you please help me out a bit ?

Thanks in advance.

How to do the same question under big prime mod and sum of array can be very big .

Your code runs absolutely fine under 2 seconds.