COUNTARI - Editorial

gamabunta · November 12, 2012, 9:00pm

PROBLEM LINK

DIFFICULTY

HARD

PREREQUISITES

Dynamic Programming, Discrete Fourier Transform, Fast Fourier Transform

PROBLEM

You are given a list of integers between 1 and 30000, inclusive. You have to find triplets i < j < k in A[], such that A[i], A[j] and A[k] are in Arithmetic Progression.

QUICK EXPLANATION

The trivial algorithm to solve this problem is O(N³). This is of course too slow for N = 100,000.

Let us consider the O(N²) algorithm below to solve this problem. It will also be slow, but it gives us ideas that can lead to a solution.

prev[1 ... 30000] = {0} // we will maintain frequencies here
result = 0
for j = 1 to N                  // fix j
    for k = j+1 to N                // fix k
        Ai = 2*A[j] - A[k]              // the value at i
        result += prev[Ai]              // add the number of choices for i
    prev[ A[j] ]++

Of course this is too slow.

EXPLANATION

Let us consider an even slower algorithm. The reason to consider this algorithm, is because it introduces a very powerful idea to solve the problem. Once we understand how that works, we will talk about how we can solve the problem.

prev[1 ... 30000] = {0} // we will maintain frequencies here
next[1 ... 30000] = {0} // we will mataiain frequencies here too! This time,
                        // we maintain frequencies of occurance 'after' j
for v = 1 to N
    next[ A[v] ]++

for j = 1 to N
    next[ A[j] ]--
    cnt[1 ... 60000] = {0}
    for Ai = 1 to 30000
        for Ak = 1 to 30000
            cnt[Ai + Ak] += prev[Ai] * next[Ak]
    result += cnt[ 2*A[j] ]
    prev[ A[j] ]++

Notice how we maintain cnt. This is actually equivalent to multiplication of two polynomials!

Suppose we had two polynomials

P = p₁x + p₂x² + p₃x³ … + p₃₀₀₀₀x³⁰⁰⁰⁰
Q = q₁x + q₂x² + q₃x³ … + q₃₀₀₀₀x³⁰⁰⁰⁰

The product of these two polynomials R, can be calculated as follows

for i = 1 to 30000
    for j = 1 to 30000
        R[i+j] = P[i] * Q[j]

We can optimize the product of the two polynomials by using Discrete Fourier Transforms!

fP = DFT(P)
fQ = DFT(Q)
fR = {0}
for i = 1 to 2¹⁶
    fR[i] = fP[i] * fQ[i]
R = iDFT(fR)

Where, fR is the Discrete Fourier Transform of R, and iDFT is the inverse Discrete Fourier Transform method.

Discrete Fourier Transforms, and inverse Discrete Fourier Transforms of an array of size N can be calculated in O(N log N) time using the Fast Fourier Transform method! The link to the same has been provided in the PREREQUISITES section above.

Hence we can modify our O(N*30000²) algorithm into a O(N*30000*16) algorithm.

This, is still too slow to solve the problem. Now we introduce the idea that solves the problem.

Optimization using Blocks

Suppose we split the entire list into K blocks each of size N/K.

prev[] represents the frequency of the numbers in blocks before the current block.
next[] represents the frequency of the numbers in blocks after the current block.
inside[] represents the frequency of numbers within the current block.

We have three cases to consider while choosing triplets.

The entire triplet lies in the current block
- This can be calculated in O(N²/K²) time similar to the first O(N²) algorithm we described. We will be using the inside array for the frequencies.
Two numbers in the triplet lie in the current block
- This can also be calculated in O(N²/K²) time.
- We use the first O(N²) algorithm. We just use prev / next instead of inside.
Only one number lies in the current block
- We find the product of the two polynomials prev and next using FFT.
- This step takes O(30000*16) calculations.

This the overall complexity of the algorithm would be

O(K*30000*16 + N*N/K)

You can experiment and choose an optimal K. The tester’s solution chooses K = 30, the setter’s solution chooses K = 40.

This will fit within the time limit.

SETTERS SOLUTION

Can be found here.

TESTERS SOLUTION

Can be found here.

anton_lunyov · November 12, 2012, 11:26pm

Without the condition i < j < k the problem can be solved in O(N + V * log V) time, where V = 30000, using just one multiplication of polynomials. We simply should take the square of the polynomial with coefficients equal to cnt[x] which is the frequency of number x in the array.

It was clear for me immediately after reading the problem since this version has appeared several times on different online judges. But condition i < j < k stops me for a while. I was still believing that using FFT is possible (and it seems true as editorial shows :)). But after several days of useless tries to apply FFT here I just decided to submit some optimized version of naive O(N * V) approach that is obvious modification of O(N * V * V) draft solution from the editorial:

prev[1 ... 30000] = {0}
next[1 ... 30000] = {0}
for v = 1 to N
    next[ A[v] ]++

for j = 1 to N
    next[ A[j] ]--
    cnt = 0 // it is cnt[ 2 * A[j] ] from the editorial
    for Ai = 1 to min(30000, 2 * A[j] - 1)
       cnt += prev[Ai] * next[2 * A[j] - Ai]
    result += cnt
    prev[ A[j] ]++

But I was absolutely sure that this should get TLE. (Now I see that even this solution can pass: proof).

IMHO the main bottleneck is the multiplication operation in the loop. So I decided instead of calculating cnt each time, update the whole array cnt[] using only additions and subtractions. Indeed, when we made next[A[j]]– we should decrease cnt[(x+A[j])/2] by prev[x] for each x of the same parity as A[j] (follows from the line cnt[Ai + Ak] += prev[Ai] * next[Ak] in the editorial). The similar things happens when prev[A[j]]++ happens.

At first sight, I thought that values cnt[] do not fit in 32bit integer type and use long long for them and this cause TLE of course (I denote prev as pref, next as suf and cnt as conv (due to “convolution”) in my solution). Then I decide to check tests for weakness and use int. And this cause WA (so test data are not weak). Finally I have thought on the bound for cnt[x] more carefully and realized that the maximum value is 50000² which fits in 32bit unsigned integer type. Hence I change int to unsigned int and surprisingly get AC without any further optimizations! And it is really weird since on any test with N = 100000 my solution performs exactly 3 * 10⁹ not very trivial blocks of several atomic operations.

In fact, I see now (by examining all 37 solutions) that almost all contestants (UPD 27 out of 37 with one having FFT in code but not using it :)) solved the problem without FFT and what is very curious that almost all top solutions do not use FFT. Well, my solution appears to be the slowest among all C++ solutions

betlista · November 13, 2012, 2:05am

Hi all,

I’m lost in quick explanation already :-/

prev[1 ... 30000] = {0} // we will maintain frequencies here
result = 0
for j = 1 to N                  // fix j
    for k = j+1 to N                // fix k
        Ai = 2*A[j] - A[k]              // the value at i
        result += prev[Ai]              // add the number of choices for i
    prev[ A[j] ]++

how that works for A = { 1, 3, 5 } ?

j = 1
k = 2
Ai = 2*1 - 3 = -1 ?!?

and how is prev initialized?

kuruma · November 13, 2012, 4:49am

This solution is probably one of the best solutions I’ve ever read…

Thank you very much for the insights and tricks for our bag that every single long CC contest provides us…

Hopefully, not so late, I might be on the “other side” as well!!!

Fantastic community and awesome contests!!

PS: thanks to NOV12 I learned some new data structures that allowed me to get an AC solution during a live contest and even though I only solved 2 problems I can’t wait for the next contests

Thank you all,
Bruno

kuruma · November 13, 2012, 4:58am

Also, can someone provide me an explanation of the line:

Ai = 2*A[j] - A[k]

on the O(N²) solution?

I am missing something :((

wjmzbmr · November 13, 2012, 12:54pm

But many brute-force solutions pass this problem, for a simple operation like iterate through two array and found their dot product, if we use 16 point each pointer to a position 16*x+i, and let them do their work in a single loop. then the modern cpu can run it VERY fast. FFT’s constant is very huge so it even run slower then that kind of brute-force :(.

prasun72894 · November 6, 2018, 3:17pm

Hi ,

I wanted to know how in the below code the variable i is initialized / defined

prev[1 … 30000] = {0} // we will maintain frequencies here
result = 0
for j = 1 to N // fix j
for k = j+1 to N // fix k
Ai = 2*A[j] - A[k] // the value at i
result += prev[Ai] // add the number of choices for i
prev[ A[j] ]++

ayushpandia · November 12, 2018, 2:57pm

import java.io.*;
class A_P
{
public static void main(String args[])throws IOException
{
try{ BufferedReader br=new BufferedReader(new InputStreamReader(System.in));
int f,c=0,s,z,n=Integer.parseInt(br.readLine());
String s1=br.readLine();
String w[]=s1.split(" ");int a[]=new int[n];
for(int i=0;i<n;i++)
a[i]=Integer.parseInt(w[i]);

 for(int i=0;i<n-2;i++)
 {
     for(z=i+2;z<n;z++)
     {
         s=a[i]+a[z];
         if(s%2==0)
         {   f=pres(i+1,z-1,s/2,a);
             if(f!=0)
             c+=f;
          }
        }
    }
    System.out.println(c);
}catch(Exception e){return;}

}

public static int pres(int l,int u,int n,int a[])
{ int f=0;
for(int i=l;i<=u;i++)
{
if(a[i]==n)f++;
}
return f;
}
}

mukulgupta · November 12, 2012, 11:38pm

“@ kunalgaurav18 Because of the condition i < j < k. Well, for me this condition ruins very beautiful solution. Currently to handle this condition I have very bad solution that pass the TL which were very surprisingly for me.”

If it was not for your comment, I would never have tried my slow O(NM) at the first place. Thanks

anton_lunyov · November 13, 2012, 12:01am

Hm… I knew that I shouldn’t write this comment. Admins should bun me for ages because of it

betlista · November 13, 2012, 1:59am

@anton_lunyov You are telling that without i < j < k condition, you can find all triplets easily? First of all I have no idea how. And if I have such count C, isn’t answer to this question C/3! ?

gamabunta · November 13, 2012, 2:21am

Since prev[-1] = 0 (the index is out of bounds), the result will not be incremented.

When j = 2, and k = 3, Ai = 2*3 - 5 = 1

From the first loop prev[1] had been incremented.

Thus result += prev[1] makes the result 1.

anton_lunyov · November 13, 2012, 2:39am

@betlista There is no connection between this two counts at all.
For this test
3
2 1 3
counts are 0 and 1
For this one
3
2 1 2 3 1
counts are 1 and 4.

As for the part of how to find count efficiently, I wrote this explicitly in my answer. After you find the square of the mentioned polynomial you can find the answer in O(V) time.

betlista · November 13, 2012, 2:58pm

I wrote some test:

#include <cstdio>

int main() {
	int n = 4;
	int A[] = { 8, 10, 12, 8 };
	for ( int j = 0; j < n; ++j ) {
		for ( int k = j+1; k < n; ++k ) {
			printf( "Ai=%d\n", 2*A[j] - A[k] );
		}
	}
	return 0;
}

and the printed result is

Ai=6
Ai=4
Ai=8
Ai=8
Ai=12
Ai=16

and prev[8] is 2, prev[12] is 1 (hope I understood it well), so the result is 5, but it should be 1, right?

betlista · November 13, 2012, 3:01pm

I think there is some mistake too, but I have no idea what is the correct answer, probably someone will help, see thread in my question.

bcurcio · November 13, 2012, 3:42pm

We are looking for arithmetic progression, all terms differ in the same amount. How do you calculate Ai having Aj and Ak?

Ai -> Aj -> Ak

Aj+N=Ak
Ai+N=Aj
From 1. N=Ak-Aj
From 2. Ai=Aj-N
From 3 and 4. Ai=Aj-(Ak-Aj)=Aj-Ak+Aj=2 Aj - Ak

betlista · November 13, 2012, 4:07pm

That’s crystal clear, but how to work with prev array, see my question. You simply do not know if Ai is before or after Aj, cause you have value not an index…

betlista · November 13, 2012, 5:22pm

I understood it already. We increment prev when we skip over j, not in the biggining, my fault.