I’d like to share my journey with this problem (apologies but it’s very long).
During day 1, I started by trying to solve some easy problems first, and eventually found this one, which had no accepted solutions yet. Since the contest had just begun, I tried not to overthink it: the examples given were [1,1,1] and [2,-4,1,1], both of which involve the all-ones vector [1,1,1,…], so I guessed the answer must somehow always involve such a vector. I worked out a few more cases (such as [1,2,3,1,2,3] being annihilated by [1,0,0,-1,0,0]) and came up with my first (naive) solution. Here it is in pseudo-python (link to my submission here):
for every divisor d of n:
    w = (array of length d)
    for i from 0 to d - 1:
        w[i] = a[i] + a[i + d] + a[i + 2*d] + ...

# this is a subroutine
for every divisor d of n:
    # check if sums of subarrays of distance d are all zero
    all_zero = True
    for i from 0 to d - 1:
        s = a[i] + a[i + d] + a[i + 2*d] + ...
        if s is not 0:
            all_zero = False

for every proper divisor d of n:
    # check if subarrays of distance d have all equal elements
    all_eq = True
    for i from d to n - 1:
        if v[i] != v[i - d]:
            all_eq = False
This is an O(n log^2 n) algorithm. It somehow passed the initial dataset of the problem, even though it's wrong (the first counter-example I discovered is [1,-1,1,0,0,0], which can be annihilated by [1,1,0,-1,-1,0]). Luckily the dataset was fixed immediately (and the solution then got WA).
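To make "annihilates" concrete: taking it to mean that every cyclic dot product is zero (equivalently, the cyclic convolution of the two vectors is the zero vector, which matches all the examples above), a few lines of Python can confirm the counter-example. This is just an illustration, not part of any submission:

```python
# Illustrative helper (my notation): x annihilates a when the cyclic
# convolution of a and x is the zero vector, i.e. every cyclic dot
# product of a rotation of a with x vanishes.
def annihilates(a, x):
    n = len(a)
    return all(sum(a[i] * x[(k - i) % n] for i in range(n)) == 0
               for k in range(n))

print(annihilates([1, -1, 1, 0, 0, 0], [1, 1, 0, -1, -1, 0]))  # True
print(annihilates([1, 2, 3, 1, 2, 3], [1, 0, 0, -1, 0, 0]))    # True
```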
After actually thinking about the problem, it became clear that we needed to determine the singularity of a certain special matrix, each row of which is a single right-rotation of the previous row. After a brief search, I discovered that such a matrix is actually called a circulant matrix, and the Wikipedia page conveniently provided a few ways to determine singularity.
It was clear that if the matrix A is nonsingular, then the only solution to Ax = 0 is x = 0, while if A is singular, then Ax = 0 has a nonzero solution x with rational elements, and hence (after multiplying by a common denominator) an integral one. The existence of a nonzero rational x can be confirmed by running Gaussian elimination with total pivoting on Ax = 0 until we fail to find a pivot (this must happen because A is singular), at which point we assign the remaining free variables the value 1 and solve for the others; all intermediate and final values are clearly rational.
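That argument can be played out directly. A sketch (my own Fractions-based helper, not the contest code) that extracts an integral null-space vector from the circulant matrix of the counter-example above:

```python
from fractions import Fraction
from math import lcm

# Sketch of the argument: Gaussian elimination over the rationals finds a
# nonzero null-space vector of a singular matrix; scaling by the lcm of the
# denominators then makes it integral.
def integer_null_vector(rows):
    n = len(rows)
    m = [[Fraction(v) for v in row] for row in rows]
    pivot_col = []                # pivot column of each pivot row
    r = 0
    for c in range(n):
        p = next((i for i in range(r, n) if m[i][c] != 0), None)
        if p is None:
            continue              # no pivot: column c stays free
        m[r], m[p] = m[p], m[r]
        pivot_col.append(c)
        for i in range(n):        # eliminate column c from all other rows
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                for j in range(n):
                    m[i][j] -= f * m[r][j]
        r += 1
    free = [c for c in range(n) if c not in pivot_col]
    if not free:
        return None               # nonsingular: only the zero solution
    x = [Fraction(0)] * n
    x[free[0]] = Fraction(1)      # assign 1 to a free variable...
    for i, c in enumerate(pivot_col):   # ...and solve for the pivots
        x[c] = -sum(m[i][j] * x[j] for j in free) / m[i][c]
    scale = lcm(*(v.denominator for v in x))
    return [int(v * scale) for v in x]

a = [1, -1, 1, 0, 0, 0]           # the counter-example sequence from above
n = len(a)
A = [[a[(j - i) % n] for j in range(n)] for i in range(n)]  # circulant of a
x = integer_null_vector(A)
print(x, all(sum(A[i][j] * x[j] for j in range(n)) == 0 for i in range(n)))
```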
The first property I considered was the 'Rank'. Wikipedia says that the rank is n - D, where D is the degree of gcd(a(x), x^n - 1), with a(x) = a_0 + a_1·x + a_2·x^2 + … + a_{n-1}·x^{n-1}. So we only needed to check whether D is zero, i.e., whether gcd(a(x), x^n - 1) is a constant. I actually found a theoretically good algorithm for polynomial gcd, running in O(n log^2 n), based on the concept of "Half-gcd". It relies on using FFT for polynomial multiplication AND division (see this link for a description of how to reduce polynomial division and modulo to multiplication). However, after implementing it and playing around with it a bit, I discovered that it performs far too many multiplications and divisions to pass the time limit (it started getting slow even at n around 10000), so I scrapped this approach.
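For contrast, here is the classical quadratic Euclidean gcd over Z_P that Half-gcd accelerates, as a sketch with my own helper names (coefficient lists are lowest-degree first; P is an NTT-friendly prime):

```python
P = 2013265921   # an NTT-friendly prime

def poly_mod(a, b):
    # remainder of a(x) mod b(x) over Z_P; lists are lowest-degree first
    a = a[:]
    inv_lead = pow(b[-1], P - 2, P)
    for i in range(len(a) - 1, len(b) - 2, -1):
        f = a[i] * inv_lead % P
        if f:
            for j in range(len(b)):
                a[i - len(b) + 1 + j] = (a[i - len(b) + 1 + j] - f * b[j]) % P
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def poly_gcd(a, b):
    # classical Euclid: O(n^2) field operations; Half-gcd gets O(n log^2 n)
    while len(b) > 1 or b[0] != 0:
        a, b = b, poly_mod(a, b)
    return a

xn1 = [P - 1, 0, 0, 0, 0, 0, 1]       # x^6 - 1
apoly = [1, P - 1, 1]                 # a(x) = 1 - x + x^2, from [1,-1,1,0,0,0]
print(len(poly_gcd(xn1, apoly)) - 1)  # degree D of the gcd
```

For the counter-example sequence above the gcd is nontrivial (degree 2), which is exactly why it has an annihilator.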
Next, I factorized x^n - 1 into irreducible polynomials, which are exactly the cyclotomic polynomials Φ_d(x) (Φ_d(x) is the monic polynomial whose roots are the primitive d-th roots of unity): x^n - 1 is simply the product of Φ_d(x) over all divisors d of n. Since the cyclotomic polynomials are irreducible, D is nonzero if and only if one of these Φ_d(x) divides a(x).
This means I only had to do a polynomial modulo operation for every Φ_d(x). Among all n in [1, 30000], n = 27720 has the largest number of divisors, namely 96, so this solution runs in roughly 96·O(n log n) time. I was very excited about this solution because, unlike the HGCD approach, I already knew how to code all the parts, so I spent some time implementing it. But after submitting, the verdict was TLE: the constant was still too high, and this approach only works well when divisor_count(n) isn't too large.
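The Φ_d themselves are easy to generate from the identity x^n - 1 = Π_{d|n} Φ_d(x). A small integer-arithmetic sketch (schoolbook division, no NTT, so only for illustration) that finds which Φ_d divide a(x) for the counter-example above:

```python
from functools import lru_cache

def poly_divmod(a, b):
    # schoolbook division over Z with b monic; lists are lowest-degree first
    a, q = list(a), [0] * max(len(a) - len(b) + 1, 0)
    for i in range(len(a) - 1, len(b) - 2, -1):
        q[i - len(b) + 1] = f = a[i]
        for j in range(len(b)):
            a[i - len(b) + 1 + j] -= f * b[j]
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return q, a

@lru_cache(None)
def cyclotomic(d):
    # Φ_d(x) = (x^d - 1) / Π_{e|d, e<d} Φ_e(x)
    poly = [-1] + [0] * (d - 1) + [1]
    for e in range(1, d):
        if d % e == 0:
            poly, _ = poly_divmod(poly, cyclotomic(e))
    return tuple(poly)

apoly = [1, -1, 1]   # a(x) for [1,-1,1,0,0,0], trailing zeros dropped
divisors = [d for d in (1, 2, 3, 6)
            if poly_divmod(apoly, cyclotomic(d))[1] == [0]]
print(divisors)   # → [6], i.e. only Φ_6(x) = x^2 - x + 1 divides a(x)
```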
By the way, whenever I implement FFT, I always do it modulo a large prime p that has a primitive 2^k-th root of unity for a large enough k (say p = 2013265921, where 440564289 is a primitive 2^27-th root of unity). This lets me use 440564289 as my root of unity instead of a complex irrational, working entirely with integers and avoiding precision errors.
Of course, this means that the coefficients are reduced mod p. For this problem, however, I simply assumed that if A(x) divides B(x) for some polynomials A and B in Z_p[x], then A(x) also divides B(x) in Z[x]. This has a tiny chance of failing, which I assumed would not happen on the judge's data.
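Those constants are easy to sanity-check with modular exponentiation (nothing here beyond built-in three-argument pow):

```python
p = 2013265921        # 15 * 2^27 + 1
w = 440564289         # claimed primitive 2^27-th root of unity mod p

assert p == 15 * 2**27 + 1
assert pow(w, 1 << 27, p) == 1   # w^(2^27) = 1 ...
assert pow(w, 1 << 26, p) != 1   # ...but no smaller power of two works,
                                 # so the order of w is exactly 2^27
# a root for a length-2^k transform is then pow(w, 1 << (27 - k), p)
print("ok")
```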
So I left the 'Rank' approach and considered the 'determinant' next: we needed to know whether the determinant is zero. According to the Wikipedia page again, the determinant of a circulant matrix is the product of a(x) evaluated at all n-th roots of unity. In other words, we needed the DFT of the sequence a_0, a_1, …, a_{n-1}, and to check whether any value of the resulting transform is zero. The problem is that the normal FFT algorithm only works for power-of-two sizes, so we had to find a way to compute a DFT of size exactly n.
An issue that came up is that we needed a primitive n-th root of unity modulo our prime, so p = 2013265921 wouldn't work most of the time. My solution was to write down a list of large primes such that for every n in [1, 30000] there is a prime p in the list with p ≡ 1 (mod n). To meet the 50000-byte program-size limit, I had to compress the list and decompress it at the start of execution.
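Generating such a list is simple even if storing it isn't. A sketch (hypothetical helper names, deterministic Miller-Rabin, which at this size is overkill but safe):

```python
# Sketch: for a given n, find a prime p ≡ 1 (mod n) near 2*10^9, so that
# Z_p contains a primitive n-th root of unity.
def is_prime(m):
    # Miller-Rabin; this fixed base set is deterministic well past 64-bit inputs
    if m < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for q in bases:
        if m % q == 0:
            return m == q
    d, s = m - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for b in bases:
        x = pow(b, d, m)
        if x in (1, m - 1):
            continue
        for _ in range(s - 1):
            x = x * x % m
            if x == m - 1:
                break
        else:
            return False
    return True

def prime_1_mod_n(n, lo=2 * 10**9):
    p = lo - lo % n + 1            # smallest value ≡ 1 (mod n) near lo
    while not is_prime(p):
        p += n
    return p

print(prime_1_mod_n(30000))
```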
The first thing I realized is that the normal FFT algorithm can be extended. Standard FFT splits into two parts of size n/2 and combines them in O(n) time, but it is possible to split into p parts, each of size n/p, for any prime p dividing n, and combine in O(p·n) time. Doing this repeatedly for all prime factors of n (with multiplicity) eventually computes the DFT!
The only problem is that p can be large. If n is composite, then we can always pick a prime p ≤ √n ≤ 173, so O(p·n) is affordable. But if n itself is prime, then O(p·n) = O(n^2), which is clearly too slow. I had to think more about how to solve that case, but this approach already works well when n has no large prime factor (I implemented and submitted it first, and the judge indeed gave the expected TLE).
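A compact sketch of that mixed-radix splitting over Z_p, with the combine step written straight from the identity X_k = Σ_r root^(rk) · (DFT of the r-th subsequence)[k mod (n/q)] (toy parameters, naive twiddle powers, not tuned for speed):

```python
def dft(a, root, p):
    # DFT over Z_p by splitting off the smallest prime factor q of n;
    # root must be a primitive len(a)-th root of unity mod p
    n = len(a)
    if n == 1:
        return a[:]
    q = next(f for f in range(2, n + 1) if n % f == 0)  # smallest prime factor
    m = n // q
    # recurse on the q interleaved subsequences, using root^q (an m-th root)
    sub = [dft(a[r::q], pow(root, q, p), p) for r in range(q)]
    # combine: O(q) work per output index, O(q*n) per level
    return [sum(sub[r][k % m] * pow(root, r * k, p) for r in range(q)) % p
            for k in range(n)]

# toy check: n = 6, p = 7, and 3 has order 6 mod 7
p, a = 7, [1, 6, 1, 0, 0, 0]       # [1,-1,1,0,0,0] reduced mod 7
naive = [sum(a[j] * pow(3, j * k, p) for j in range(6)) % p for k in range(6)]
print(dft(a, 3, p) == naive)   # True
```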
At this point I took a dinner break, and while walking to the store I had a cool idea: why not combine my two previous approaches? If n has a large prime factor, it cannot have that many divisors, so my first algorithm would handle it well! I got very excited about this idea and implemented it right after eating, needing only a bit of copy-pasting. Sure enough, the solution didn't get TLE! Unfortunately, it got WA.
Of course it turned out later that it only got WA because there were some issues with the judge data. After rejudging, this solution got accepted (link here). However I didn’t know that it was correct at the time, so I suspected that working modulo primes was bad after all.
At this point I felt that solving the DFT directly for prime sizes was the best approach, and after a bit of research I found two algorithms: Rader's algorithm and Bluestein's algorithm. Both are amazingly simple and clever, and run in O(n log n). Bluestein's algorithm has the bonus that it works for any size, not just primes, but for some reason I decided to use Rader's algorithm, perhaps because I already had code for finding a generator modulo a given prime p.
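The heart of Rader's trick, as a sketch: for an odd prime n, a generator g of Z_n^* permutes the nonzero indices so that the k ≠ 0 outputs become a cyclic convolution of length n - 1. Here the convolution is done naively (the real algorithm computes it with a power-of-two FFT), and the helper names are mine:

```python
def rader_dft(a, w, p):
    # DFT of odd-prime size n over Z_p; w is a primitive n-th root of unity
    n = len(a)
    L = n - 1
    X = [sum(a) % p] + [0] * L                 # X_0 is just the plain sum
    g = next(g for g in range(2, n)            # generator of Z_n^* (slow but simple)
             if len({pow(g, e, n) for e in range(L)}) == L)
    u = [a[pow(g, e, n)] for e in range(L)]    # a permuted by powers of g
    v = [pow(w, pow(g, (L - e) % L, n), p) for e in range(L)]  # w^(g^-e)
    # cyclic convolution c = u * v, done naively; this is the FFT-able part
    c = [sum(u[e] * v[(m - e) % L] for e in range(L)) % p for m in range(L)]
    for m in range(L):
        X[pow(g, (L - m) % L, n)] = (a[0] + c[m]) % p   # output index g^-m
    return X

# toy check: n = 5, p = 11, and 4 has order 5 mod 11
a, w, p = [1, 2, 3, 4, 5], 4, 11
naive = [sum(a[j] * pow(w, j * k, p) for j in range(len(a))) % p
         for k in range(len(a))]
print(rader_dft(a, w, p) == naive)   # True
```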
There were a few details to work out, but I was finally able to implement it (link here). However, the judge still gave WA (because of the judge-data issues). At this point I concluded that either something was wrong with the data, or the data was somehow clever enough to find a bad input for my essentially random list of primes (out of desperation I even submitted a double-precision version of the FFT, which of course got WA)! Since the former felt more likely, I simply gave up and waited for things to get resolved. Sure enough, I later saw @mugurelionut solve it, followed by @djdolls, with no WAs in between! Finding this rather unlikely, I resubmitted my latest solution, and sure enough it got AC. I felt a lot of relief afterwards, knowing I could finally move on with my life!
It was indeed very frustrating to get so many WAs due to the issues with the judge data, but there was a positive side effect: I learned a lot of new things. So thank you @KADR for proposing the problem, and I hope you handle the input/output files correctly next time!