PROBLEM LINK:Author: Iaroslav Tverdokhlib DIFFICULTY:Hard PREREQUISITES:Circulant Matrix PROBLEM:Given a Circulant Matrix, determine whether it is full rank (i.e. nonzero determinant, or the null space is trivial). EXPLANATION:First of all, we need to know the matrix stated in the problem is a very special type of matrix called Circulant Matrix. There are a lot of properties about the determinant and rank of Circulant Matrix can be found in Wikipedia. Let's summarize the properties looks relevant to this problem: where, f(x) = A_{0} + A_{1} * x + A_{2} * x^2 + ... A_{n} * x^n. To avoid the precision problems, we can choose some prime and one of its primitive root instead of the root of unity (it may have chances (really rare, but exists) to fail, but we can choose some different primes to check, e.g. 3 different primes). Also, it will better to try Number Theoretic Transform instead of doublevalued FFT in the following steps. One straightforward way is to calculate the determinant directly (of course, using FFT or NTT, both setter and tester solved the problem in this way). But a lot of contestants failed due to the precision problem when checking whether a double value is nonzero. Another way is the check whether "d = 0" i.e. gcd(x^n  1, f(x)) is some constant. Here we take the later one to solve this problem. Another related knowledge is about Cyclotomic Polynomial. With this property, we know that x^n1 can be factorized into some cyclotomic polynomials Phi_{d}(x) for every divisor d of n. Actually, for all n between 1 and 30000, the number of divisors, denoted as D, is at most 96 (when n = 27720). Therefore, we can enumerate all possible divisors d and check the cyclotomic polynomial Phi_{d}(x) can divide f(x). Then, the only thing we need to do is the polynomial remainder operation. If we adopt bruteforce to do this, it will take O((Nd) * d) time. Directly applying this method (with some breaks) may get TLE. But there is a interesting view to combine this method with the doublevalued FFT (like this method can work well when N is a prime). This solution accepted with a further check using this method. The fastest and easiest solution (explanation provided by Anton Lunyov)Let's understand how the fastest and shortest solution to this problem works. Some polynomial P(x) is divisible by Phi_{d}(x) if and only if P(x) * (x^{d/p1}  1) * ... * (x^{d/pk}  1) is divisible by x^{d}  1. It follows from explicit formula for cyclotomic polynomial (see second formula there). Indeed the rational function Now all is simple. Polynomial computations modulo x^{d}  1 are straightforward: if P(x) = a_{0} + a_{1} * x + ... + a_{m} * x^{m} then P(x) mod (x^{d}  1) = (a_{0} + a_{d} + a_{2d} + ... ) + (a_{1} + a_{d+1} + a_{2d+1} + ... ) * x + ... + (a_{d1} + a_{2d1} + ... ) * x^{d1} (for clarity see lines 4950 here). Multiplication by x^{k}  1 could be made by simple loop and requires only additions and subtractions. And it could be done modulo x^{d}  1 at the same time (see lines 5362 here). So we could easily find the remainder of polynomial division P(x) * (x^{d/p1}  1) * ... * (x^{d/pk}  1) by x^{d}  1. If it's zero for some d then the answer is YES, otherwise NO. The complexity of each step is O(n + d * nu(d)), where nu(d) is the number of prime divisors of d. Here we have n because we need first to adjust P(x) modulo x^{d}  1. Thus, overall complexity is (n * tau(n) + n * log n), where tau(n) is the number of divisors of n. Here n * log n is some estimate of the sum of d * nu(d) over divisors o n (I believe it is so :)). Hence we get super fast and simple solution that requires only additions and subtractions of polynomial coefficients. Any different ideas and solutions are welcome. AUTHOR'S AND TESTER'S SOLUTIONS:Author's solution can be found here.
This question is marked "community wiki".
asked 16 Dec '13, 18:56
showing 5 of 7
show all

I initially implemented FFT on real numbers (double, then long double), but I always had precision problems (actually, to be honest, since I was too lazy to implement an FFT algorithm working for any value of N, e.g. N being prime, I used a freely available implementation of Bluestein's FFT algorithm). But I had to implement something very different in order to have my solution accepted. If you think of the values A(i) as coefficients of a polynomial (A(0) + A(1) * x + A(2) * x^2 + ...), then the answer is YES only if this polynomial is divisible by a cyclotomic polynomial Phi_D(X) for some divisor D of N. My implementation of this idea is actually straightforward: I computed all the cyclotomic polynomials for divisors of N and I checked divisibility with each of them. The important thing here is that cyclotomic polynomials have integer coefficients, so there is no need for using real numbers (thus, any possible precision errors are avoided). answered 16 Dec '13, 21:19
I also implemented Bluestein's algorithm, also using long double and also had trouble with precision (running time 19 seconds, WA). This problem was probably aimed at failing any common FFT.
(16 Dec '13, 21:24)
if the polynomial is divisible by a cyclotomic polynomial Phi_D(X) for some divisor D of N.Could you explain that?
(16 Dec '13, 21:28)
@jaskaran_1: See Wikipedia for the definition and properties of cyclotomic polynomials: http://en.wikipedia.org/wiki/Cyclotomic_polynomial
(16 Dec '13, 21:55)
@mugurelionut how about using Number Theoretic Transform?
(16 Dec '13, 23:33)

I'd like to share my journey with this problem (apologies but it's very long). During day 1, I started by trying to solve some easy ones first, and eventually found this one with no accepted solutions yet. Since the contest had just begun, I tried not to think too much: the examples given were [1,1,1] and [2,4,1,1], both of which involve the vector [1,1,1...] of all ones, so I thought the answer must somehow always involve such a vector. I tried to find a few more cases (such as [1,2,3,1,2,3] being annihilated by [1,0,0,1,0,0]), and came up with my first (naive) solution. Here it is written in pseudopython (link to my submission here): def is_beautiful(a[0..n1]): for every divisor d of n: w = ((array of length d)) for i from 0 to d  1: w[i] = a[i] + a[i + d] + a[i + 2*d] + ... if check(w): return True return False # this is a subroutine def check(a[0..n1]): for every divisor d of n: # check if sums of subarrays of distance d are all zero all_zero = True for i from 0 to d  1: s = a[i] + a[i + d] + a[i + 2*d] + ... if s is not 0: all_zero = False break if all_zero: return True for every proper divisor d of n: # check if subarrays of distance d have all equal elements all_eq = True for i from d to n  1: if v[i] != v[i  d]: all_eq = False break if all_eq: return True return False This is an O(n log^{2} n) algorithm. It somehow passed the initial dataset of the problem, even though it's wrong (the first counterexample I discovered is [1,1,1,0,0,0], which can be annihilated by [1,1,0,1,1,0]). Luckily the dataset was fixed immediately (and the solution then got WA). After actually thinking about the problem, it became clear that we needed to determine the singularity of a certain special matrix, each row of which is a single rightrotation of the previous row. After a brief search, I discovered that such a matrix is actually called a circulant matrix, and the Wikipedia page conveniently provided a few ways to determine singularity. It was clear that if the matrix A is nonsingular, then the only solution to Ax = 0 is the vector x = 0, and if A is singular, then Ax = 0 has a nonzero solution x whose elements are rational, and hence an integral solution (using a proper scaling factor). That there exists a nonzero x whose elements are rational can be confirmed by doing Gaussian elimination with total pivoting on Ax = 0 until we fail to find a pivot (this will happen because A is singular), in which case we can assign the remaining variables 1, and solve for the others. It is clear that all intermediate and final values are rational. The first property I considered was the 'Rank'. Wikipedia says that the rank is n  D, where D is the degree of gcd(a(x), x^{n}1) (and a(x) = a_{0} + a_{1}·x + a_{2}·x^{2} + ... + a_{n1}·x^{n1}). Now, we only needed to check whether D is zero or not, i.e., gcd(a(x), x^{n}1) is a constant or not. I actually found a theoretically good one to calculate polynomial gcd, running in O(n log^{2} n), based on the concept of "Halfgcd". This relies on using FFT for polynomial multiplication AND division (see this link for a description on how to reduce polynomial division and modulo to multiplication). However, after playing around with it a bit and implementing it I discovered that there are far too many multiplications and divisions for this approach to pass the time limit (it started getting slow even at n around 10000), so I scratched this approach. Next, I factorized x^{n}1 into prime polynomials, which are simply the cyclotomic polynomials Φ_{n}(x) (Φ_{n}(x) is the monic polynomial whose roots are the primitive nth roots of unity). x^{n}1 is simply the product of Φ_{d}(x) for every divisor d of n. Since the cyclotomic polynomials are prime, D is nonzero if and only if one of those Φ_{d}(x) divides a(x). This means I only had to do polynomial modulo operation for every Φ_{d}(x). Among all n in [1,30000], 27720 has the largest number of divisors, at 96. This means that this solution runs in about 96·O(n log n) time. I was very excited with this solution because I already knew how to code all the parts, unlike the HGCD approach, so I spent some time coding it, but after submitting, the verdict was TLE. Clearly the constant is still too high, and this approach only works well if divisor_count(n) isn't too large. By the way, whenever I implement FFT, I always do it modulo a large prime p which has a primitive 2^{k}th root of unity for a large enough k (say p=2013265921 where 440564289 is a primitive 2^{27}th root of unity). This allows me to use 440564289 as my root of unity instead of a complex irrational, and thus using integers entirely and avoiding precision errors. Of course, this means that the coefficients will be reduced mod p. However, for this problem, I simply assume that if A(x) divides B(x) for some polynomials A and B in Z_{p}[x], then A(x) also divides B(x) in Z[x]. This has a tiny chance of failing which I assumed would not happen on the judge's data. So I left the 'Rank' approach and considered the 'determinant' next. We needed to know if the determinant is zero or not. According to the Wikipedia page again, the determinant of a circulant matrix is the product of the evaluation of a(x) for all nth roots of unity. In other words, we needed to find the DFT of the sequence a_{0}, a_{1}, ..., a_{n1}, and check whether any of the values of the resulting transform is zero. The problem is that the normal FFT algorithm only works for powerof2 sizes, so we have to find a way to calculate the DFT of an exact size. An issue that came up is that we needed to have a primitive nth root of unity in our modulus, so p=2013265921 wouldn't work most of the time. My solution is to simply write a list of large primes such that for every n in the range [1..30000], there is a prime p in the list where p ≡ 1 (mod n). To meet the 50000 bytes program limit I had to compress the list and decompress it at the start of execution. The first thing I thought of is that the normal FFT algorithm can be extended. In FFT, we split into two parts of size n/2 and then combine in O(n) time, but it is possible to split into p parts, each of size n/p, for any prime p dividing n, and combine in time O(p·n). If we continue to do this for all prime factors of n (considering multiplicity), we will eventually be able to compute the DFT! The only problem is that p can be large. If n is composite, then we can always find a p ≤ √n ≤ 173, and O(p·n) will be waitable. But if n is prime, then O(p·n) = O(n^{2}) which is clearly too slow. So I had to think more of how to solve that part, but for now this approach works well if n doesn't have large enough prime factors (I tried implementing this first and submitting, and the judge indeed gave the expected TLE). At this point I took a dinner break, and while walking to the store I thought of a cool idea. Why not combine my two previous approaches? I figured that if n has a large prime factor, it cannot have that many divisors, so my first algorithm would work well! I became very excited with this idea and I tried implementing it immediately (after eating), having only needed to do a few copypasta. Sure enough, the solution didn't get TLE! Unfortunately, it got WA. Of course it turned out later that it only got WA because there were some issues with the judge data. After rejudging, this solution got accepted (link here). However I didn't know that it was correct at the time, so I suspected that working modulo primes was bad after all. At this point I felt that solving DFT problem for prime sizes was the best approach, so after a bit of research I found two algorithms, Rader's algorithm and Bluestein's algorithm. Both solutions are amazingly simple and clever, and work in O(n log n). Bluestein's algorithm has a bonus that it works for any size, not just prime sizes, but for some random reason I decided to use Rader's algorithm, perhaps because I already have code for finding a generator for a given prime p. There were a few details to consider, but I was able to finally implement it (link here). However the judge still gave WA (because of judge data issues). At this point I concluded that there must be something wrong with the data, or the data is too clever that it somehow finds a bad input to my essentially random list of primes (out of desperation I even tried to submit a doubleprecision version of FFT, which of course got WA)! Since I felt the former to be more likely, I simply gave up and waited until perhaps things got resolved. Sure enough, I later found @mugurelionut being able to solve it, followed by @djdolls, with no WAs in between! Since I found this rather unlikely, I tried to submit my latest solution and sure enough it got AC. I felt a lot of relief afterwards, knowing that I can finally move on with my life! It was indeed very frustrating to get many WAs due to the issues with the judge data, but I think there has been some positive side effect on me because I was able to learn a lot of new things. So thank you @KADR for proposing the problem, and I hope you handle the input/output files correctly next time! answered 17 Dec '13, 01:39
3
Really awesome explanation @kevinsogo, I am always amazed by how clever and insightful some top coders can be and this is for sure an inspiration for me to keep working harder on my side to improve :D Thanks for the amazing story ;)
(17 Dec '13, 02:38)
2
waw. amazing explaination throughout your trials and errors. thanks a lot, mate ! i'll try to get AC with your ideas. thank you so much !
(17 Dec '13, 04:43)

Just a note when multiplying two polynomials quickly. Assume we want to multiply f(x) and g(x) exactly, using FFT. Using doubleprecision and using e^{2πi/n} as the primitive nth root of unity is too risky because precision errors are very likely to come up. An alternative is to choose a prime p with a modular 2^{k}th root of unity for a large enough k (i.e. 2^{k} > deg f + deg g) and just calculate f(x)·g(x) (mod p), but instead of using e^{2πi/n}, we use a modular nth root of unity mod p. This avoids precision errors, but an obvious problem is that the coefficients of the product will be reduced mod p. But let's say we know that any coefficient of f(x) can have an absolute value of at most F, and any coefficient of g(x) can have an absolute value of at most G. Then we are sure that any coefficient of f(x)·g(x) can have an absolute value of at most F·G·(min(deg f, deg g)+1) (let's call this quantity M). So a strategy to calculate f(x)·g(x) exactly is:
Note that since we have p_{1}·p_{2}·...·p_{l} > 2M, any number has at most one representative modulo p_{1}·p_{2}·...·p_{l} having absolute value ≤ M. Hence, the constructed coefficients are guaranteed to be correct. If in case you need to calculate f(x)·g(x) modulo some prime P, but you don't have a control over what P is, then most likely there wouldn't be a 2^{k}th root of unity mod P, so FFT mod P wouldn't work. A viable strategy is to calculate f(x)·g(x) exactly first as above, and then reduce it mod P afterwards (I actually used this trick for REALSET, somewhere in this submission. MOD0 and MOD1 are the two primes I chose). answered 17 Dec '13, 20:36
Precision is usually not a problem. Ever. Out of hundreds of problems that require doubles among thousands I've solved, this is the 2nd one where I'm getting WA because of precisions errors.
(18 Dec '13, 05:47)

I've updated editorial with the detailed explanation of the fastest solution. answered 18 Dec '13, 12:14
thank you for the update ! i tried to understand what did that check() function too... :) well done !
(19 Dec '13, 23:33)

My solution check whether polynomial is divided by Cyclotomic Polynomial order m where m is divisor of n. We just need to check that P(x) * Product[x^(m / d)  1] mod (x ^ m  1) == 0, where d is prime divisor of m. answered 18 Dec '13, 05:05

I did the same thing.Find the fft of the n real inputs ie c0,c1,c2,.....,c(n1) as mentioned in the circulant determinant link. .Now check for each output
Here's my code http://www.codechef.com/viewsolution/3090877 Is my fft implementation incorrect?(Radix2 Cooley Tukey) answered 16 Dec '13, 19:33
Maybe it is sucked by the precision (exceeding or some losses). The general idea looks fine. You can have a look at accepted solutions and find the differences. :)
(16 Dec '13, 20:36)
1
@shanjingbo Could you give me the failing test case and provide a detailed FFT implementation? How to implement it to maximize precision.What about using Number Theoretic Transform? Read it somewhere that there won't be precision error there since its modulo prime. Could you provide a C++ implementation for that.
(16 Dec '13, 20:45)
The key point is to use discrete FFT, and using Chinese Remainder Theorem (CRT). I have no codes now. And I think providing cases is not a good idea. Sorry.
(16 Dec '13, 21:03)
2
@jaskaran_1: I looked at your implementation and your algorithm is not correct. You can only use the Radix2 Cooley Tukey FFT algorithm when N is a power of 2. When N is odd you split the array into two slices of unequal sizes. You probably make them equal by adding an extra zero coefficient in the smaller slice, but that is incorrect and the end result is not the FFT for the initial array of N values. There are only a few algorithms which work for any value of N (N being prime is usually a tough case): Bluestein's algorithm is one of them.
(16 Dec '13, 21:26)
@mugurelionut Won't it work like mergesort.I mean there also the array isn't exactly halved.Why isn't it correct? Could you give me an example where it fails?
(16 Dec '13, 23:31)
jaskaran: If you actually looked more into how/why it works this way, you'd notice that CooleyTukey makes use of identities involving the Nth root of unity, and those are only nice if we split the FFT'd array into arrays of equal sizes. You're basically padding the array to next larger power of 2, which changes its roots of unity. The most trivial example: {1,1,0} has no solution, but your answer is YES. Is it really that hard to check small cases by hand when you have days for this?
(17 Dec '13, 00:54)
@xelios I had given the incorrect link. I've edited it.Could you check it now?It works for 1 1 0.
(17 Dec '13, 01:57)
jaskaran: Except it does not work for {1,1,0}. Read what I wrote again. Your solution answers YES, the correct answer is NO.
(17 Dec '13, 04:03)
xelios:Why is it giving NO on my compiler but not on IDEONE?
(17 Dec '13, 10:59)
I don't know your platform, so I can't answer that... but I'd expect the Ideone to be more similar to Codechef testing system...
(18 Dec '13, 01:38)
showing 5 of 10
show all

I don't mean to be rude, but this is not what can be expected for an editorial for a HARD problem. There are many concepts to be explained, many difficulties around (how you come to fast discrete fourier transform, how you use the Bluestein to handle cases where N is not a power of 2, how you handle floating point numbers precision, how you handle overflow in recursion, and so on), and you only provide a wikipedia link... I could have done the same. Telling that we can read other contestants code is not a solution, it's a lazy way of not answering the question. I'm kind of disappointed.
I think the editorial is in the process of being written, so you shouldn't flame just because you're impatient...
I mean, I hope it is so...
@cyberax I will try to explain these questions more clearly and thoroughly. Thanks for your concerning.
@shangjingbo: thanks a lot.
@cyberax: updated :)
@anton_lunyov: thanks for you explain!
sum of d * nu(d) is O(n * log(n), because sum of d is O(n) and max(nu(d)) is O(log(n)).