PROBLEM LINKSDIFFICULTYHARD PREREQUISITES
PROBLEMYou are given integers M, Q, N, D, a polynomial P, and P(0) mod M, P(1) mod M, ..., P(D) mod M. Calculate the value of (P(0) Q^{0} + P(1) Q^{1} + ... + P(N1) Q^{N1}) mod M. EXPLANATIONSince the explanation uses complicated math expressions, we will provide the PDF version instead that can be downloaded here. SETTER'S SOLUTIONCan be found here. TESTER'S SOLUTIONCan be found here. ACRush'S SOLUTIONCan be found here. ACRush uses Karatsuba's algorithm for calculating convolution in O(D ^{log 3}) time. Since the algorithm does not require any special primes like in numbertheoretic transform, he can do all calculations in the convolution directly in modulo M. Looking at his source code, it seems that he also calculates A × B mod M in O(1) time using some hacks. This is a big time saver. ACRush, can you explain how you did the multiplication in constant time? :) djdolls'S SOLUTIONCan be found here. djdolls is the second solver of this problem. The setter of this problem cannot understand his solution completely. djdolls, would you explain your solution? :)
This question is marked "community wiki".
asked 12 Dec '12, 21:44

Let me start by saying I enjoy this problem, a nice problem. I don't think I have something new, just review some tricks. 1) The problem became much simpler we have (Q1, M) = 1. I'm happy to figure out a new solution when M = Q^x for small x, which is also the key to the problem. 2) To compute A * B % M in almost O(1) time, the algorithm in yeputons's post is exactly the same as mine. And using this technique, my C++ code takes about 67 seconds when K = 20000, so it may be possible to bruteforce it using Java. 3) Using FFT in the last part is a better choice, I used Karatsuba's algorithm in O(K^{1.58}). answered 13 Dec '12, 04:08

I know the following hack for calculating answered 13 Dec '12, 00:50
I stole your code to try to understand your comment and made a two simple program: The first with Java bigint: http://ideone.com/a8y0OA the second with your code: http://ideone.com/P8MWGt It gives me a different result.. What am I doing wrong?
(13 Dec '12, 20:56)
Now I know if I put a%=mod;b%=mod; than your code works great :)
(14 Dec '12, 02:03)
1
@iscsi you multiply two numbers less than 2^31, it can be done in long longs. You had to try multiplication, when M is large.
(14 Dec '12, 14:31)
@niyaznigmatul: Yes, you are right of course, firstly when I try yeputons code I forgot the a%=mod; b%=mod; lines. I just try to follow with his code without thinking and I didn't understand why his code does/doesn't work:) It's my fault of course, now I understand.. thank you for your help.
(14 Dec '12, 14:41)
@yeputons , will you please explain me the hack , i mean the proof that the hack will work in brief plz ....
(19 Dec '12, 17:52)

I never figured out the (Q1,M)>1 case, but as far as I can tell (and I'm probably missing something), I can't see how convolutions are needed at all. After precomputing the inverses of factorials, polynomial interpolation can be done in O(D) time without the extra log D factor. First calculate x, x(x1), x(x1)..(xD) and also also (xD), (xD)(xD+1), (xD)...(x), in O(D) time. Then the result is simply the sum of: [ x(x1)..(xi+1) ] * [ (xi1)..(xD) ] * inverse(i!) * inverse((Di)!) * (1 if Di is odd) * (y[i]) which is just a constant product for each i, thus O(D) in total. Secondly, you can prove (for the (Q1,M)=1 case, and writing things in a slightly different way to the one used in the PDF) the whole sum is of the form: f(N) = Q^0P(0) + Q^1P(1) + .. + Q^(N1)P(N1) where g(N) is a polynomial of degree D. Also, f(N+1)  f(N)= Q^NP(N)= Q^Ng(N+1)  Q^(N1)g(N) So P(N) = Qg(N+1)  g(N) If we let g(0) = c, we can calculate each term g(1), g(2), .., g(D+1) linearly in c. We can then interpolate the points g(0) up to g(D) and match with g(D+1) to calculate c. (We end up dividing by (Q1)^(D+1), which is why that gcd requirement is needed). Now we know g, so we can interpolate again to find the final answer. No convolutions or extra log factors anywhere  all in straight O(D) time, unless I'm missing something. answered 13 Dec '12, 02:24
This is probably exactly what djdolls wrote. Damn it, after dealing with this problem for so many time I've overcomplicated things too much :( The case M divides (Q1)^u can be handled even easier after we move from P(k) to T(k) as in the editorial. In this case sum of T(k) for k from 0 to N1 is a polynomial S(N) of degree D+u and we can find it's first D+u+1 values in simple linear loop summing T(k) and then use Lagrange interpolation formula as you've suggested to find S(N). So with this O(D) solution and with hacks for finding A * B mod M I could set it with D about 500000. Hehe.
(13 Dec '12, 03:01)
But you've missed one thing. To calculate g(1), ..., g(D+1) in O(D) you need also gcd(Q,M)=1 to divide by Q. But this can be handled similarly to the editorial. Just subdivide the modulo in two components: M = M1*M2 where gcd(M1,Q)=1 and M2 divides Q^u for some u. Then for M1 your scheme is correct and for M2, Q^k for k>=u is zero modulo M2 so sum have only u nonzero terms modulo M2 and can be calculated in O(u) time. BTW, I have tests where M has common prime factors with Q as well as Q1 and also some that coprime with Q and Q1. So such subdivision of M is really needed to get AC :)
(13 Dec '12, 03:06)
At the end of the contest, I've just figured out if I would know the polynomial coefficients C_is I can solve the problem (maybe I'm wrong, but I thought that:) ) I still saw the first part, what triplem write we can calculate in O(D). (what is the y[i] in that product?) If am I right triplem solution doesn't calculate the original polynomial coefficients, so the original coefficients can be calculate faster than O(D^2)?
(14 Dec '12, 15:27)
@iscsi In fact I want to set the problem with coefficients but finding values P(0), P(1), ..., P(D) by C_0, ..., C_D can be done only in O(D log^2 D) time (as mentioned in some exercise in Cormen book). And I don't know simpler way to solve the problem having C_i as an input. Probably inverse problem also can be made in this time. In triplem solution y[i] is g(i). He uses Lagrange interpolation formula to calculate g(N) by values g(0), ..., g(D+1).
(14 Dec '12, 19:07)

In section Calculating Answer Modulo M1, how did you arrive to the formula for sum_0^{N1} C^j_k Q^k? Is that really well known? I was really far from solving this problem I was trying to find a way to compute this answered 13 Dec '12, 18:47
2
The sketch of the proof is noted there. Use induction on j. The main thing in inductive step is to differentiate the relation by Q. Then using basic identities for binomial coefficients we can prove all we need. I hope that this formula was not known in science before :)
(14 Dec '12, 00:25)
