QPOLYSUM - Editorial

fushar · December 12, 2012, 9:44pm

PROBLEM LINKS

DIFFICULTY

HARD

PREREQUISITES

PROBLEM

You are given integers M, Q, N, D, a polynomial P, and P(0) mod M, P(1) mod M, …, P(D) mod M. Calculate the value of (P(0) Q⁰ + P(1) Q¹ + … + P(N-1) Q^N-1) mod M.

EXPLANATION

Since the explanation uses complicated math expressions, we will provide the PDF version instead that can be downloaded here.

SETTER’S SOLUTION

Can be found here.

TESTER’S SOLUTION

Can be found here.

ACRush’S SOLUTION

Can be found here.

ACRush uses Karatsuba’s algorithm for calculating convolution in O(D ^{log 3}) time. Since the algorithm does not require any special primes like in number-theoretic transform, he can do all calculations in the convolution directly in modulo M. Looking at his source code, it seems that he also calculates A × B mod M in O(1) time using some hacks. This is a big time saver. ACRush, can you explain how you did the multiplication in constant time?

djdolls’S SOLUTION

Can be found here.

djdolls is the second solver of this problem. The setter of this problem cannot understand his solution completely. djdolls, would you explain your solution?

yeputons · December 13, 2012, 12:50am

I know the following hack for calculating A*B mod M. Let’s calculate res=A*B in long long, forget about overflow. Then we calculate tmp=round(A*B/M) in long double and subtract tmp*M from res. If long double doesn’t give a huge error (it doesn’t, but double does), we just got read from overflowing (because all we need is error in tmp less than 9). So, return res % M;

triplem · December 13, 2012, 2:24am

I never figured out the (Q-1,M)>1 case, but as far as I can tell (and I’m probably missing something), I can’t see how convolutions are needed at all.

After precomputing the inverses of factorials, polynomial interpolation can be done in O(D) time without the extra log D factor. First calculate x, x(x-1), x(x-1)…(x-D) and also also (x-D), (x-D)(x-D+1), (x-D)…(x), in O(D) time. Then the result is simply the sum of:

[ x(x-1)…(x-i+1) ] * [ (x-i-1)…(x-D) ] * inverse(i!) * inverse((D-i)!) * (-1 if D-i is odd) * (y[i])

which is just a constant product for each i, thus O(D) in total.

Secondly, you can prove (for the (Q-1,M)=1 case, and writing things in a slightly different way to the one used in the PDF) the whole sum is of the form:

f(N) = Q^0P(0) + Q^1P(1) + … + Q^(N-1)P(N-1)
= Q^Ng(N) - g(0)

where g(N) is a polynomial of degree D.

Also, f(N+1) - f(N)= Q^NP(N)= Q^Ng(N+1) - Q^(N-1)g(N)

So P(N) = Qg(N+1) - g(N)
Ie g(N+1) = (P(N) + g(N))/Q

If we let g(0) = c, we can calculate each term g(1), g(2), …, g(D+1) linearly in c. We can then interpolate the points g(0) up to g(D) and match with g(D+1) to calculate c. (We end up dividing by (Q-1)^(D+1), which is why that gcd requirement is needed).

Now we know g, so we can interpolate again to find the final answer.

No convolutions or extra log factors anywhere - all in straight O(D) time, unless I’m missing something.

ACRush · December 13, 2012, 4:08am

Let me start by saying I enjoy this problem, a nice problem.

I don’t think I have something new, just review some tricks.

The problem became much simpler we have (Q-1, M) = 1. I’m happy to figure out a new solution when M = Q^x for small x, which is also the key to the problem.
To compute A * B % M in almost O(1) time, the algorithm in yeputons’s post is exactly the same as mine. And using this technique, my C++ code takes about 6-7 seconds when K = 20000, so it may be possible to brute-force it using Java.
Using FFT in the last part is a better choice, I used Karatsuba’s algorithm in O(K^{1.58}).

bcurcio · December 13, 2012, 6:47pm

In section Calculating Answer Modulo M1, how did you arrive to the formula for \sum_0^{N-1} C^j_k Q^k? Is that really well known?

I was really far from solving this problem I was trying to find a way to compute this

djdolls · December 14, 2012, 1:11am

This was a really nice problem, probably the most difficult one on Codechef that I managed to solve.
My solution is very similar to what triplem has already described:

We are given an integer Q and a D-degree polynomial P(), and we want to calculate S(N) mod M for a large N.

S(N) = P(0) * Q^0 + P(1) * Q^1 + … + P(N) * Q^N

Since P() is a polynomial of degree at most D, the (D+1)-th derivative of P would be zero, this gives us the following relation:

\sum_(i = 0 to D + 1) (-1)^i * C(D + 1, i) * P(n - i) = 0, for all n > D, i.e.

P(n) = \sum_(i = 0 to D) (-1)^i * C(D + 1, i + 1) * P(n - i - 1)

Here C(i, j) represents the binomial coefficient.
If we define X(n) = P(n) * Q^n, then

X(n) = \sum_(i = 0 to D) (-1)^i * C(D + 1, i + 1) * P(n - i - 1) * Q^(i + 1),

Finally, if we add all the X(i)’s we get

S(n) = \sum(i = 0 to D) (-1)^i * C (D + 1, i + 1) * S(n - i - 1) * Q^(i + 1) + c,

where c = \sum_(i = 0 to D) (-1)^i * C(D + 1, i) * S(D - i) * Q^i

We can further expand S(n - i)’s until, we have an expression entirely in terms of S(0), S(1), …, S(D)

S(n) = \sum_(i = 0 to D) (-1)^i * C(n - D - 1 + i, i) * C(n, D -i) * S(D - i) * Q^(n - D + i) +

        c * \sum_(i = 0 to n - D - 1)    C(D + i, D) * Q^i

The first sum has only (D + 1) terms, and can be calculated in O (D + log n) time. Since M is not divisible by any number in [2, D + 14], the binomial coefficients involved can be calculated easily using inverse factorial in amortized O (1) time. Also c, can be calculated in O (D) time.

The second summation is the problem, as it has O (n) terms. Let us denote it by T (n, D)

T(n, D) = \sum_(i = 0 to n - D - 1) C(D + i, D) * Q^i

Luckily, there is a recurrence for T(n, D) in terms of D.

T(n, 0) = (Q^n - 1) / (Q - 1), and

T(n, D) = (C(n, D) * Q^(n - D) - T(n, D - 1)) / (Q - 1).

Ideally, using this we should be able to compute T(n, D) in O (D + log n) time. However, this computation involves division by (Q - 1), and there is no guarantee that (Q - 1) has a multiplicative inverse in mod M.

First, it may be possible that Q = 1, this case however turns easy as in this case
T(n, D) = C(n, D + 1), so we can compute it directly without using the recurrence.

If (Q - 1) is coprime to M, then we can compute its inverse, and T(n, D) can be calculated easily.
The only difficult part is when (Q - 1) is not coprime to M.

As described in the editorial, we can split M into two parts:
M = M1 * M2, where M1 is coprime to (Q - 1), and there exist a small k such that M2 divides
(Q - 1)^k. Based on the constraints in the problem we can deduce that k <= 14.

T(n, D) mod M1 can be calculated easily.
In order to compute T(n, D) mod M2, we instead compute T(n, D) mod (Q - 1)^k which can be used easily to compute the former.

If we look at the recurrence for T(n, D), we can easily notice how to compute T(n, D) mod (Q - 1).

T(n, D + 1) = (C(n, D + 1) * Q^(n - D - 1) - T(n, D)) / (Q - 1),

Since T(n, D + 1) is an integer, (Q - 1) must divide the numerator

This means T(n, D) = C(n, D + 1) * Q^(n - D - 1) mod (Q - 1)

Now we write the expression for T(n, D + 2), we can compute T(n, D) mod (Q - 1)^2

T(n, D) = (C(n, D + 1) - C(n, D + 2)) * Q^(n - D - 1) + C(n, D + 2) * Q^(n - D - 2) mod (Q - 1)^2

In general,

T(n, D) = \sum_(i = 1 to k) (\sum_(j = i to k) (-1)^(j - i) * C(j - 1, j - i) * C(n, D + j)) * Q^(n - D - i)
mod (Q - 1)^k

Since k <= 14, we can still compute involved binomial coefficients easily modulo M (all denominators have multiplicative inverses).

Hence, the overall complexity is O (D + lg N) assuming that operation modulo M can be done in O (1) time.

I had an overflow bug in my code, and it took me a while figure that out, only after finding that the test case
where my code was failing has a value of M, 998005893107997601, which is awfully close to 10^18.

anton_lunyov · December 13, 2012, 3:01am

This is probably exactly what djdolls wrote. Damn it, after dealing with this problem for so many time I’ve overcomplicated things too much

The case M divides (Q-1)^u can be handled even easier after we move from P(k) to T(k) as in the editorial. In this case sum of T(k) for k from 0 to N-1 is a polynomial S(N) of degree D+u and we can find it’s first D+u+1 values in simple linear loop summing T(k) and then use Lagrange interpolation formula as you’ve suggested to find S(N).

So with this O(D) solution and with hacks for finding A * B mod M I could set it with D about 500000. Hehe.

anton_lunyov · December 13, 2012, 3:06am

But you’ve missed one thing. To calculate g(1), …, g(D+1) in O(D) you need also gcd(Q,M)=1 to divide by Q. But this can be handled similarly to the editorial. Just subdivide the modulo in two components: M = M1*M2 where gcd(M1,Q)=1 and M2 divides Q^u for some u. Then for M1 your scheme is correct and for M2, Q^k for k>=u is zero modulo M2 so sum have only u non-zero terms modulo M2 and can be calculated in O(u) time.

BTW, I have tests where M has common prime factors with Q as well as Q-1 and also some that coprime with Q and Q-1. So such subdivision of M is really needed to get AC

iscsi · December 13, 2012, 8:56pm

I stole your code to try to understand your comment and made a two simple program:
The first with Java bigint:

the second with your code:

It gives me a different result… What am I doing wrong?

anton_lunyov · December 14, 2012, 12:25am

The sketch of the proof is noted there. Use induction on j. The main thing in inductive step is to differentiate the relation by Q. Then using basic identities for binomial coefficients we can prove all we need. I hope that this formula was not known in science before

djdolls · December 14, 2012, 1:13am

Does anybody if I can insert latex equations here, if so, how? The plaintext equations look horrible

iscsi · December 14, 2012, 2:03am

Now I know if I put a%=mod;b%=mod; than your code works great

niyaznigmatul · December 14, 2012, 2:31pm

@iscsi you multiply two numbers less than 2^31, it can be done in long longs. You had to try multiplication, when M is large.

iscsi · December 14, 2012, 2:41pm

@niyaznigmatul: Yes, you are right of course, firstly when I try yeputons code I forgot the a%=mod; b%=mod; lines. I just try to follow with his code without thinking and I didn’t understand why his code does/doesn’t work:) It’s my fault of course, now I understand… thank you for your help.

iscsi · December 14, 2012, 3:27pm

At the end of the contest, I’ve just figured out if I would know the polynomial coefficients C_i-s I can solve the problem (maybe I’m wrong, but I thought that:) )
I still saw the first part, what triplem write we can calculate in O(D). (what is the y[i] in that product?) If am I right triplem solution doesn’t calculate the original polynomial coefficients, so the original coefficients can be calculate faster than O(D^2)?

anton_lunyov · December 14, 2012, 7:07pm

@iscsi In fact I want to set the problem with coefficients but finding values P(0), P(1), …, P(D) by C_0, …, C_D can be done only in O(D log^2 D) time (as mentioned in some exercise in Cormen book). And I don’t know simpler way to solve the problem having C_i as an input. Probably inverse problem also can be made in this time. In triplem solution y[i] is g(i). He uses Lagrange interpolation formula to calculate g(N) by values g(0), …, g(D+1).

codeprav · December 19, 2012, 10:46am

will u please convert that to pdf and share the link as Anton has done …

codeprav · December 19, 2012, 5:52pm

@yeputons , will you please explain me the hack , i mean the proof that the hack will work in brief plz …