PROBLEM LINK:
Author: Trung Nguyen
Tester: Suchan Park
Editorialist: Suchan Park

DIFFICULTY: Hard

PREREQUISITES: Generating functions, Faulhaber's formula, modular inverses, computing convolutions with FFT

PROBLEM: The problem statement is short, so just take a look.

EXPLANATION:

$$\begin{aligned} f(x, k) & = \sum_{n=1}^{N} (x + a_{n})^k \\ & = \sum_{n=1}^{N} \sum_{i=0}^{k} \binom{k}{i} a_n^{k-i} x^i \\ & = \sum_{i=0}^{k} \sum_{n=1}^{N} \binom{k}{i} a_n^{k-i} x^i \\ & = \sum_{i=0}^{k} \left\{\binom{k}{i} x^i \times \left(\sum_{n=1}^{N} a_n^{k-i}\right) \right\} \end{aligned}$$

This form immediately suggests that we might be able to make use of convolutions. If we let $L_x(i) = \binom{k}{i} x^i$ and $R(i) = \sum_{n=1}^{N} a_n^{i}$, we can write the above equation as:

$$f(x, k) = \sum_{i=0}^{k} L_x(i) \cdot R(k - i)$$

However, the first problem we encounter is that $x$ is not fixed; in fact there are up to $10^{18}$ possible values of $x$. So let's stop here and write down the equation for $g(t, k)$ instead.

$$\begin{aligned} g(t, k) & = \sum_{x=0}^{t} f(x, k) \\ & = \sum_{x=0}^{t} \sum_{i=0}^{k} \left\{\binom{k}{i} x^i \times \left(\sum_{n=1}^{N} a_n^{k-i}\right) \right\} \\ & = \sum_{i=0}^{k} \sum_{x=0}^{t} \left\{ \binom{k}{i} x^i \times \left(\sum_{n=1}^{N} a_n^{k-i}\right) \right\} \\ & = \sum_{i=0}^{k} \left\{ \binom{k}{i} \sum_{x=0}^{t} x^i \times \left(\sum_{n=1}^{N} a_n^{k-i}\right) \right\}\\ \end{aligned}$$

So this also looks like a convolution. If we let $L(i) = \binom{k}{i} \sum_{x=0}^{t} x^i$ and $R(i) = \sum_{n=1}^{N} a_n^{i}$, we can write $g(t, k)$ as

$$g(t, k) = \sum_{i=0}^{k} L(i) \cdot R(k - i)$$

and this can be computed by FFT. Putting aside the awkward modulus $10^9 + 7$ for now, the only thing left is to compute $L$ and $R$.

Computing $L$

Computing $\binom{k}{i}$ is well known: if we precompute all factorials and inverse factorials modulo $10^9 + 7$, then $\binom{k}{i} \equiv k! \cdot (i!)^{-1} \cdot \left((k-i)!\right)^{-1} \pmod{10^9 + 7}$. Now let's deal with the $\sum_{x=0}^{t} x^i$ part. The form looks well known, and indeed a quick search turns up Faulhaber's formula (with the variable names changed to match our problem):

$$\sum_{x=0}^{t} x^i = \frac{t^{i+1}}{i+1} + \frac{1}{2} t^i + \sum_{k=2}^{i} \frac{B_k}{k!}\frac{i!}{(i-k+1)!}t^{i-k+1}$$

where the $B_{k}$ are Bernoulli numbers. Suppose we don't know what they are. The first two terms are straightforward to calculate; we just need the modular inverses of $1, 2, \cdots, k$. The rightmost term seems complicated, but if we assume we know the values of $B_{k}$, we can see from the indices $k$ and $i-k$ that

$$\begin{aligned} \sum_{k=2}^{i} \frac{B_k}{k!}\frac{i!}{(i-k+1)!}t^{i-k+1} & = i! \cdot \sum_{k=2}^{i} \frac{B_{\color{red}k}}{{\color{red}k}!}\frac{t^{{\color{red}i-k}+1}}{({\color{red}i-k}+1)!} \end{aligned}$$

is actually a convolution, with the $k = 0$ and $k = 1$ terms excluded. So we can easily compute these values for all $i$, provided we know the values of $B_{k}$.

So what are the Bernoulli numbers? The Wikipedia page contains a lot of background, but all we need are the values. Take a look at the definition section. The recursive definition says:

$$B_m^+ = 1 - \sum_{k=0}^{m-1} \binom{m}{k} \frac{B^+_k}{m - k + 1}$$

where the $+$ sign convention is not important, since we are only interested in $m \ge 2$. If we compute the Bernoulli numbers $B_2, B_3, \cdots, B_k$ using this formula, the time complexity is $O(k^2)$, but since the time limit is 6 seconds and this has a really small constant, some people managed to squeeze it into the time limit. However, for the editorialist's sake, we want something faster. The "generating function" section gives a nice formula:

$$\frac{x}{e^x - 1} = \sum_{m=0}^\infty \frac{B^{-}_m x^m}{m!}$$

where, again, the $-$ sign convention is not important, since we are only interested in $m \ge 2$.
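For concreteness, here is a minimal Python sketch (my own illustration, not the author's code) of the $O(k^2)$ recurrence above, working modulo $10^9+7$. It uses the $B^+$ convention, i.e. $B_1 = +1/2$, which does not matter for $m \ge 2$:

```python
MOD = 10**9 + 7

def bernoulli_mod(k, mod=MOD):
    """B_0..B_k (with the B^+ convention, B_1 = +1/2) modulo `mod`,
    via the O(k^2) recurrence B_m = 1 - sum_{j<m} C(m,j) * B_j / (m-j+1)."""
    # modular inverses of 1..k+1 by the standard recurrence
    inv = [0, 1]
    for i in range(2, k + 2):
        inv.append((mod - (mod // i) * inv[mod % i]) % mod)
    B = [1] + [0] * k
    C = [1]  # current row of Pascal's triangle
    for m in range(1, k + 1):
        # extend Pascal's triangle to row m
        C = [1] + [(C[j] + C[j + 1]) % mod for j in range(m - 1)] + [1]
        s = 0
        for j in range(m):
            s = (s + C[j] * B[j] % mod * inv[m - j + 1]) % mod
        B[m] = (1 - s) % mod
    return B
```

The answer only uses $B_2, \ldots, B_k$, so the sign convention on $B_1$ is irrelevant, exactly as the editorial notes.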
This means that if we can compute the power series of the left-hand side up to $x^{k}$, we can easily compute the $B_m$. Since we want the LHS in power series form, let's derive it right now:

$$\begin{aligned} \frac{x}{e^{x} - 1} & = \frac{x}{\sum_{i=0}^{\infty} \frac{x^i}{i!} - 1} \\ & = \frac{x}{\sum_{i=1}^{\infty} \frac{x^i}{i!}} \\ & = \frac{1}{\sum_{i=1}^{\infty} \frac{x^{i-1}}{i!}} \\ & = \frac{1}{\sum_{i=0}^{\infty} \frac{x^i}{(i+1)!}} \\ \end{aligned}$$

So what we need is to invert a power series. Note that we only need the coefficients up to $x^{k}$, which means we are interested in computing the power series of:

$$\frac{1}{\sum_{i=0}^{\color{red}k} \frac{x^i}{(i+1)!}}$$

which is now finite. How should we deal with this? The idea was introduced 4 years ago in Codeforces Round #250 Div.1 E, and you can find a nice explanation in the "There is an interesting algorithm which calculates the inverse of a power series F(z)" part.

Computing $R$

This seems really hard at first sight, but note that we know how to compute the inverse of a power series to as many terms as we want. In the world of power series,

$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$$

as most of us learned in algebra class. Also,

$$\frac{1}{1-ax} = 1 + ax + a^2x^2 + a^3x^3 + \cdots$$

Therefore,

$$\begin{aligned} \sum_{n=1}^{N} \frac{1}{1-a_{n}x} = & \space{} (1 + 1 + \cdots + 1) \\ & + (a_1 + a_2 + \cdots + a_N)x \\ & + (a_1^2 + a_2^2 + \cdots + a_N^2)x^2 \\ & + (a_1^3 + a_2^3 + \cdots + a_N^3)x^3 \\ & + \cdots \end{aligned}$$

So if we can sum up those inverses efficiently, we can extract the coefficients and get the answer. How do we do that? First, write down what we want to compute:

$$\sum_{n=1}^{N} \frac{1}{1-a_{n}x} = \frac{(1-a_{2}x)(1-a_{3}x)\cdots(1-a_{N}x) + (1-a_{1}x)(1-a_{3}x)\cdots(1-a_{N}x) + \cdots + (1-a_{1}x)(1-a_{2}x)\cdots(1-a_{N-1}x)}{(1-a_{1}x)(1-a_{2}x)(1-a_{3}x)\cdots(1-a_{N}x)}$$

The denominator looks simple, so let's start from here.
If we calculate this naively it takes $O(N^2)$ time, so let's improve the time complexity using divide and conquer. It is really straightforward:
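The recursion can be sketched as follows (an illustrative Python version, not the author's code; function names are mine, and the schoolbook multiply stands in for the NTT that the complexity analysis below assumes):

```python
MOD = 10**9 + 7

def poly_mul(p, q):
    """Schoolbook product modulo MOD; replace with an NTT-based multiply
    to get the O(N log^2 N) bound discussed below."""
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] = (r[i + j] + pi * qj) % MOD
    return r

def denominator(a):
    """Coefficients of prod_n (1 - a_n x), by divide and conquer."""
    if len(a) == 1:
        return [1, (-a[0]) % MOD]
    mid = len(a) // 2
    return poly_mul(denominator(a[:mid]), denominator(a[mid:]))
```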
The time complexity of this algorithm satisfies $T(N) = 2T(N/2) + O(N \log N)$, which gives $T(N) = O(N \log^2 N)$ by the Master theorem. Now, it is really tempting to compute the whole sum using divide and conquer too. Let's try:
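This recursion can likewise be sketched (again an illustrative Python version with a naive multiply standing in for NTT; names are mine). Each recursive call returns the sum as a (numerator, denominator) pair, combined via $\frac{p_1}{q_1} + \frac{p_2}{q_2} = \frac{p_1 q_2 + p_2 q_1}{q_1 q_2}$:

```python
MOD = 10**9 + 7

def poly_mul(p, q):
    """Schoolbook product modulo MOD (an NTT would be used in a real solution)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] = (r[i + j] + pi * qj) % MOD
    return r

def poly_add(p, q):
    """Coefficient-wise sum, padding the shorter polynomial with zeros."""
    if len(p) < len(q):
        p, q = q, p
    return [(p[i] + (q[i] if i < len(q) else 0)) % MOD for i in range(len(p))]

def sum_of_fractions(a):
    """(numerator, denominator) of sum_n 1/(1 - a_n x), by divide and conquer."""
    if len(a) == 1:
        return [1], [1, (-a[0]) % MOD]
    mid = len(a) // 2
    n1, d1 = sum_of_fractions(a[:mid])
    n2, d2 = sum_of_fractions(a[mid:])
    # p1/q1 + p2/q2 = (p1*q2 + p2*q1) / (q1*q2)
    return poly_add(poly_mul(n1, d2), poly_mul(n2, d1)), poly_mul(d1, d2)
```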
The time complexity of this algorithm also satisfies $T(N) = 2T(N/2) + O(N \log N)$, which therefore gives $T(N) = O(N \log^2 N)$ by the Master theorem. Note that this algorithm only gives the numerator and denominator in polynomial form, so you additionally have to divide the numerator by the denominator using the inversion algorithm above, in $O(N \log N)$. In conclusion, we managed to compute the answer with several convolutions. The only thing left is how to deal with the modulus $10^9 + 7$. The safest way is to do three NTTs using three different large FFT primes $\approx 10^9$, and then use the Chinese Remainder Theorem to recover the answer. I believe this is not the important part and I'm in a rush, so I will skip it. Author's solution can be found here. Tester's solution can be found here. asked 23 May, 08:29

TL;DR: I need resources on how to multiply and divide polynomials. How do you calculate the product of two polynomials modulo $10^9+7$? Is dividing polynomials even fast enough for these questions, or should I avoid having to divide polynomials as a rule of thumb? Are there methods better than the Newton-Raphson method? If not, what should we take as the seed when using it? How many terms of each polynomial (numerator and denominator) should we take? I had a solution but couldn't code it up because I didn't know how to. Personally, I think my solution is simpler than the one provided, so please take a look :)

My 'solution': Note: I'll be assuming you need the first $O(K)$ terms of two polynomials to compute the first $K$ terms of their quotient.

$g(T, k) = \displaystyle \sum_{i=0}^{T} f(i, k) = \sum_{i=0}^{T} \sum_{j=1}^{N} (a_j+i)^k$

So the exponential generating function for $g(T, k)$ is:

$G(x) = \displaystyle \sum_{k=0}^{\infty} g(T,k) \cdot \frac{x^k}{k!}$
$ = \displaystyle \sum_{k=0}^{\infty} \left( \sum_{i=0}^{T} \sum_{j=1}^{N} (a_j+i)^k \right) \cdot \frac{x^k}{k!} $
$ = \displaystyle \sum_{k=0}^{\infty} \sum_{i=0}^{T} \sum_{j=1}^{N} \left( (a_j+i)^k \cdot \frac{x^k}{k!} \right)$
$ = \displaystyle \sum_{i=0}^{T} \sum_{j=1}^{N} \left( \sum_{k=0}^{\infty} (a_j+i)^k \cdot \frac{x^k}{k!} \right)$
$ = \displaystyle \sum_{i=0}^{T} \sum_{j=1}^{N} e^{(a_j+i)x}$
$ = \displaystyle \sum_{i=0}^{T} \sum_{j=1}^{N} e^{a_jx} \cdot e^{ix} $
$ = \displaystyle \left(\sum_{j=1}^{N}e^{a_j x} \right) \cdot \left(\sum_{i=0}^{T}e^{i x} \right)$
$ G(x) = L(x) \cdot R(x) $

To compute $G(x)$, compute the first $2^{\lceil\log(K+1)\rceil} = O(K)$ terms of $L(x)$ and $R(x)$ and multiply them using FFT.

To calculate $R(x)$: $R(x)$ is the sum of a geometric progression. Using the formula,

$R(x) = \displaystyle \frac{e^{(T+1)x} - 1}{e^x - 1}$

So we can calculate the numerator and denominator polynomials and divide using FFT.
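As an illustration of this step (my own sketch, not the answerer's code), the following Python builds the two coefficient arrays of $\frac{e^{(T+1)x}-1}{e^x-1}$, cancels the common factor $x$, and then divides with a naive $O(K^2)$ series division; an FFT- or Newton-based division would replace the last loop in a real solution:

```python
MOD = 10**9 + 7

def R_series(T, K):
    """First K+1 e.g.f. coefficients of sum_{i=0}^{T} e^{ix},
    computed as (e^{(T+1)x} - 1) / (e^x - 1) with the common factor x cancelled."""
    fact = [1] * (K + 2)
    for i in range(1, K + 2):
        fact[i] = fact[i - 1] * i % MOD
    ifact = [pow(f, MOD - 2, MOD) for f in fact]
    Tp = (T + 1) % MOD
    # after dividing numerator and denominator by x:
    num = [pow(Tp, i + 1, MOD) * ifact[i + 1] % MOD for i in range(K + 1)]
    den = [ifact[i + 1] for i in range(K + 1)]  # den[0] = 1
    # naive long division of power series (FFT/Newton in a real solution)
    r = [0] * (K + 1)
    for n in range(K + 1):
        s = num[n]
        for j in range(1, n + 1):
            s = (s - den[j] * r[n - j]) % MOD
        r[n] = s % MOD
    return r
```

The $k$-th returned coefficient is $\left(\sum_{i=0}^{T} i^k\right)/k!$ modulo $10^9+7$, exactly the e.g.f. coefficient needed above.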
To calculate $L(x)$: This algorithm is probably not the best way to go about it, but it was the only thing I could think of. We will actually compute the ordinary generating function corresponding to $L(x)$ and get the e.g.f. from it in $O(K)$.

First, the o.g.f. corresponding to $L(x)$:

$ = \displaystyle \sum_{i=1}^{N} \sum_{k=0}^{\infty} (a_i \cdot x)^k = \sum_{i=1}^{N} \frac{1}{1-a_i x} $
$ = \displaystyle \sum_{i=1}^{N} \left( 1 + \frac{a_i x}{1-a_i x} \right) = N - x \sum_{i=1}^{N} \frac{a_i}{a_i x - 1} $

From this we get the following algorithm: Compute $\displaystyle P(x) = \prod_{i=1}^{N} (a_i x - 1)$. This can be done in $O(N \log^2(N))$ as discussed here. In short, compute $\displaystyle \prod_{i=1}^{\left\lfloor\frac{N}{2}\right\rfloor} (a_i x - 1)$ and $\displaystyle \prod_{i=\left\lfloor\frac{N}{2}\right\rfloor+1}^{N} (a_i x - 1)$ recursively and multiply using FFT. Truncate or pad with $0$'s as necessary to get the required number of terms (assumed to be $O(K)$).

Take the logarithm of both sides and differentiate to get:

$ \displaystyle \log(P(x)) = \log \left( \prod_{i=1}^{N} (a_i x - 1) \right) = \sum_{i=1}^{N} \log(a_i x - 1) $
$ \displaystyle \frac{d}{dx} \log(P(x)) = \frac{d}{dx} \sum_{i=1}^{N} \log(a_i x - 1) = \sum_{i=1}^{N} \frac{d}{dx} \log(a_i x - 1) $
$ \displaystyle \frac{P'(x)}{P(x)} = \sum_{i=1}^{N} \frac{a_i}{a_i x - 1}$

So the required o.g.f. $\displaystyle = N - x \frac{P'(x)}{P(x)}$.

Complexity:
$O(K) + O(K)$ for precomputing factorials and inverse factorials modulo $10^9+7$
$O(K) + O(K)$ for computing the numerator and denominator of $R(x)$
$C(K)$ for computing $R(x)$ from the numerator and denominator
($C(K)$ is the complexity of dividing two polynomials, which I don't know)
$O(N\log^2(N))$ for computing $P(x)$
$O(N)$ for computing $P'(x)$ from $P(x)$
$C(K)$ for computing the o.g.f.
$O(K)$ for computing $L(x)$ from the o.g.f.
$O(K \log(K))$ for computing $G(x)$
and finally, $O(K)$ to get $g(T,k)$ from $G(x)$ by multiplying the $k^{th}$ coefficient by $k!$

Overall complexity: $O(K + K\log(K) + N\log^2(N) + C(K))$ with high constants. I have no idea if it would fit into 6 seconds or not. NOW, GO BACK TO THE TOP TO MY TL;DR. answered 23 May, 10:37
You are a CS student. Use @mgch  See, there are beautiful uses of goto out there :D
(23 May, 10:52)
You can refer to the link I mentioned in the editorial: http://codeforces.com/blog/entry/12513 (it seems links are hard to distinguish from regular text here).
(23 May, 19:16)
About the modulo part: wasn't my point that we can do three NTTs using three large FFT primes enough?
(23 May, 19:17)
Three primes (~10^9) for NTT are enough! I'm sure!
(23 May, 19:22)
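For reference, the three-prime recombination discussed in these comments can be sketched as follows (my own illustration; the three primes listed are common NTT-friendly primes chosen for the example, and `crt3` is a hypothetical name):

```python
# Common NTT-friendly primes near 1e9, chosen for illustration.
P = [998244353, 1004535809, 985661441]

def crt3(rs, ps=P, target_mod=10**9 + 7):
    """Recover x mod target_mod from the residues of x modulo three
    pairwise-coprime primes, Garner-style: x = r1 + p1*t1 + p1*p2*t2."""
    r1, r2, r3 = rs
    p1, p2, p3 = ps
    t1 = (r2 - r1) * pow(p1, -1, p2) % p2
    t2 = (r3 - r1 - p1 * t1) * pow(p1 * p2, -1, p3) % p3
    x = r1 + p1 * t1 + p1 * p2 * t2   # the true value, < p1*p2*p3
    return x % target_mod
```

Since every convolution coefficient is bounded by roughly $K \cdot (10^9)^2 < p_1 p_2 p_3$, the recovered value is exact before the final reduction.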
@tncks0121 It wasn't clear while writing my comment but I thought on it for some time and now I understand. The method used in the codeforces blog you have mentioned for calculating the inverse of a power series is the NewtonRaphson method. I have some questions on its implementation. When I have calculated R(z) upto 2^i terms, exactly how many terms of F(z) do I need to calculate the next iteration of R? Also, the polynomial I get from this iteration has ~3x2^i terms so do I need to set some terms to 0 so that the resulting polynomial has 2^(i+1) terms?
(23 May, 21:01)
F is fixed, why do you care? The number of terms of the resulting polynomial is not important; the point is that if you had a polynomial with 2^i correct terms, after one iteration you are guaranteed that 2^(i+1) terms are correct. (You should get rid of all terms of degree 2^(i+1) or larger.)
(24 May, 06:49)
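To illustrate the iteration being discussed (a sketch under my own naming, not anyone's submitted code): each doubling step computes $R \leftarrow R\,(2 - F R) \bmod x^{2^{i+1}}$, reading only the first $2^{i+1}$ terms of $F$ and discarding everything of degree $\ge 2^{i+1}$:

```python
MOD = 10**9 + 7

def poly_mul_trunc(p, q, n):
    """Schoolbook product truncated to n terms (NTT in a real solution)."""
    r = [0] * n
    for i, pi in enumerate(p):
        if i >= n:
            break
        for j, qj in enumerate(q):
            if i + j >= n:
                break
            r[i + j] = (r[i + j] + pi * qj) % MOD
    return r

def series_inverse(F, n):
    """First n coefficients of 1/F(x) by Newton iteration; needs F[0] != 0."""
    R = [pow(F[0], MOD - 2, MOD)]   # seed: inverse of the constant term
    m = 1
    while m < n:
        m *= 2
        FR = poly_mul_trunc(F[:m], R, m)        # only 2^(i+1) terms of F matter
        two_minus = [(-c) % MOD for c in FR]    # 2 - F*R, truncated
        two_minus[0] = (two_minus[0] + 2) % MOD
        R = poly_mul_trunc(R, two_minus, m)     # R * (2 - F*R) mod x^m
    return R[:n]
```

This also answers the seeding question above: the iteration is started from the modular inverse of $F$'s constant term.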
Finally got it correct! Thanks a lot for your help. @chemthan Very nice problem. I got to learn so much because of it. Was my expo. gen. func. solution the intended one? Because it simplifies a lot if you do that. @tncks0121 Thanks a lot for your explanations and the great editorial. P.S. If you happen to look at my code, ignore the 3 redundant NTTs, I'll be fixing that soon.
(25 May, 04:27)
