PROBLEM LINKSAuthor: Constantine Sokol Note: All computations in this editorial are modulo M unless explicitly specified. DIFFICULTY:easymedium PREREQUISITES:Basic Number Theory, Modular Arithmetic, state transitions using matrices, Fast Modular Exponentiation, inclusionexclusion principle, Modular Multiplication using just addition (for subtask 4) HINTConsider the (K+2) length column vectors Using this matrix M we can find 1+2^{K}+3^{K}+...+N^{K} in O(K^{3} * log N) time. But this is not what we want we just want the summation of Kth powers of numbers which are co prime to N. Using prime factorization of N and by using inclusion exclusion principle along with the matrix multiplication we established earlier we can find the desired quantity. EXPLANATIONSubtask 1 and 2:
A simple algorithm will pass these two sub tasks.
output sum For subtask2 if you store all your values in 32bit integers you are likely to run into overflow issues and hence will output wrong values. To be on safe side store all your values in 64bit integers. Subtask 3 (solves subtask1 and subtask2 also) for r < K+1 this is simply (n+1)^{r} = R[0]*1+R[1]*n+R[2]*n^{r}+...+R[r]*n^{r} +....+R[K]*n^{K} + R[K+1]*(1+2^{K}+...+n^{K}) From the binomial theorem the left hand side is (1+n)^{r} = C(r,0)1 + C(r,1)n + C(r,2)n^{2} + … + C(r,r)n^{r} So by simply comparing the coefficients we have R[i] = C(r,i) for i<=r and R[i] = 0 for i > r for r = K+1 we have 1+2^{K}+...+n^{K}+(n+1)^{K} = R[0]*1+R[1]*n+R[2]*n^{2}+...+R[r]*n^{r} +....+R[K]*n^{K} + R[K+1]*(1+2^{K}+...+n^{K}) Setting R[K+1] = 1 we have (n+1)^{K}= R[0]*1+R[1]*n+R[2]*n^{2}+...+R[r]*n^{r} +....+R[K]*n^{K} which is pretty much same as the sum for the previous row. Hence we can take all these coefficients from the previous row. So we have completed our matrix M now. Let us have a look at our matrix for K=4
Note that the first K+1 rows are just the pascal triangle. This matrix can be computed in O(K^{2}) time. Our initial vector V will be [1, 0, 0, …, 0]^{T} If we want 1^{K}+2^{K}+...+N^{K} all we need is to find the last entry of the vector V’ = (M^{N})*V Since V is all 0’s except for a 1 at the first position the value is simply the first entry in the last row of (M^{N}) You can compute (M^{N}) in O(K^{3} * log N) using fast exponentiation. If you are not aware of these techniques have a look at http://en.wikipedia.org/wiki/Exponentiation_by_squaring and http://en.wikipedia.org/wiki/Modular_exponentiation#Righttoleft_binary_method. This is job almost half done. let us define duperFunction(N, K) = sum (i^{K}) over all i <= N and gcd(i,N) != 1 We have superFunction(N, K) = (1^{K} + 2^{K} + ….. + N^{K})  duperFunction(N,K) Now let us look what duperFunction(N,K) looks like if N is a prime then duperFunction(N,K) = N^{K} as every other number < N will have gcd of 1 with N. if N is a product of two primes p1, p2 then all numbers which are divisible by p1 or p2 or both will have gcd greater than 1 with N. All the numbers which are <= N and divisible by p1 are p1, 2*p1, 3*p1,...floor(N/p1)*p1 All the numbers which are <= N and divisible by p2 are p2, 2*p2, 3*p2,...floor(N/p2)*p2 But if we raise all these to the Kth power and add them up this will not give us the desired value of duperFunction because few numbers like p1*p2 will be present in both the lists. We can use inclusion exclusion principle here to find our duperFunction. duperFunction(N,K) = f(p1) + f(p2)  f(p1*p2) where f(X) = X^{K} + (2X)^{K} + … + (floor(N/X)X)^{K} = X^{K}(1 + 2^{K} + … + (floor(N/X))^{K}) We can compute X^{K} using fast exponentiation and we already know how to compute the second part of the product. Hence we can compute the duperFunction and hence the superFunction. Since the inclusion exclusion principle extends to sets of arbitrary size we can simply compute the set of all prime factors of N and use inclusion exclusion principle on this set in a similar fashion we did for two primes.
Check http://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle if you have not too familiar with inclusion exclusion principle
if the set of primes is p1,p2,p3,..,ps then You could compute above summation by simply iterating over all subsets of { p1,p2,p3,..,ps}. Since N is atmax 10^{12} we will not have more than 11 prime factors and hence you will need to evaluate the function f for max 2^{11} values. You can compute the prime factors of N in O(sqrt(N)) time. So the time complexity of the solution will be O(sqrt(N) + 2^{s} * K^{3} * log N) Since the value of M is 10^{9} at max if are careful enough using 64bit integer values you will never overflow. Subtask4: (solves all the subtasks) Though the constraint on K is little bit higher it looks like the previous solution solves this subtask as well. But it will most likely not for the following reason. Since you no longer use primitive multiplication the overhead of our custom multiplication is pretty high as we perform it as many as O(2^{s} * K^{3} * log N) times. So we will optimize the solution a bit further When we compute M we also compute iteratively M^{2}, M^{4}, M^{8}, …., M^{(240)} and keep them stored. Since 2^{40} > 10^{12} we need powers of two only till here. Recall our vector matrix definition once again. V’ = (M^{N}) * V and V’[K+1] gives the value for (1^{K}+...+N^{K}); If you can decompose M^{N} into product of the form M^{(2i)} (which we have precomputed) then due to associativity of matrix multiplication we can just multiply the vector with one of the matrices and get a new vector and keep multiplying this vector with the rest of the matrices to get a new vector each time to ultimately get our V’. A small example if you want to compute (M^{13})*V you can compute it in the following way (M^{13})*V = (M^{8}) * (M^{4}) * (M^{1})*V = M^{8} * (M^{4} * (M*V)) Since there can logN such decompositions and matrix vector multiplication can be done in O(K^{2}), we can perform this task in O(2^{s} * K^{2} * log N). The overall time complexity becomes O(K^{3} * log N + sqrt(N) + 2^{11} * K^{2} * log N). The first K^{3} * logN factor is for precomputing. Note: You could actually compute (1^{K} + 2^{K} + 3^{K} + … + N^{K}) using the formulas given at http://mathworld.wolfram.com/PowerSum.html But since our M is not a prime number and the formula involves division it will be pretty tedious to compute these formulas modulo M. But for K<=3 you can easily use these formulas. So you can solve subtask3 without actually using our Matrix Multiplication method. SOLUTIONSSetter's Solution: SFUNC.cpp
This question is marked "community wiki".
asked 29 Dec '13, 15:38

awesome editorial!!!