PROBLEM LINK:
Author: Sergey Nagin
Tester: Istvan Nagy
Editorialist: Xellos
DIFFICULTY:
Mediumhard
PREREQUISITES:
Modular inverses
PROBLEM:
You’re given an array A. Find the number of its good permutations  in which \texttt{A[i]}+D > \texttt{A[i+1]} for all relevant i. Print this number after each of Q operations of the type "set \texttt{A[i] :=p}".
SHORT EXPLANATION:
The answer for one query can be found by counting the number of elements of A in certain intervals and multiplying those numbers. The intervals have length O(D), so it’s possible to recompute the number of elements in only O(D) of them when one element is changed. If we know how O(D) factors in the expression for the answer change, we can recompute the answer in O(D) per query (with precomputed modular inverses).
EXPLANATION:
solving one query
First of all, we should find some kind of formula for the answer  it needs to be something sufficiently simple, since we don’t have much time per query.
Let’s take some good permutation of A and look at the largest element \texttt{A[g]} in it (if there’s more than one, it doesn’t matter which). If \texttt{A[g]} isn’t the last element of \texttt{A}, the condition \texttt{A[g]}+D > \texttt{A[g+1]} holds automatically. In addition, if A[g] isn’t the first or last element of \texttt{A}, then \texttt{A[g1]}+D > \texttt{A[g+1]}. What it means is that if we remove \texttt{A[g]}, we’ll get a good array again. On the other hand, if we had a good array without \texttt{A[g]} and added \texttt{A[g]} to it at the start or after an element that’s > \texttt{A[g]}D, all conditions will be satisfied again (there can only be two new conditions with \texttt{A[g]} and they’ll both be satisfied) and we get a good array with \texttt{A[g]}.
This means that the number of good permutations containing the greatest element \texttt{A[g]} of \texttt{A} is the number of good permutations without that element, multiplied by the number of elements of \texttt{A} which are \ge \texttt{A[g]}D+1 and “smaller” than \texttt{A[g]}. If the elements of \texttt{A} were distinct, the answer would be the product numbers of elements of \texttt{A} in the ranges [\texttt{A[i]}D+1,\texttt{A[i]}) over all i.
We need to watch out for the order of elements in \texttt{A}  there can be equal elements, but we have to give them a unique order; “smaller” is meant as “earlier in this order”. We could view building a permutation as adding the elements of \texttt{A} in this order to an empty array, where each element goes at the beginning of the array or right after a sufficiently large element; for above described reasons, the permutation needs to be good after each step.
One way to deal with equal elements is using precomputed factorials: if there are x elements in the range [aD+1,a) and y elements \texttt{A[i]} equal to a, then we need to multiply the number of good permutations by (x+1)\cdot(x+2)\cdot..\cdot(x+y)=(x+y)!/x!  the first factor is for the “smallest” of elements equal to a, the second term is for the “second smallest” one etc. Note that if y=0, this product is equal to 1, which means there’s no effect on the answer. This way, we don’t have to focus on numbers which actually appear in \texttt{A} and just take it over the fixed set S_I of all numbers on the input, where S_I=O(N+M). It even works for K=1, when x=0 all the time (with 0!=1).
If we note the number of occurences of each number a \in S_I in an array \texttt{occ} (of course, it’s necessary to first read the full input and compute this set, and also to map the numbers from the input to small enough integers in the same order; the inverse mapping can also be stored in an array \texttt{val}, i.e. \texttt{val[i]} occurs \texttt{occ[i]} times in \texttt{A}), then ys are elements of \texttt{occ} and xs can be computed in O(1) time using prefix sums of \texttt{occ}, so the naive time per query is O(N+M).
solving all queries
Let’s look at a replacement operation as two operations, removing an element of \texttt{A} and adding a new one. Naturally, the values in \texttt{occ} can be updated in O(1); specifically, it’s decrementing or incrementing one element. We need to focus on recomputing xs quickly.
The key optimisation is that since xs are taken from small enough intervals, not many of them will change if only one element of \texttt{occ} changes. In particular, when incrementing \texttt{occ[i]}, we have to increment all x(j) such that j > i (so \texttt{val[j]} > \texttt{val[i]}) and \texttt{val[j]} \le \texttt{val[i]}+K1. And since \texttt{val[j]} \ge \texttt{val[i]}+ji, we only need to do that for j \le i+K.
That makes O(K) values of x to update. How to update the answers when some x changes? We need to deal with dividing by factorials; there are two options:

use the fact that it’s just incrementing and \frac{(x+1+y)!}{(x+1)!}=\frac{(x+y)!}{x!}\frac{x+1+y}{x+1}, so we need to multiply the answer by \frac{x+1+y}{x+1} (a similar formula for decrementing x is left as an easy excercise); since x+1 is positive and much smaller than the prime modulo, we can multiply by its modular inverse (link, we can precompute them first) instead of dividing

just precompute enough factorials (up to N! is enough) and their modular inverses, then divide the answer by (x+y)!/x! and multiply by (x+1+y)!/(x+1)!
In total, we have O(KM) operations when processing queries, some O((N+M)\log{(N+M)}) compression of values on the input into \texttt{val} and \texttt{occ}, and O(N\log{(mod)}) precomputation; the limiting factor in the total time complexity is O(KM), which seems rather big, but since there are few actual operations, it has a good enough constant and fits into the time limit. Memory: O(N+M).