PROBLEM LINK:Author: Misha Chorniy DIFFICULTY:Super Hard PREREQUISITES:Segment trees, Range trees, Selfbalancing binary trees, 2D Fenwick trees, sqrt decomposition PROBLEM:Given an array $A$ consisting of $N$ positive integers, you have to process $Q$ queries. There are five types of queries:
QUICK EXPLANATION:
EXPLANATION:There are multiple parts to the solution to this problem. We first describe a solution for a simplified version of the problem, and then gradually support additional types of queries. No updatesWhen answering "range query" problems like this, it's sometimes helpful to ignore the updates first, i.e. try to answer the problem assuming there are no update queries. In our case, let's assume that there are only type 1 and 5 queries. It's clear that type 5 queries are easier, so let's handle them first. Given a subrange $[l,r]$ of an array $A$, how many distinct elements are there? The answer could be anywhere from $1$ to $rl+1$, depending on the duplicated elements in the subrange. We wish to count each distinct value only once, i.e. any value that appears again in the subrange must not be counted again. When is a particular value $A_i$ counted? Obviously, we must have $l \le i \le r$. But in order not to doublecount, we must ensure that this value is a new one, i.e. it hasn't appeared before. In other words, let $p_i$ be the largest index such that $p_i < i$ and $A_{p_i} = A_i$, or $p_i = 0$ if such a $p_i$ doesn't exist. Then it must be the case that $p_i < l$. Extending this to every element in the subrange, we see that the number of distinct elements in $A[l\ldots r]$ is the number of indices $i$ such that:
If for every $1 \le i \le N$, we plot a point $(i,p_i)$ in the 2D plane, then this is simply the number of points in the rectangle $[l,r]\times [0,l1]$. But this is the standard problem of 2D range queries, which are solvable with sqrtdecomposition or range trees. (we'll discuss these techniques in more detail later) This is great, because we can now answer type 5 queries! Now, what about type 1 queries? At first glance, the formula seems convoluted, and there seems no way to simplify the sum. However, upon closer inspection, we can actually understand that this is simply the sum of all products of triplets of distinct elements in the range. Such a sum can be simplified into the following: $$\frac{S_1^3  3S_1S_2 + 2S_3}{6}$$ where $S_i$ is the sum of the $i$th powers of all distinct elements in the range. To understand this, note that $S_1^3$ is the sum of products of all triples, but it includes every triple six times (once for each permutation), and includes some bad triples with duplicate values. The remaining terms are simply adjustments to accommodate for these bad triples. For example, the sum $S_1S_2$ is the sum of products of all triples containing at least one duplicated value, and $S_3$ is the sum of products of all triples containing a triplicate value. Therefore, to answer a type 1 query, we must be able to answer range queries of the type: "what is the sum of the $e$th powers of all distinct elements in the range?" where $e$ can be anything from $1$ to $3$. But we can modify the above approach to answer this! (in fact, the previous approach solves this in the special case $e = 0$) For a fixed $e$, we assign the value $A_i^e$ in each point $(i,p_i)$, and now the query becomes: compute the sum of values in the rectangle $[l,r]\times [0,l1]$. The techniques mentioned above will still work with simple modifications :) Point updatesNow let's tackle point updates, i.e. queries of type 2. Suppose we want to set $A_i$ to the value $v$, and that its previous value is $v_{\text{old}}$. To do so, we need to look deeper into the structure of the points $(i,p_i)$ in the 2D plane. Fix a distinct value $v$, and let $i_1 < i_2 < \ldots < i_k$ be the distinct indices $i$ such that $A_i = v$ (let's call it $v$'s index sequence). Then we have the following points: $(i_1,0), (i_2,i_1), (i_3,i_2), \ldots, (i_k,i_{k1})$. When setting the value of $A_i$ from $v_{\text{old}}$ to $v$, we need to change some points. First, we need to find the index of $i$ in the index sequence of $v_{\text{old}}$, say $i_r$. Then when the value of $A_{i_r}$ is replaced, the points $(i_r,i_{r1})$ and $(i_{r+1},i_r)$ will have to be removed. Furthermore, we also need to add the point $(i_{r+1},i_{r1})$. Now, when setting the value of $A_i$ to $v$, we need to find out where the index $i$ will be placed in the index sequence of $v$. Let's say $i_{r1} < i < i_r$ for some $r$. Then we need to remove the point $(i_r, i_{r1})$ and add the points $(i, i_{r1})$ and $(i_r, i)$. This means that there are only at most $6$ points that need to be updated: $3$ removed, $3$ added (maybe even less when the index $i$ is the first and last in the corresponding index sequence). Our sqrtdecomposition or 2D range tree approaches can be updated to accommodate these kinds of updates (which we'll also discuss later below). There's also the question of how to maintain the index sequences of all distinct values. But this can be done by maintaining a binary search tree for every distinct value. DeletionsNow, let's try handling deletions. Suppose we want to remove $A_i$ from the sequence. If we simply remove this value from the sequence, then the indices of all later elements of the sequence will need to be adjusted (for example $A_{i+1}$ becomes $A_i$, etc.). But this means updating up to $O(N)$ points, which would be too slow to pass the time limit. Thus, we need to find another way. The key idea is to simply not remove $A_i$ from the sequence, and instead replace it with a dummy "null" value. We consider this "null" value to be just like any other value, but the difference is that it does not have its own index sequence (and thus no points). Using this, we don't have to update the indices of the later elements of the sequence! Unfortunately, this means that the indices of range queries $[l,r]$ will need to be updated (because some of the indices are now null values). Specifically, each index $i$ needs to be replaced with a new index, which we denote by $f(i)$, which is the index of the $i$th nonnull value of the sequence. Fortunately, such an index can be computed by maintaining a segment tree structure for the null values. Specifically, we construct a tree with $N$ leaves, where the $i$th leaf contains a $1$ if $A_i$ is not null, and $0$ otherwise. Each internal node contains the sum of the values in the leaves in the subtree rooted at the node. Replacing the $i$th index of the sequence with a null (or nonnull) value corresponds to replacing the value of the $i$th leaf in this segment tree, and updating the corresponding internal nodes. To compute $f(i)$ given $i$, one only needs a single traversal down the tree to the appropriate leaf (utilizing the partial sums in the internal nodes). Both operations run in $O(\log N)$ time, which means this structure is not the bottleneck of our solution. InsertionsInsertions have the same problems as deletions when done naïvely: if we insert a value at a certain position, we need to update the indices of the later elements, and potentially update up to $O(N)$ points. The key is to implement the insertions so that:
For simplicity, instead of starting with an array of size $O(N)$, we can simply start with an empty array, and perform $O(N)$ insertions before the $Q$ queries (so now we have $N+Q$ queries, where the first $N$ are insertions). We preallocate an array of size $N+Q$, initially all null, and the key is to determine for each insertion the correct index where it should be placed so that future insertions can be accommodated without requiring a reallocation of the array. These correct indices can be precomputed by first simulating all insertions and deletions with a selfbalancing segment tree (for example, an AVL tree modified to become a segment tree), with the following guidelines:
This precomputation runs in $O((N+Q)\log(N+Q))$. After that, we simply allocate an array of size $N+Q$, initially all null, and no more reallocation is required during the processing of the actual queries! Rectangle queriesFinally, all that remains is to actually perform the rectangle queries efficiently. We'll clarify our task here: starting with a set of points (initially empty), we need to perform the following operations:
There are multiple (standard) approaches for this problem. 2D range tree Note that this can be done with an $O(N' \log N')$ preprocessing and $O(\log^2 N')$ query time (where $N' := O(N + Q)$ is the number of operations) by using a 2D range tree fitted for updates: Since a 2D range tree is just a tree of segment trees, we use segment trees that support point insertions/deletions and range sums. In each leaf of a segment tree, we store four values, $(1,v,v^2,v^3)$, and in each internal node, we store four values: the sums of the values of each node for every exponent. sqrt decomposition You can also use the all mighty sqrt decomposition by noticing that each inserted point $(x,y)$ satisfies $0 \le x, y \le N$. Therefore, we can instead think of the space $[0,N]\times [0,N]$ as a 2D matrix. Now, we try to decompose this matrix into blocks of dimensions $K\times K$, so that there are $N/K \times N/K$ blocks (for some parameter $K$ which we'll determine later). We also store, for each exponent $0\le e\le 3$, an $(N/K+1)\times (N/K+1)$ table of 2D partial sums: $S_e(i,j)$ is the sum of all values from the first $i$ rows and first $j$ columns of blocks. Such a table can be computed in $O(N + (N/K)^2)$ time. To implement the operations:
Unfortunately, $L$ can be up to $N+Q$, so an update can be very slow. The trick is to rebuild the table $S_e$ every $T$ operations (for some parameter $T$). The table can be rebuilt in $O(N + (N/K)^2)$ time, but the query operation optimizes to $O(K + T)$ time. Now all that remains is to choose the parameters $K$ and $T$. Note that there are $N' = N+Q$ queries, and after every $T$ insertions, we perform an $O(N+(N/K)^2)$ time "rebuild" operation. Therefore, the running time after $N'$ operations is: $$O\left(\frac{N'}{T}\left(N+\left(\frac{N}{K}\right)^2\right) + N'\left(K+T\right)\right) = O\left(N'\left(\frac{1}{T}\left(N+\left(\frac{N}{K}\right)^2\right) + K + T\right)\right)$$ It can be shown that the optimal choice for the values $K$ and $T$ is $K = T = O(\sqrt{N})$ (hence "sqrt decomposition") (why?), so that the running time is $O(N'\sqrt{N}) = O((N+Q)\sqrt{N})$. 2D Fenwick tree Finally you can also use a 2D Fenwick tree to perform the queries in $O(\log^2 N)$ time. The problem is that the memory requirement becomes $O(N^2)$. There are ways to reduce the memory requirement, for example using a hash map for every 1D Fenwick tree (but this has a rather large constant behind the $O$ notation). We leave it to the reader to discover the details on how to optimize this. If you need more hints, feel free to ask. Also, the author's solution uses this approach so you can check that solution too. NotesIf you have some questions, or would like some parts of this editorial in more detail, please feel free to ask in the comments below, and I'll try to accommodate your request :) Time Complexity:$O((N + Q) \sqrt{N})$ or $O((N + Q) \log^2 N)$ AUTHOR'S AND TESTER'S SOLUTIONS:
This question is marked "community wiki".
asked 23 Aug '15, 23:49

I would be nice if you can paste the code with comments and explanation answered 01 Sep '15, 06:28

Finally we have an editorial. It's gonna take a long time to understand. answered 27 Aug '15, 22:21
