### PROBLEM LINK:

**Author:** Misha Chorniy

**Tester:** Kevin Atienza

**Translators:** Sergey Kulik (Russian), Team VNOI (Vietnamese) and Hu Zecong (Mandarin)

**Editorialist:** Kevin Atienza

### DIFFICULTY:

Hard

### PREREQUISITES:

Range trees, segment trees, functional data structures, persistent segment trees

### PROBLEM:

You are given an array A[1\ldots N] of positive integers. You have to answer Q queries on it of the following type:

- Given l, r and k, let S denote the sorted list of
**distinct**elements of A[l\ldots r]. What is the $k$th number in this list? If it doesnâ€™t exist, output -1.

Additionally, each query partly depends on the answer to the previous query, so youâ€™re forced to answer the queries in the given order (except in subtask 3).

### QUICK EXPLANATION:

First, normalize and compress the values of A[1\ldots N], so they are in \{0, 1, \ldots, N-1\}.

The answer for a query is equivalent to **the smallest V such that A[l\ldots r] has at least k distinct values**, so we can find it using binary search. We just need to answer the following question:

**Given l, r, V, how many distinct elements in A[l\ldots r] are \le V**? This can be turned into a geometry query. First, we define N points \{(x_i,y_i,z_i) : 1 \le i \le N\}, where (x_i,y_i,z_i) is defined as follows: Suppose A* = v. Also, let j be the index of the previous occurrence of v, i.e. A[j] = v and j < i is the largest such value. If no such j exists, set j = 0. Then (x_i,y_i,z_i) = (v,j,i).

The answer to the query above is now the number of points (x_i,y_i,z_i) such that x_i \le V, y_i < l and l \le z_i \le r. Thus, we now have standard geometric range queries! Using **3D range trees**, we can preprocess the points in O(N \log^3 N) time and each query can be answered in O(\log^3 N) time, but with binary search it becomes O(\log^4 N) which might be too slow. Instead, we can use **fractional cascading** or **functional/persistent segment trees** to reduce these to O(N \log^2 N) and O(\log^3 N), respectively, and with some clever tree traversals we can even reduce query time to O(\log^2 N).

### EXPLANATION:

# Subtask 1

In the first subtask, we have Q imes N \le 10^7, so solutions that answer each query in O(N) time, or even O(N \log N) time, is acceptable. In this case, the straightforward approach works:

- Compute the sorted version of A[l\ldots r].
- Remove the duplicates.
- Find the $k$th number in the list, or if the list has less than k elements, output -1.

This takes O(N \log N) time in the worst case, so doing this Q times takes O(QN\log N) time. With a reasonable implementation, this will pass the time limit of the first subtask.

# Subtask 2

In the second subtask, we no longer have Q imes N \le 10^7, but instead we have the additional restriction k = 1. It means that we are finding the first element in the sorted list of distinct elements of A[l\ldots r]. Now, regardless of whether there are duplicates or not in A[l\ldots r], this is clearly the same as the **minimum element in A[l\ldots r]**. In other words, the query reduces to a standard **range minimum query**, which has lots of possible solutions! Two fast approaches to this are the following:

**Approach 1**: Construct a **segment tree** on top of the array, so that each node represents some subarray of A and also contains the minimum among all elements in this subarray. Constructing this tree takes O(N) time, and afterwards, each range minimum query can be answered in O(\log N) time.

**Approach 2**: Construct a **table of minimums**, T[1\ldots N][0 \ldots \log_2 N], where T*[k] is the minimum element in the subarray A[i\ldots i+2^k-1]. This table can be constructed in O(N \log N) time using dynamic programming. Afterwards, each range minimum query can be answered in O(1) time: The minimum value in A[l\ldots r] is \min(T[l][k], T[r-2^k+1][k]), where 2^k is the largest power of two \le r - l + 1.

Asymptotically faster solutions exist, but they are much complicated, and in any case, this solution doesnâ€™t apply to other subtasks, so there isnâ€™t much use in them. You can read about faster solutions here.

# Subtask 3 & 4

Now, letâ€™s move on to answering the hardest subtask, which is the original problem without any additional restrictions. Note that subtask 3 is just an **offline version** of the problem, which may permit some simpler solutions, but we wonâ€™t discuss them here.

## Binary search

The first step is realizing that a query can be converted into a form that is suitable for **binary search**. Each query is equivalent to the following:

**Find the lowest value V such that the number of distinct elements of A[l\ldots r] that are \le V is at least k.**

This is certainly solvable with binary search for the value V, as long as we can answer the following question efficiently:

**Given l, r and V, how many distinct elements of A[l\ldots r] are \le V?**

So letâ€™s focus on this query, which now looks more like a **geometric range query**. Unfortunately, the word â€śdistinctâ€ť prevents us from converting this problem directly into a geometric query. Without â€śdistinctâ€ť, one fast solution for this problem would be:

- For preprocessing: for each 1 \le i \le N, add the point (x_i,y_i) = (A*,i).
- For each query (l,r,V): count the number of points in the rectangle [0,V] imes [l,r].

Thus we have reduced each query to 2D geometric queries, for which there are some well-known solutions, for example using **2D range trees**.

But since we only need to count distinct elements, we need to be more creative with our query. The solution above counts duplicate entries multiple times, so we must find a solution that doesnâ€™t do that. One way is to only count the *first occurrence* of each distinct value in the range A[l\ldots r].

How do we accomplish this? Well, for a particular value v, suppose A* = v for some l \le i \le r. For A* to be the first occurrence of v in A, the following additional condition must hold true:

**If j is the largest index < i such that A[j] = v, then j < l.** (If A* is the first occurrence of v in the *whole* array, we can set j = 0.)

Thus, we can now refine our geometric query above to the following:

- For preprocessing: for each 1 \le i \le N, let j be the largest index < i such that A[j] = A*, or j = 0 if no such index exists. Then add the point (x_i,y_i,z_i) = (A*,j,i).
- For each query (l,r,V): count the number of points in the 3D â€śrectangleâ€ť [0,V] imes [0,l-1] imes [l,r].

Now we have properly converted the query into a 3D geometric range query!

Another thing: due to the binary search, whatever complexity we get in solving this problem will get multiplied by O(\log A_{\max}) time. We can reduce this to just O(\log N) by **compressing the values of A to the range [0\ldots N-1]**. This preprocessing step can be done in O(N \log N) time, and when printing the answer we just need to remember to revert back to the original list of values.

## Geometric queries

Our goal is to answer 3D range queries. Specifically, given N points \{(x_i,y_i,z_i) : 1 \le i \le N \}, answer the following kind of query:

**How many points are in the range [a,b] imes [c,d] imes [e,f]?**

The usual way of solving k-dimensional range queries is with k-dimensional **range trees**. You can think of a 1D range tree as a simple segment tree, and a k-dimensional range tree as a balanced tree of (k-1)-dimensional range trees. Such a structure on N points can be constructed in O(N \log^k N) time, and each k-dimensional â€śrectangleâ€ť query can be answered in O(\log^k N) time, by using the usual segment tree recursion at each dimension of the tree.

The following is a pseudocode of a 2D range tree:

```
// 1D tree
// this is just a normal segment tree
class Tree1D:
constructor(Y, i, j):
// Y is the sorted list of values
// Y[i..j] is the relevant range
this.i = Y*
this.j = Y[j]
if i == j:
// this is a leaf
this.count = 1
this.l = this.r = NULL
else:
k = (i + j) / 2
this.l = new Tree1D(Y, i, k)
this.r = new Tree1D(Y, k+1, j)
this.count = this.l.count + this.r.count
range_query(a, b):
// count the number of elements in range [a,b]
if a <= this.i and this.j <= b:
return this.count
else if this.j < a or b < this.i:
return 0
else:
return this.l.range_query(a, b) + this.r.range_query(a, b)
// 2D tree
class Tree2D:
constructor(P, i, j):
// P is the list of points, sorted on "x"
// P[i..j] is the relevant range
this.i = P*.x
this.j = P[j].x
Y = sorted list of elements of [P*.y, P[i+1].y, P[i+2].y, ..., P[j].y]
this.tree = new Tree1D(Y, 0, Y.length-1)
if i == j:
this.l = this.r = NULL
else:
k = (i + j) / 2
this.l = new Tree2D(P, i, k)
this.r = new Tree2D(P, k, j)
range_query(a, b, c, d):
// count the number of points in the rectangle [a,b] x [c,d]
if a <= this.i and this.j <= b:
return this.tree.range_query(c, d)
else if this.j < a or b < this.i:
return 0
else:
return this.l.range_query(a, b, c, d) + this.r.range_query(a, b, c, d)
```

Notice that both classes have very similar implementations, but each node for Tree2D contains a Tree1D which handles the â€śyâ€ť values. You can probably figure out how to construct 3D (or higher-dimensional) range trees from this.

Also, this construction takes O(N \log^2 N) time because of the sorting. Afterwards, each `range_query`

call runs in O(\log^2 N) time.

Unfortunately, in our case, we will need 3D range tree, and each query takes O(\log^3 N) time. And because weâ€™re doing a binary search on top of that, each query for the original problem takes O(\log^4 N) time, which is probably too slow to pass the time limit.

Instead, we will describe two techniques to improve the running time of each 3D range query.

## Technique 1: Fractional cascading

We can use **fractional cascading** to reduce a 3D range query from O(\log^3 N) to O(\log^2 N). We can illustrate this idea using 2D range trees.

We can think of a 2D range tree as a segment tree on the x coordinate, where each node also contains a segment tree, this time on the y coordinate. The first step is to replace the segment trees on y with simple **sorted arrays**, and replace range queries with simple *binary searches*.

Now this doesnâ€™t reduce the asymptotic running time, but it helps reduce the complexity of our structure. For example, we can now construct a 2D range tree in O(N \log N) time instead of O(N \log^2 N) by using a merge-sort-like algorithm; Suppose we want to construct a 2D range tree over the points \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}, and suppose this is sorted by x and then by y. Then:

```
class RangeTree2D:
constructor(i, j, xi, xj, y_list, left, right):
this.i = i
this.j = j
this.xi = xi
this.xj = xj
this.y_list = y_list
this.left = left
this.right = right
def construct(P, i, j):
// constructs the the tree from the i'th to the j'th point
if i == j:
return new RangeTree2D(i, i, P*.x, P*.x, [P*.y], NULL, NULL)
else:
k = (i + j) / 2
left = construct(P, i, k)
right = construct(P, k+1, j)
y_list = merge(left.y_list, right.y_list) // merge sort's "merge"
return new RangeTree2D(i, j, P*.x, P[j].x, y_list, left, right)
```

Notice that instead of sorting `y_list`

per node, we just â€śmergeâ€ť the `y_list`

s of the two children in linear time, because theyâ€™re already sorted.

Now, here comes the â€śfractional cascadingâ€ť part. The reason each range query runs in O(\log^2 N) time is that we perform a binary search over every node we visit. But we can speed this up by noticing that we are binary searching for the same value over and over again, and that we are only visiting nodes in a single path from the root to the leaf! The idea is to use this fact so that after a single binary search at the root, **we donâ€™t have to perform another binary search again**.

We do this by setting up lots of pointers. For each node, and for each i from 0 to `y_list.length-1`

, we keep track of two values.

- The largest index j such that
`left.y_list[j] <= y_list*`

. - The largest index j such that
`right.y_list[j] <= y_list*`

.

Using these pointers, once we binary search for a value at the root, we donâ€™t have to binary search for this value at the childâ€”we just follow the pointer! This allows us to reduce range query complexity to O(\log N).

Now, I admit that I skimmed over some details, but I hope you at least get the idea behind.

By performing this technique for every 2D range tree inside our 3D range tree, our range query reduces now to O(\log^2 N).

## Technique 2: Functional data structures

To improve the running time, we will try to use a different approach, via **functional data structures**. Functional data structures are data structures that are implemented under the **functional programming** paradigm. One of the key characteristics of functional programming is that functions donâ€™t have â€śside effectsâ€ť, and variables donâ€™t change once assigned. (In fact, the term â€śvariableâ€ť becomes a misnomer.) Data structures built under this paradigm are consequently **persistent**, i.e., **older versions of the data structure are still fully available.**

To write our structures under this paradigm, we need only ensure that we keep our functions from having side effects, like changing some global variables or attributes. For example, to implement a functional 1D segment tree, instead of updating the attributes when we perform an update, we simply **fully copy the nodes that have been updated**. On an update, this makes us copy O(\log N) nodes, but the upside is that **we still have the old segment tree fully available.** Note that the running time of an update and query is still O(\log N)!

Once again, weâ€™ll illustrate with 2D range trees. Suppose we want to make a 2D range tree over \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}, and assume this is sorted in x, and then in y.

Each 2D range query counts the number of points some rectangle [x_1,x_2] imes [y_1,y_2]. If all coordinates are nonnegative, this is equal to the number of points in [0,x_2] imes [y_1,y_2] minus the number of points in [0,x_1-1] imes[y_1,y_2], so we can just assume that x_1 = 0.

If the queries were **offline**, i.e. we know all queries beforehand, then we can answer all query rectangles in increasing order of x_2. The way we do this is to maintain a segment tree on the $y$s, initially empty. First, we sort all queries and points according to x, and then process them in that order:

- For every point (x_i,y_i), add y_i to the segment tree.
- For every query rectangle [0,x_2] imes [y_1,y_2], count the number of values in the segment tree in the range [y_1,y_2].

In other words, weâ€™re effectively turning the x coordinate into a â€śtimeâ€ť dimension. Itâ€™s not hard to see why this works.

Unfortunately, the queries are not offline, because some queries are dependent on the result of previous queries (note that weâ€™re performing binary search). The solution to this is to use a functional/persistent segment tree to construct all possible versions of the segment tree, by inserting the (x_i,y_i) points in order. Then, for a query rectangle [0,x_2] imes [y_1,y_2], we simply find the version of the segment tree corresponding to x_2, and query that segment tree!

Using this approach for our 2D range tree makes each query O(\log N), and for a 3D range tree, each range query O(\log^2 N).

## Refined â€śbinary searchâ€ť

The techniques above allow us to reduce the running time of a 3D range query from O(\log^3 N) to O(\log^2 N), but with binary search, our running time is still O(\log^3 N), which might still be too slow. Our goal is to reduce this further.

In fact, we can do this by performing a more clever version of our binary search! The key insight is that **we are binary searching over the first coordinate, x**: Remember that we are finding the lowest value V such that [0,V] imes [0,l-1] imes [l,r] has at least k points.

Our original approach is the following:

```
// tree3d is our 3d range tree
def find_answer(l,r,k):
V_low = -1
V_high = N
while V_high - V_low > 1:
V_mid = (V_low + V_high) / 2
count = tree3d.range_query_3d(0,V_mid, 0,l-1, l,r)
if count >= k:
V_high = V_mid
else:
V_low = V_mid
return V_high
```

which is just a normal binary search. However, we can use the structure itself as our guide for binary search! Hereâ€™s an illustration:

```
// here, ".tree2d" is the underlying 2d range tree of a 3d range tree
def find_answer(l,r,k):
curr = tree3d
while curr.xi != curr.xj:
count = curr.left.tree2d.range_query_2d(0,l-1, l,r)
if count >= k:
curr = curr.left
else:
curr = curr.right
k -= count
return curr.xi
```

To help you understand this better, this is the same sort of technique that allows us to find the $k$th smallest value in a normal binary search tree, as long as we keep track of the size of each subtree at each node.

Since each 2D range query runs in O(\log N) time, and we perform O(\log N) of those queries, the overall running time reduces to O(\log^2 N) time, which is good enough!

### Time Complexity:

O((N + Q)\log^2 N)