MANYLIST - Editorial



Author: Sergey Kulik
Tester: Yanpei Liu
Editorialist: Pawel Kacprzak




Segment tree, sets


Your task is to maintain a list of sets of integers numbered from $1$ to $N$. All sets are initially empty. You have to handle $M$ operations, each of one of the following two types:
  1. exttt{INSERT(L, R, X)}: inserts integer X to all sets numbered from L to R
  2. exttt{QUERY(Q)}: outputs the size of the set numbered Q


For a partial credit, you can pretty easy use the most straightforward approach by just following the process described in the statement. In order to get full credit, for each inserted element $e$, you can use set data structure to store the ranges of sets having $e$. In order to get the number of elements in each set, you can use segment / interval tree to maintain these values.


##Simple solution
If $N$ and $M$ are quite small, like in the first two subtasks, where $N, M \leq 1000$, the problem is very easy to solve.

We can store an array A of N sets, where A* represents a set numbered i. Since all integers, which can be inserted to sets are within the range [1..N], we can store each set as an array, i.e. A*[j] = 1 if and only if integer j is in the set numbered i.

In order to handle a single insert operation, we can iterate through sets numbered from L to R, and insert the requested X into all of them. This operation takes O(N) time.

Handling a query operation is obvious, we just count the number of 1's in the array representation of the requested set.

Since there are M operations to perform, we can achieve O(N \cdot M) time complexity to handle all of them.

The total time complexity of this method is O(N^2 + N \cdot M).

Faster solution

However the above method is sufficient for small inputs, we need a better one to get a full credit here.

Rather than storying, for a fixed set, the information what elements are in this set, we can think the other way around.

Let S[e] be the ordered list of disjoint ranges of sets having element e. For example, if a range [L, R] is in S[e], it means that e belongs to all sets numbered from L to R. We say that a range [A, B] is before a range [C, D] in our order if and only if A < C or A = C and B < D. The disjoint property is very important in the definition, we will use it as the invariant. We are going to implement S[e] as a balanced binary tree in order to be able to handle insert queries. If you are using for example exttt{C++} you can use exttt{std::set< pair < int, int > >} to implement a single S[e].

The second data structure that we are going to use is a segment tree T. In this tree, we will store the information on how many elements are in sets numbered from some L to R. We will use it to update the number of elements in a range of sets and to get the number of elements in a particular set.

Initially, both structures are empty. This means that S[e] is empty for all valid e and all nodes in T are initialized to have 0 elements.

Now, we are going to describe, how we handle both operations. We will start with insert, because it exposes the nature of the problem in a great manner.

Insert operation

Suppose that we are going to insert element $X$ to all sets numbered from $L$ to $R$. The idea of the solution is to find in $S[X]$ the set $Y$ of all overlaping ranges with $[L, R]$. After inserting $X$ to sets numbered from $L$ to $R$, all sets in a range $Y \bigcup [L, R]$ will have element $X$. What we are going to do, is to replace all ranges from $Y$ in $S[X]$ with a single range representing the just mentioned sum of ranges.

In order to do that, for which range A \in Y, we decrease the number of elements in a range A \bigcap [L, R] by one, and delete A from S[X]. At the end, we increase the number of elements in [L, R] by one, and we insert the range Y \bigcup [L, R] to S[X]. Notice that we maintain the number of elements of sets in a range using our segment tree T.

Let’s now explain some details on how to implement the logic described above.

In order to get the set Y, first we find in S[X] the leftmost range of sets which overlaps with [L, R]. Do you remember that we implemented S[X] as a balanced binary tree and that all ranges in that tree are disjoint? Based on these two facts, we can quickly find the leftmost range which overlaps with [L, R], in exttt{C++} you can use exttt{lower\_bound} method of exttt{std::set} to achieve that. After that, we can iterate over all ranges overlapping with [L, R] using exttt{successor} operation on S[X].

All operations of increasing and decreasing the number of elements in sets of a given range are implemented on segment tree in a standard way.

You may wonder what is the complexity of the insert operation. Well, you can notice that each segment is inserted exactly once and erased at most once. Since a single insert / delete / lower_bound / successor operation on S[X] takes O(\log(n)) time and update operation on T also takes O(\log(N)) time, a single insert operation has O(\log(N)) time complexity.

Query operation

Now, the easier operation, we want to know, how many elements are in the set numbered $Q$. Because we maintain our segment tree $T$, we can answer each such query in $O(\log(N))$ time easily by summing up accumulated values at the path form a leaf corresponding to the set $Q$ to the root of $T$.

Time complexity

The total time complexity of this solution is O(N + M \cdot \log(N)) and this is a very nice speed up over the simple solution.


Author’s solution can be found here.
Tester’s solution can be found here.


1 Like

author’s and tester’s solution are not of this problem :confused:

tester’s sol :
Setter’s sol :

1 Like

Can’t understand the insertion part properly. If anyone explain this part, it will be a great help.

The practice link and the contest link are of different problem

I think for the first two subtask…we can have a sum[] array …we increment the sum* if j th element is newly added to the i th set…by this we reduce the query operation from O(n*n) to O(1)

You probably mean to reduce a single query time from O(n) to O(1), if so, you are right, but a single update still takes O(n) it this method, so this is not so much improvement. I had considered describing this method in the editorial, but since it doesn’t improve the total complexity, I didn’t do that.