union-find algorithms, disjoint-set forests, offline algorithms
Given an N × M grid, you need to support Q queries of the following four kinds:
- 1 x y - build the wall between cells (x, y) and (x, y+1). If there is already exist a wall between them, this query is ignored.
- 2 x y - build the wall between cells (x, y) and (x+1, y). If there is already exist a wall between them, this query is ignored.
- 3 x1 y1 x2 y2 - check if cells (x1, y1) and (x2, y2) are connected.
- 4 - You must answer the size of the largest connected component. A connected component is a set of cells such that every pair of cells in it are connected. The size of a connected component is a number of the cells in it.
- Collect all Q queries in an array.
- Uniquify the 1 x y and 2 x y queries, i.e. only keep the first occurrence of a given 1 x y/2 x y.
- Process the remaining queries in reverse, starting with a grid with some walls already existing, and then removing them and merging connected components. Use a really fast union-find structure for this, such as disjoint-set forests. Also, keep track of the sizes of each component, and the largest among them.
- The sum of the results is the answer.
First of all, one can simply implement the given queries in a straightforward way. We only need to keep track of which walls have been added. Let’s see how they can be done:
- 1 x y and 2 x y - simply set a flag that there is a wall between two given cells.
- 3 x1 y1 x2 y2 - Start a breadth-first search/depth-first search from (x1,y1), and then check whether the cell (x2,y2) is visited or not.
- 4 - Start a breadth-first search/depth-first search from all components, and output the largest one.
Type 1 and 2 queries can be done in O(1) time each, while type 3 and 4 queries the solution may takes O(NM) time in the worst case. Therefore, the solution runs in O(QNM) time overall. So a reasonable implementation can pass the first subtask, but will fail the others.
If you are familiar with union-find algorithms, you might notice that this is somehow its reverse problem, where instead of adding edges, you remove them. There are popular and fast ways of doing implementing union-find, but it seems that there is no similar way to solve the reverse problem.
This problem graphs is called decremental connectivity, and common techniques for solving it efficiency is very complex, and might not even pass the tight time limits of this problem. It would really, really be nice if we can solve the reverse problem.
In fact, why not simply reverse the queries? There’s nothing stopping us from doing so, because all the queries are known in advance! The idea is simply to add all the walls from the queries, and then reverse the queries, but then we proceed removing walls instead of adding them. Now, we can use efficient union-find algorithms!
We will need to describe the details. First, the type 3 queries simply involve two find queries and checking whether they are equal. For type 4 queries, we will need to be able to access the size of the largest connected component so far. To do so, we need to augment the union-find structure to keep track of the sizes of each component. This is simple: you really just need to store the size of the component on each representative, and when merging two components, simply update this size as the sum of the sizes of the components to be combined. This can be done with no significant cost to the running time. Then to find the largest one, we simply need one variable to keep track of it, and update it when we merge components. This works because the largest component size is nondecreasing. (As a side note, suppose we were not processing the queries in reverse order, but are proceeding with decremental connectivity. We can’t use a single variable any more. Instead, we will need a priority queue containing the sizes of all components, and update it whenever we split a component.)
Finally, type 1 and 2 queries are simply union operations. However, remember that there can be duplicate 1 x y/2 x y queries! In those cases, only the first occurrence of each such distinct query is important, because the others are no ops. Therefore, find the duplicate queries and ignore them during the reverse processing. This can be done in O(NM) time at the beginning.
We can now see that the whole algorithm has an O(NM) preprocessing step, and O(\alpha(NM)) time per query, where \alpha is the inverse Ackermann function, so the whole algorithm runs in O(NM + Q\alpha(NM)) time. The \alpha comes from implementing union-find using disjoint set forests, with path compression and union by rank optimizations. Note that in this problem, both of these implementations are recommended, because omitting one will slow the query time to O(\log (MN)), while omitting both will slow it further to O(NM) in the worst case (not better than before!). Remember that the time limit is strict, so even O(\log (MN))-time queries might still take too long!
Additional read: decremental connectivity in planar graphs
Finally, we refer the reader to continue reading about decremental connectivity. For example, this link contains an efficient algorithm to handle decremental connectivity in the case of planar graphs. It is a really nice read, so we suggest you learn it!
O(NM+Q\cdot \alpha(NM)), where \alpha is the inverse Ackermann function