# Problem Link:

Practice

Contest

**author**: Roman Rubanenko

**tester/editorialist**: Utkarsh Lath

# Difficulty:

Hard

# Pre-requisites:

understanding of rooted trees, diameter of a tree, heavy-light composition, BIT / segtree

# Problem:

Given a tree, you have to find its diameter. But thats not all of it. The tree initially consists of a single node, but it is extended by adding one leaf at a time. After adding each leaf, report the diameter of the tree

# Explanation:

Expected complexity **O( n log n )**, or a not very slow **O( n log**^{2} n).

The Setter and Tester had more complicated solutions than some of those submitted during the contest. Therefore, I will first describe the simple solution:

## Simpler solution

Let vertices **v1** and **v2** be the most distant vertices, forming diameter of the current tree(ties can be broken arbitrarily). If diameter of the tree should increase upon adding a leaf **w**, then the most distant pair is either **w** and **v1**, or **w** and **v2**.

Once the above fact has been proven, we simply need to maintain the most distant node **v1** and **v2**. Upon addition of a node **w**, find its distance from **v1** and **v2**, and update things if we have a better diameter. We only need to find the distance between two arbitrary nodes of our tree efficiently. This can be done by rooting the tree on vertex 0 and finding LCA of the two nodes, and reporting sum of distances of both nodes from LCA. Answering LCA queries is a very standard problem, and this approach to find LCA is sufficient to solve our problem in **O(n log n)** time.

Because of known efficient algorithms for online construction of LCA data-structure, this solution can be used even in fully online setting, with worst case complexity **O(n)**. We thank the contestants for coming up with this beautiful solution.

### Proof that **v1** or **v2** is one end point of a diameter

Consider the path from **v1** to **v2**, of length, say **D**. Diameter of new tree must be **D+1**. Also, **w** must be one endpoint of any pair of most distant vertices because otherwise diameter of original tree would also be **D+1**.

Pick any vertex **v3** most distant from **w**, and consider the path from **v3** to **w**. This path must intersect the path from **v1** to **v2**, because the path from **parent(w)** to **v3** forms a diameter of the original tree and so does the path between **v1** and **v2**, and all diameters of a tree must pass through the centers of a tree.

The two paths **w** to **v3** and **v1** to **v2** intersect as shown. We have

dist(v2, y) ≥ dist(v3, y) -- otherwise v3-v1 is a better diameter of original tree

dist(v2, y) ≤ dist(v3, y) -- otherwise w-v2 is a better diameter of new tree

⇒ dist(v2, w) = dist(v3, w), and v2-w is also a diameter of the new tree

## Our original O(n log n) solution

Lets root the tree at vertex **0**, and use **height (u)** to denote dist(**0**, **u**).

Also, use **depth(u) = max**_{x descendant of u} dist(x, u).

First, lets try to construct a naive solution for updating diameter on adding of tree.

```
add_node(w)
update(w, w, parent(w))
update(v, child, p)
d1 = dist(p, v) = height(v) - height(p)
d2 = max
```_{c' children of p, c' ≠ c} depth(c')
diameter = max(d1+d2+1, diameter)
if(p != root)
update(v, p, parent(p))

The above algorithm goes along the path from **v** to **root**, and for each node, finds the depth of deepest descendant not along the path. Suppose I want to choose a particular path **P** from root to some vertex in the final tree, and make updates fast when a vertex on this path is added. This can be done by storing the following quantity at each node **v**.

**value(v) = max**_{c child of v, c not on Path P} depth(c) + 1 - height(v)

When adding a new vertex **w ∈ P**, the longest path starting from **w** has length **height(w) + max**_{v ancestor of w} value(v). This can be found efficiently (in O(**log n**) time) by maintaining either a **BIT**, or a **segment tree** for the path **P**.

Therefore, if we manage to maintain **value(v)** for all nodes on path **P**, we could efficiently update diameter when adding any node on the path **P**. To efficiently maintain **value(v)** for all nodes on **P**, whenever you add a node **w ∉ P**, find its nearest ancestor on path **P**, and update its **value** parameter in the BIT/segtree for path **P**.

Now that the problem is solved for a path, how to solve it for the entire tree ?
Heavy-light decomposition immediately comes to our mind. Heavy light decomposition(HLD) has been described very well in some of the previous editorials and outside 1, 2.
Any rooted tree can be completely split into a family **F** of paths such that path from any node to root intersects at most **log n** paths from the family. Therefore, when adding a new vertex **v**, you will have to query **O(log n)** paths for the quantity **height(w) + max**_{v ancestor of w} value(v). Since all the queries ask for maximum of **value(v)** in some prefix of the path, it can be done with maintaining a BIT. Also, we would need to update/query only **O( log n)** paths because of properties of HLD.

Special handling is needed when switching between paths.

Along with **max**_{x ancestor of u} value(x), we would need to account for deepest descendant of **u** along paths of type **Q** and **R**. **max**_{x descendant of u on path P} value(x) + 2 * height(x) - height(u) accounts for deepest leaf along paths of type **Q**. We could maintain another BIT for maintaining **value'(x) = value(x) + 2 * height(x)**, which can be updated just like **value(x)**, and query for **max**_{suffix of path P} value'(x) can be efficiently answered by building a BIT.

Handling paths of type **R** is a bit tricky. We will need **max**_{c child of u, c ≠ v} depth(c). This would make the complexity of algorithm **O(max-degree)** for each new vertex added. Can we make it just **O(log n)** ?

This is possible, and can be done by storing at each node the **depth** of *deepest* and *second deepest* child which does not lie on the path (i.e. vertex v stores **max**_{c child of u, c ≠ v} depth(c) and **2**^{nd}-max_{c child of u, c ≠ v} depth(c)). If the **v** is the deepest child of **u**, then **max**_{c child of u, c ≠ v} depth(c) is depth of second deepest child, otherwise it is depth of deepest child. This can also be efficiently updated on adding vertices in **O(1)** time per update. Addition of a new vertex will cause **O(log n)** such updates, because adding a vertex will affect this parameter for **O(log n)** nodes only - the points where the path is switched.

The overall complexity of this solution is **O(n log**^{2} n), and is perfectly acceptable. To get complexity down to **O(n log n)**, one final observation is required:

When switching paths, it makes sense to go further up iff **w** is **the** *deepest descendant* of **v**. Because otherwise, **depth** of any ancestor of **u** does not change because of addition of **w**. Also, the diameter does not increase because in any diameter having **w** as an end point, **w** can be replaced by this deepest descendant.

Therefore, if we are querying(and updating) the path **P**, its cost can be taken from the increase in depth of node **u**. The overall complexity of our solution could be written as **O(n + log n * Σ**_{Path P in decomposition of tree} depth of first node of path P). In HLD, the mysterious expression above is not guaranteed to be **O(n log n)**, but it is hard to design test data to make the above quantity **O(n log**^{2} n) for Heavy light decomposition of tree. However, the above quantity can be easily proved to be **O(n log n)** if we use another decomposition scheme - where the deepest child is picked, instead of heaviest child as in HLD.

**Behind the scene:** The tester used the above **O(n log n)** approach, and the author used a similar but different **O(n log**^{2} n) solution. The Tester initially thought that the complexity of his solution was also **O(n log**^{2} n), but his solution was 10 times faster than setters inspite of applying all kinds of speedups in setters solution. Trying to hunt down the difference, the tester concluded that his complexity was actually better due to reasons given above.

# Solutions:

Setter's solution

Tester's Solution