KOL1503 - Editorial

kevinsogo · December 26, 2015, 6:24am

PROBLEM LINK:

Contest
Practice

Author: Devendra Agarwal
Tester: Kevin Atienza
Editorialist: Kevin Atienza

PREREQUISITES:

Lowest common ancestors, balanced parentheses

PROBLEM:

Given a tree with N nodes, each node containing a parenthesis—) or (—, answer Q queries of the following kind:

Given two nodes u and v, consider the string obtained by taking all parenthesis characters from every node in the path from u to v. Output Yes if this string is a balanced string of parentheses, otherwise print No.

QUICK EXPLANATION:

Root the tree. A path from a to b consists of two parts: A path from nodes a to b consists of two parts: the path from a to \text{LCA}(a,b) and from \text{LCA}(a,b) to b. Take care that the node \text{LCA}(a,b) is counted only once. By convention, let’s say this node belongs at the first part of the path.

The string in the path from a to b is a valid parenthesization iff the following hold:

The path from a to \text{LCA}(a,b) is a prefix of a valid parenthesization.
The path from \text{LCA}(a,b) to b is a suffix of a valid parenthesization.
The number of unmatched open parentheses from a to \text{LCA}(a,b) is equal to the number of unmatched closed parentheses from \text{LCA}(a,b) to b.

To check these conditions, we must be able to perform the following operations:

Given some path from some node to one of its ancestors, it is a prefix of a valid parenthesization?
Given some path from some node to one of its ancestors, it is a suffix of a valid parenthesization?
Given some node x and one of its ancestors y, where the path from x and y is a prefix of a valid parenthesization, how many unmatched open parentheses are there?
Given some node x and one of its ancestors y, where the path from y and x is a suffix of a valid parenthesization, how many unmatched closed parentheses are there?

The latter two can be done by precomputing f(x) for each node x, where f(x) is the number of open parentheses minus the number of closed parentheses from x to the root. It can be computed using the recurrence:

f(x) = \begin{cases} f(\text{parent}(x)) + 1 & \text{If the parenthesis at node $x$ is open} \\\ f(\text{parent}(x)) - 1 & \text{If the parenthesis at node $x$ is closed} \end{cases}

The former two can be done by precomputing \text{pre}(x) and \text{suf}(x) for each node x, which are defined as the lengths of the longest path from x to the root that is a valid parenthesization prefix and suffix, respectively.

EXPLANATION:

As with many problems involving an undirected tree, the natural first step is to root it. This usually helps. So let’s say the tree is rooted at node 1.

A path from nodes a to b consists of two parts: the path from a to \text{LCA}(a,b) and from \text{LCA}(a,b) to b (where \text{LCA}(a,b) is the lowest common ancestor of a and b). But since the parentheses are on the nodes, not edges, so we need to ensure we count \text{LCA}(a,b) only once. By convention, let’s say this node belongs at the first part of the path, i.e. from the path from a to \text{LCA}(a,b), and when we say the “path from \text{LCA}(a,b) to b”, the node \text{LCA}(a,b) isn’t actually included.

Obviously, the string of parentheses in the path from a to \text{LCA}(a,b) must be a valid prefix of some valid parenthesization. Similarly the string from \text{LCA}(a,b) to b must be valid suffix. These are necessary conditions. However, they’re not sufficient. For example, (((( is a valid prefix and )) is a valid suffix but combining them, (((()), doesn’t yield a valid parenthesization. Obviously, we’re missing a condition: the number of unmatched parentheses in both strings must be equal. It’s easy to see that these are sufficient conditions, i.e. the path from a to b contains a valid parenthesization iff the following conditions hold:

The path from a to \text{LCA}(a,b) is a prefix of a valid parenthesization.
The path from \text{LCA}(a,b) to b is a suffix of a valid parenthesization.
The number of unmatched open parentheses from a to \text{LCA}(a,b) is equal to the number of unmatched closed parentheses from \text{LCA}(a,b) to b.

(Remember the convention that \text{LCA}(a,b) only belongs to the path from a to \text{LCA}(a,b).)

So we want perform the following operations quickly:

Given some path from some node to one of its ancestors, it is a prefix of a valid parenthesization?
Given some path from some node to one of its ancestors, it is a suffix of a valid parenthesization?
Given some node x and one of its ancestors y, where the path from x and y is a prefix of a valid parenthesization, how many unmatched open parentheses are there?
Given some node x and one of its ancestors y, where the path from y and x is a suffix of a valid parenthesization, how many unmatched closed parentheses are there?

But the latter two can be computed with the following: Let f(x) be the number of open parentheses minus the number of closed parentheses from x to the root (node 1). Then:

The number of unmatched parentheses from x to y is just |f(x) - f(\text{parent}(y))|. (If y = 1, then we say f(\text{parent}(y)) = 0 by convention.)
Also, these unmatched parentheses are open parentheses if f(x) > f(\text{parent}(y)), otherwise they’re closed parentheses.

Thus, having the f(x) values allows us to perform the operations above. Additionally, the $f(x)$s can be precomputed in linear time with a single traversal of the tree, using the following:

If the parenthesis at node x is (, then f(x) = f(\text{parent}(x)) + 1. Otherwise, f(x) = f(\text{parent}(x)) - 1.

The precomputation runs in linear time. What remains is to efficiently answer the following queries:

Given some path from some node to one of its ancestors, it is a prefix of a valid parenthesization?
Given some path from some node to one of its ancestors, it is a suffix of a valid parenthesization?

These two operations are very similar, so we will just describe one of them, say the first one. To perform this query, we use the following characterization of a valid parenthesization prefix:

A string is a prefix of a valid parenthesization if every prefix of it contains at least as many open parentheses as there are closed parentheses.

In terms of our tree, this is equivalent to saying that:

If y is an ancestor of x, then the path from x to y forms a prefix of a valid parenthesization if and only if f(x) \ge f(\text{parent}(z)) for every node z in the path.

That this is true can be seen from the meaning of the inequality f(x) \ge f(\text{parent}(z)). Note that f(x) - f(\text{parent}(z)) is the number of unmatched open parentheses from x to z, so if this is \ge 0, for every node z in the path, then the condition for being a valid parenthesization prefix is satisfied.

For every node x, we define \text{pre}(x) as the length of the longest path from x to the root that is a valid parenthesization prefix (in terms of the number of nodes). Using the property above, the path from x to its ancestor y is a valid parenthesization prefix if and only if \text{depth}(x) - \text{depth}(y) < \text{pre}(x). So we can answer the queries if we have \text{depth}(x) and \text{pre}(x) for all nodes x. Precomputing the depths of nodes can be done with a single traversal, so what remains is to compute \text{pre}(x) for all nodes x.

We will also compute the $\text{pre}(x)$s with a single traversal. Let z be the $\text{pre}(x)$th ancestor of x. If z is not the root, then let y = \text{parent}(z). Then y is the first node such that f(x) < f(y). Since the f s only differ by one between adjacent nodes, this means that f(z) must be equal to f(x), and f(y) must be equal to f(x) + 1. So \text{pre}(x) is just the distance of x to its nearest ancestor y whose f(y) is f(x) + 1. (And if no such y s exist, then \text{pre}(x) is just \text{depth}(x)+1.) Obviously, if the parenthesis at x is ), then \text{pre}(x) is automatically 0.

Thus, as we traverse the tree, we must be able to answer the following query quickly:

Given v, find the nearest ancestor y of the current node such that f(y) = v.

This can be done by maintaining a map of lists, which lists for each v the nodes y along the path to the root whose f(y) = v. The following pseudocode illustrates it better:

// arrays are initialized to zero
f = array[0...N]
parent = array[1...N]
depth = array[0...N]
pre = array[1...N]
m = {}

def PUSH(x,d):
    if d is not a key of m:
        m[d] = new empty list()

    add x at the end of m[d]

def GET(d):
    if d is not a key of m:
        return 0
    else:
        return the last element of m[d]

def POP(x,d):
    remove x from the end of m[d]
    if m[d] is empty:
        delete the key d from m

def traverse(x):

    // compute f[x]
    if parenthesis(x) == '(':
        f[x] = f[parent[x]] + 1
    else:
        f[x] = f[parent[x]] - 1
    
    // compute depth[x]
    depth[x] = depth[parent[x]] + 1
    
    // compute pre[x]
    if parenthesis(x) == ')':
        pre[x] = 0
    else:
        y = GET(f[x] + 1)
        pre[x] = depth[x] - depth[y]

    // add x
    PUSH(x, f[x])

    for j in adj[x]:
        if j != parent[x]:
            parent[j] = x
            traverse(j)

    // remove x
    POP(x, f[x])

// initial call is traverse(1)

A similar algorithm can be done to compute \text{suf}(x) for all nodes x, which is defined as the length of the longest path from x to the root that is a valid parenthesization suffix. Using \text{pre}(x) and \text{suf}(x), one can now answer the queries above in O(1) time!

By precomputing f(x), \text{pre}(x) and \text{suf}(x) for all nodes x, each query can be answered in O(\log N) time (because we’re taking the LCA of the two nodes.)

Note: If you find the number of unmatched parentheses in this document a bit disturbing, here’s something that can make it worse.

Time Complexity:

O((N + Q)\log N)

AUTHOR’S AND TESTER’S SOLUTIONS:

setter
tester