# SUBGRAPH - Editorial

Author: Kanstantsin Sokal
Tester: Jingbo Shang
Editorialist: Lalit Kundu

Medium-Hard

### PREREQUISITES:

DP on trees, graphs

### PROBLEM:

You are given an undirected graph G of N nodes and M edges. It’s guaranteed that no multiple edges or self-loops appear in the graph as well as that the graph is connected. There’s one more restriction on the graph: any node of G belongs to at most one simple cycle in G.

Your task is to count the subsets of nodes in G, such that the nodes in the subset are connected (you can reach any node of the subset from any other node of the subset only by moving to adjacent nodes that belong to the subset) and contain no more than K nodes.

### QUICK EXPLANATION:

================
Cycle main node
Each cycle has the main node: it’s a node, which, if we make the graph G rooted, will be connected to some node of the cycle, which is an ancestor cycle for our cycle.

For each cycle main node v, we define \text{f}(v, k) as the number of subgraphs rooted at node v and having k nodes in it. This k also includes non-main nodes and not just the nodes in compressed tree.
For each non-main node, we define we define \text{g}(v, k) as the number of subgraphs rooted at node v and having k nodes in it and these subgraphs don’t have any of the nodes of \text{cycle}[v]. That means, it only depends on the main nodes that come out of node v. So, we can recursively define \text{g}(v, k) based on values of f.

Now, for recursively calculating values of f, we consider each subsegment of cycle of main node v and, then for all subsegments, for all nodes in current subsegment calculate number of subgraphs using values of g. If such a subsegment has node v in it, then we add the value calculated to f(v).

### EXPLANATION:

================
DP solution to realise in this problem is a little difficult. I am going to explain author’s solution here.
First lets discuss an easier problem which says: Given a tree calculate number of subtree’s with exactly K nodes. Here we need to use DP on tree. For each node v, we define \text{dp}(v, k) as number of subtrees with k nodes and v as root. Now, we can define recurrence relation for this. Let’s say for node v, there are direct children nodes u_1, u_2, ... u_n. Now, to form a subtree with k+1 nodes rooted at v, lets say direct children nodes contribute a_1, a_2, ..., a_n nodes, where a_1 + a_2 + ... + a_n = k, in this case we add to \text{dp}(v, k) the value \text{dp}(u_1, a_1)*\text{dp}(u_2, a_2)*...*\text{dp}(u_n, a_n). We should do this for all possible a_1, a_2, ..., a_n.

To do this, we create one more DP here \text{dp1}(i, j) as number of ways to choose a total of j nodes from subtrees defined by u_1, u_2, ..., u_i. The recurrence can be defined as \text{dp1}(i, j) = \sum_{k=0}^{K} \text{dp1}(i-1,j-k)*\text{dp}(i, k). So, in terms of pseudo code we can write something like:

dp[N][K+1]

void rec(int cur_node){

dp[cur_node][1]=1

for(all v such that v is children of cur_node)
rec(v)

dp_buffer[K] = {0}
for i=0 to K:
for j=0 to K-i:
dp_buffer[i + j] += dp[i]*dp[v][j]

dp[cur_node] = dp_buffer
}


Hence, our problem is solved. We do something similar in our given graph, but since cycles are involved, we need to do something improvised. You should here notice that if we compress all cycles to form a single node, then our graph is a tree, say T. You should also note that non-bridge edges in G are the edges which will be the edges of tree T.

First, a definitions I’ll be using:
Cycle main node
Each cycle has the main node: it’s a node, which, if we make the graph G rooted, will be connected to some node of the cycle, which is an ancestor cycle for our cycle. Or, in other words after rooting G, if I apply DFS then main node of a cycle will be visited earliest. For example, in following image nodes 1, 2, 5, and 9 are main nodes, if G is rooted at node 1.

All other nodes can be called as non-main nodes.

Now, first we parse cycles in graph G, which means compressing each cycle to a cycle main node and for each cycle store the vertices in it. Let’s denote by \text{cycle}[v] as the cycle id of node v. Now, we are going to compute DP on this compressed tree and at the same time also maintain data about the cycles in it.

So, we define \text{f}(v, k) for all main nodes v, the number of subgraphs rooted at node v and having k nodes in it. This k also includes non-main nodes and not just the nodes in compressed tree. Now, we cannot just calculate function f recursively using its own definition, we need to also include the fact that main node v is not just a single node but a cycle.

So, for non-main nodes v, we define \text{g}(v, k) as the number of subgraphs rooted at node v and having k nodes in it and these subgraphs don’t have any of the nodes of \text{cycle}[v]. That means, it only depends on the main nodes that come out of node v. So, we can recursively define \text{g}(v, k) based on values of f. This DP is quite similar to what we discussed in the easier problem. So, in terms of pseudo code we can write:

g[N][K]

void rec(int cur_node, int prev){

g[cur_node][1]=1

for(all edges e incident with cur_node):
//we just want to consider children of cur_node
//which are main nodes
if e is the edge (cur_node, prev) or if e is not a bridge:
continue

u = neighbor of v via edge e
rec(u, v)
dp_buffer[K] = {0}
for i=0 to K:
for j=0 to K-i:
dp_buffer[i + j] += g[i]*f[u][j]

g[cur_node]=dp_buffer

}


Now, that we have calculated g array at current main node, lets see how we will calculate f for the current main node again, based on recursive definition. Now, you should notice how we can make subgraphs at a main node. For example, in the following image, the six different color curves denote which non-main nodes are we going to include in our subgraph at node v.

So, the different subgraphs we have made are:

• color pink: v, S(u_3), S(u_1), where S(x) denotes the subtree of node x.
• color green: v, S(u_1), S(u_2), S(u_3)
• color red: v, S(u_1)
• color blue: v, S(u_1), S(u_2)
• color yellow: v, S(u_2), S(u_3)
• color cyan: v, S(u_3)

You should notice that we are considering all subsegments in cycle of node v(because we need subgraph connected) and then for each subsegment we can use g to calculate number of subgraphs which include subtrees of all these nodes. Also, if for a subsegment main node lies in that segment we can use it to calculate f for main node v.

Lets formalise things a bit. Let’s say you are considering subsegment of nodes formed by nodes u_1, u_2, ..., u_n. We already have calculated g(u_i, k) for all i and k. Now, at this step we’ll also be adding the subgraphs caused by this subsegment to our final answer. We define h(k) as number of such subgraphs with k nodes. So, the subgraph we are considering is formed by S(u_1), S(u_2), ..., S(u_n). So, for calculating h(k+n) we assume that we have choosen a_1, a_2, ..., a_n from each of these subtrees, where a_1 + a_2 + ... + a_n = k, so to h(k+n), we add g(u_1, a_1) * g(u_2, a_2) * ... * g(u_n, a_n). Now, we should do this for all possible sets of a_i.

Again, for calculating this we can do very similar to what we did in the easier problem in beginning.

h = g(u_1)
for i = 2 to n:
dp_buffer[K] = {0}

for j = 1 to K:
for k = 1 to K-i:
dp_buffer[j + k] += h[j]*g(u[i], k)

h = dp_buffer

//at this step dp_buffer stores h(k) for the set of nodes
//u_1, u_2, u_3, ... u_i
//so to our final answer we should add this whole array h
for j = 1 to K:
ans[j] += h[j]

//if subsegment u_1 to u_i contains main-node v
//then to f(v, i) we should add h(i) for all i
if subsegment u_1 to u_i contains main-node v:
for j = 1 to K:
f(v, j) += h[j]


We do the above process for all possible u_1 i.e. all possible subsegment starts. You need to take care of a few things:

• Be sure to count the whole cycles as a subsegment only once(it can be done to iterating through the whole cycle only from the main node of the cycle)
• Be sure to consider g(u_i, k) for k \ge 1, i.e. node u_i is present always(otherwise chain will not be connected).

So, in this way we we’ll have iterated over all possible subsegments which can possibly contribute to the answer. Now, we just have to take sum of \text{ans}[i], for all i \le K.

### COMPLEXITY

================
Let’s say l is size of cycles and n is number of cycles.
Then for each subsegment of cycle i.e. a total of l^2 subsegments, we do a computation of K^2. Also, for calculating g for each non-main node, we do a computation of K^2, where non-main nodes could be atmost N.
So, our complexity is O(n * (l^2 * K^2) + N * K^2) which we can say is less than O(n * (l * L * K^2) + N * K^2) = O((N * L * K^2) + N * K^2) = O(N * L * K^2).

### AUTHOR’S, TESTER’S SOLUTIONS:

2 Likes

Comment regarding tester’s(@shangjingbo) approach.

"The key condition is that every node belongs to at most ONE simple cycle. Besides, the input graph is connected.

Assume we are doing DFS on the graph, as it is an undirected graph, there are only several back-edges except for the tree-edges. Moreover, there will be no intersections between back-edges (since the simple cycles have no intersections).

Therefore, we can use (node u, total number of selected nodes, number of components in subtree u, whether u is selected, whether there is a component has a back-edge going beyond node u) to describe the states. Number of components <= 2, others are all binary indicators. Therefore, the total number of states is O(N*K). There is another dynamic programming among all children of node u for the transition part. Therefore, the total time complexity is O(N*K^2). You can check the details in my submission."