DENSEGRP - Editorial

sawarnik123 · March 7, 2021, 11:35am

PROBLEM LINK:

Practice
Div-2 Contest
Div-1 Contest

Author: Sawarnik Kaushal
Tester: Felipe Mota
Editorialist: Sawarnik Kaushal

DIFFICULTY:

Easy-Medium

PREREQUISITES:

Segment Tree, Coordinate Compression

PROBLEM:

You are given an unweighted directed graph whose edges are given by:

For each i, where 1 \le i \le M, and for each pair of vertices u \in [A_i, B_i], v \in [C_i, D_i], there is an edge from u to v.

Find the shortest distance between vertices X and Y.

QUICK EXPLANATION:

First, we compress all the vertices given in the input, so that the number of vertices becomes O(M).
Build a graph as shown in the figure below, of size equal to the number of compressed vertices. The tree edges are weighted 0. Then for each valid i, decompose [a_i, b_i] into subsegments in the lower half and draw an edge from each of them to each subsegment of [c_i, d_i] in the upper half. All these edges should be of weight 1. In this way, the graph consists of O(M\log^2 M) edges. Now, we can find the answer by running 0-1 BFS from source [x, x], and finding the distance to [y, y]. This solution takes O(M\log^2 M) time.
Instead of drawing an edge from each subsegment of [a_i, b_i] to each of [c_i, d_i], we can define an extra vertex v_i and draw an edge from each subsegment of [a_i, b_i] to v_i, followed by an edge from v_i to each subsegment of [c_i, d_i]. The graph now consists of a total of O(M\log M) edges. We can then run 0-1 BFS as before, but divide by 2 to get the final answer. Thus, the problem is solved in O(M\log M) time.

EXPLANATION:

Firstly, we can compress the input so that instead of dealing with N vertices, which can be upto 10^9, we only need to deal with O(M) vertices in our graph, as there are only 4M + 2 vertices to compress. Let us represent the number of vertices after compression to be K.

Motivation

We can start by naively building the graph as given in the problem, followed by a BFS. The problem with this approach is that for each valid i, we require O(M^2) edges to be added to the graph in the worst case. Clearly, we need a structure which can represent that same information with far fewer edges. We can recognize that a segment tree is well-suited for such a role. For instance, let us build a segment tree of size K. Then for each valid i, we can decompose [a_i, b_i] into O(\log K) subsegments of the segment tree, and draw an edge from all those subsegments to all subsegments of [c_i, d_i]. After some experimentation with similar kind of structures, we can realize that the following graph works for us.

Solution

Figure: The graph for K = 5. Tree edges of weight 0 are shown in blue. For an example, edges of weight 1 from [4, 5] to [2, 5] are shown in green.

Let us construct this graph, i.e. the two segment trees of size K joined together as shown in the figure. We initially assign all the tree edges weight 0. Now, we start iterating over each valid i. As we do in any segment tree, we can partition every segment [a, b] into O(\log K) subsegments of the segment tree. Thus, for every [a_i, b_i] we draw an edge from each of its subsegments in the lower half to each subsegment of [c_i, d_i] in the upper half. We draw these edges with weight 1. So why does this graph work?

We show that if there exists an edge from vertex p to q in the original graph, then and only then there exists a path of distance 1 from segment [p, p] to [q, q] in our constructed graph. Note that if there is an edge from p to q, there must be an i such that p \in [a_i, b_i] and q \in [c_i, d_i]. Thus, there must be a subsegment of [a_i, b_i] in the segment tree which includes p, let us call it [a'_i, b'_i], and a subsegment of [c_i, d_i] including q, which we can call [c'_i, d'_i].

It is easy to see that all segments in a segment tree that include p, lie on the path from [p, p] to the root. Thus, we can move downwards from [p, p] in our graph and reach [a'_i, b'_i] in the lower half of the graph. Further, as we had drawn an edge from each subsegment of [a_i, b_i] to each subsegment of [c_i, d_i], there exists an edge from [a'_i, b'_i] in the lower half to [c'_i, d'_i] to the upper half. So we can move to [c'_i, d'_i] and from there on we can move downwards in the tree until we reach [q, q]. Thus, we have shown that a path exists from [p, p] to [q, q]. Also, notice that we used exactly 1 unit of distance in doing so. The only-if part of the proof is very similar.

Hence, any path from x to y of length l in our original graph corresponds to a path from [x, x] to [y, y] of the same distance l in our constructed graph, and vice versa. So how can we find the shortest path from [x, x] to [y, y]? The simplest way is to use 0-1 BFS with source vertex [x, x], which is possible since all our edges are of weight 0 or 1. Alternatively, we can use Dijkstra’s but that will result in an extra log factor in the time complexity.

Time complexity

Remember that the compression phase took O(M\log M) time. Now, as building a segment tree of size K requires O(K) edges our initial graph consisted of O(K) edges as well, or equivalently O(M) edges. Then, for each i we divided the segments [a_i, b_i] and [c_i, d_i] into their O(\log M) subsegments, and we defined an edge from each subsegment of [a_i, b_i] to each of [c_i, d_i].

This means that for every i, we inserted O(\log^2 M) edges in our graph. The total edges in our graph becomes O(M + M\log^2M) , and it is not difficult to see that the time taken was O(M\log^2M) as well. Further, as 0-1 BFS takes O(V + E) time, this part also takes O(M\log^2M) time. Hence, our final time complexity is O(M\log^2M).

Further reducing the time complexity

We can further reduce the time complexity to O(M\log M) very simply. For each valid i, instead of drawing an edge from each subsegment of [a_i, b_i] to each of [c_i, d_i], we can define an extra vertex v_i and draw an edge from each subsegment of [a_i, b_i] to v_i, followed by an edge from v_i to each subsegment of [c_i, d_i]. As before, we draw these edges of weight 1. Note that we only used O(\log M) edges here instead of O(\log M^2) previously.

With this modification, we can easily show that any edge from p to q in the original graph now corresponds to a path from segment [p, p] to [q, q] of distance 2 in our new graph. Indeed, the only change in the previous proof is that the distance of [a'_i, b'_i] to [c'_i, d'_i] is now 2, but otherwise the same path exists. Thus, we can run 0-1 BFS as before from [x, x] to [y, y] in this graph, but divide by 2 to get the final answer. Since the graph now consists of only O(M\log M) edges, our final time complexity is O(M\log M) as well.

SOLUTIONS:

Setter's Solution

#include "bits/stdc++.h"
#define endl '\n'
using namespace std;
 
#define all(v) v.begin(), v.end()
#define sz(v) (int)(v.size())
#define rz(v, n) v.resize((n) + 1);
#define pb push_back
#define fi first
#define se second
#define vi vector <int>
#define pi pair <int, int>
#define vpi vector <pi>
#define vvi vector <vi>
#define FOR(i, a, b) for (int i = (a); i <= (b); i++)
#define FORD(i, a, b) for (int i = (a); i >= (b); i--)
const int inf = 1e9;
const int N = 2e5 + 1;
 
int t, n, m, x, y, xver, yver;
vector <vector<pair <int, bool>>> adj;
 
void build (int v, int start, int end)
{
    if (start == end) 
    {
        adj[v/2].pb({v, 0});
        adj[v].pb({4*n + v/2, 0});
        if (start == x) xver = v;
        if (start == y) yver = v;
    }
    
    else
    {
        adj[v/2].pb({v, 0});
        adj[4*n + v].pb({4*n + v/2, 0});
        
        int mid = (start + end)/2;
        build(2*v, start, mid);
        build(2*v + 1, mid + 1, end);
    }
}
 
void segsearch (int v, int start, int end, int l, int r, int icnt, bool state)
{
    int mid = (start + end)/2;
    if (start >= l && end <= r)
    {
        if (state) adj[6*n + icnt].pb({v, 1});
        else if (start == end) adj[v].pb({6*n + icnt, 1});
        else adj[4*n + v].pb({6*n + icnt, 1});
    }
    
    else if (end < l || start > r) return;
    else
    {
        segsearch(2*v, start, mid, l, r, icnt, state);
        segsearch(2*v + 1, mid + 1, end, l, r, icnt, state);
    }
}
 
int bfs()
{
    vi d(6*n + m + 1, inf);
    deque <int> q;
    d[xver] = 0;
    q.push_front(xver);
    
    while (!q.empty()) 
    {
        int v = q.front();
        q.pop_front();
        for (pi e: adj[v]) 
        {
            int u = e.fi, w = e.se;
            if (d[v] + w < d[u]) 
            {
                d[u] = d[v] + w;
                if (w) q.push_back(u);
                else q.push_front(u);
            }
        }
    }
    
    if (d[yver] < inf) return d[yver]/2;
    else return -1;
}
 
signed main()
{
	ios_base::sync_with_stdio(false);
	cin.tie(0);
    
    cin >> t;
    while (t--)
    {
        cin >> n >> m >> x >> y;
        
        vi vec;
        set <int> s = {x, y};
        FOR (i, 1, m)
    	{
    		int a, b, c, d;
    		cin >> a >> b >> c >> d;
    		vec.pb(a), vec.pb(b), vec.pb(c), vec.pb(d);
    		s.insert(a), s.insert(b), s.insert(c), s.insert(d);
    	}
    	
    	map <int, int> mp;
    	int cnt = 0;
    	for (int i: s) mp[i] = ++cnt;
    	FOR (i, 0, sz(vec) - 1) vec[i] = mp[vec[i]];
    	x = mp[x], y = mp[y];
    	
    	n = sz(s);
    	adj.clear();
    	adj.resize(6*n + m + 1);
    	
    	build(1, 1, n);
    	
    	int i = 0, icnt = 1;
    	while (i < sz(vec))
    	{
    	    segsearch(1, 1, n, vec[i], vec[i + 1], icnt, 0);
    	    segsearch(1, 1, n, vec[i + 2], vec[i + 3], icnt, 1);
    	    i += 4, icnt++;
    	}
    	
    	cout << bfs() << endl;
    }
}

Tester's Solution

#include <bits/stdc++.h>
using namespace std;
template<typename T = int> vector<T> create(size_t n){ return vector<T>(n); }
template<typename T, typename... Args> auto create(size_t n, Args... args){ return vector<decltype(create<T>(args...))>(n, create<T>(args...)); }
long long readInt(long long l,long long r,char endd){
	long long x=0;
	int cnt=0;
	int fi=-1;
	bool is_neg=false;
	while(true){
		char g = getchar();
		if(g=='-'){
			assert(fi==-1);
			is_neg=true;
			continue;
		}
		if('0'<=g && g<='9'){
			x*=10;
			x+=g-'0';
			if(cnt==0){
				fi=g-'0';
			}
			cnt++;
			assert(fi!=0 || cnt==1);
			assert(fi!=0 || is_neg==false);
 
			assert(!(cnt>19 || ( cnt==19 && fi>1) ));
		} else if(g==endd){
			if(is_neg){
				x= -x;
			}
			assert(l<=x && x<=r);
			return x;
		} else {
			cout << x << '\n';
			cout << l << ' ' << r << ' ' << (endd == ' ' ? "sp" : "ln") << '\n';
			cerr << int(g) << ' ' << char(g) << '\n';
			assert(false);
		}
	}
}
string readString(int l,int r,char endd){
	string ret="";
	int cnt=0;
	while(true){
		char g=getchar();
		assert(g!=-1);
		if(g==endd){
			break;
		}
		cnt++;
		ret+=g;
	}
	assert(l<=cnt && cnt<=r);
	return ret;
}
long long readIntSp(long long l,long long r){
	return readInt(l,r,' ');
}
long long readIntLn(long long l,long long r){
	return readInt(l,r,'\n');
}
string readStringLn(int l,int r){
	return readString(l,r,'\n');
}
string readStringSp(int l,int r){
	return readString(l,r,' ');
}
long long TEN(int p){ long long r = 1; while(p--) r *= 10; return r; }
int main(){
	ios::sync_with_stdio(false);
	cin.tie(0);
	int t = readIntLn(1, TEN(5));
	int sum_m = 0;
	while(t--){
		int n, m, x, y;
		n = readIntSp(1, TEN(9));
		m = readIntSp(1, 2 * TEN(5));
		x = readIntSp(1, n);
		y = readIntLn(1, n);
		sum_m += m;
		vector<int> ys;
		ys.push_back(n + 1);
		ys.push_back(x);
		ys.push_back(y);
		vector<tuple<int,int,int,int>> edges;
		for(int i = 0; i < m; i++){
			int a, b, c, d;
			a = readIntSp(1, n);
			b = readIntSp(a, n);
			c = readIntSp(1, n);
			d = readIntLn(c, n);
			ys.push_back(a);
			ys.push_back(b + 1);
			ys.push_back(c);
			ys.push_back(d + 1);
			edges.emplace_back(a, b, c, d);
		}
		sort(ys.begin(), ys.end());
		ys.erase(unique(ys.begin(), ys.end()), ys.end());
		auto get_coord = [&](int val){
			return lower_bound(ys.begin(), ys.end(), val) - ys.begin();
		};
		n = ys.size();
		x = get_coord(x);
		y = get_coord(y);
		for(int i = 0; i < m; i++){
			auto [a, b, c, d] = edges[i];
			a = get_coord(a);
			b = get_coord(b + 1) - 1;
			c = get_coord(c);
			d = get_coord(d + 1) - 1;
			edges[i] = make_tuple(a, b, c, d);
		}
		vector<vector<pair<int,int>>> seg(4 * n);
		function<void(int,int,int,vector<tuple<int,int,int,int>>&)> insert_on_seg = [&](int id, int l, int r, vector<tuple<int,int,int,int>> &to){
			if(to.empty()) return;
			vector<tuple<int,int,int,int>> left, right;
			int m = (l + r)>>1;
			for(auto [a, b, c, d] : to){
				if(l == a && b == r){
					seg[id].push_back({c, d});
				} else {
					if(a > m) right.push_back(make_tuple(a, b, c, d));
					else if(b <= m) left.push_back(make_tuple(a, b, c, d));
					else {
						left.push_back(make_tuple(a, m, c, d));
						right.push_back(make_tuple(m + 1, b, c, d));
					}
				}
			}
			if(l == r)
				return;
			insert_on_seg(id<<1, l, m, left);
			insert_on_seg(id<<1|1, m + 1, r, right);
		};
		insert_on_seg(1, 0, n - 1, edges);
		vector<pair<int,int>> nxt;
		function<void(int,int,int,int)> fill_nxt = [&](int id, int l, int r, int p){
			for(auto e : seg[id]) nxt.push_back(e);
			seg[id].clear();
			if(l == r)
				return;
			int m = (l + r)>>1;
			if(p <= m) fill_nxt(id<<1, l, m, p);
			else fill_nxt(id<<1|1, m + 1, r, p);
		};
		set<int> missing;
		for(int i = 0; i < n; i++)
			if(i != x)
				missing.insert(i);
		vector<int> dist(n, -1);
		dist[x] = 0;
		queue<int> q;
		q.push(x);
		while(!q.empty()){
			int cur = q.front(); q.pop();
			nxt.clear();
			fill_nxt(1, 0, n - 1, cur);
			for(auto [l, r] : nxt){
				while(true){
					auto p = missing.lower_bound(l);
					if(p == missing.end() || *p > r) break;
					dist[*p] = dist[cur] + 1;
					q.push(*p);
					missing.erase(p);
				}
			}
		}
		cout << dist[y] << '\n';
	}
	assert(1 <= sum_m && sum_m <= 2 * TEN(5));
	return 0;
}

aknov711 · March 15, 2021, 2:12pm

How are we decomposing a given segment into O(logK) subsegments? For example-> If we have segment [1 to 20]. How will we decompose it? Obviously it can’t be like [1 1],[2 2],…[20 20].

carre · March 15, 2021, 2:51pm

if your full range is a power of 2 then all nodes widths are powers of 2, so [1-20] → [1-16] + [17-20]

sawarnik123 · March 15, 2021, 3:42pm

It is done as we do in a segment tree normally. Like if we have a sum segment tree, and an update for [1, 20] comes in then we do go down the tree and the segment gets divided into subsegments at the nodes we update the sum value. So if K = 20, then the root node [1, 20] is our only subsegment. If K = 32, we will go down the root [1, 32] into [1, 16] and [17, 32]. Then [1, 16] gets chosen, and then we go down from [17, 32] further, onto [17, 24] then finally [17, 20] which is chosen. Simply saying, as we do update queries in a segment tree.

vprocoder · March 15, 2021, 4:30pm

can anyone help me in understanding last (3rd) test cases. I didn’t get it till now.

diegohdmgz · March 15, 2021, 5:34pm

Nice solution ! I did something a little bit different (it’s kind of similar).
Let’s define the distance array in the graph as d and start with all d[u] = +\infin for all nodes, except for s that will have d[s]=0

Let’s maintain a list of unused pair of ranges ([a_k, b_k], [c_k, d_k])
Now, let’s do the bfs. If we are in the phase c of the BFS (starting from phase 1) , all the nodes u that will be processed in that phase would have d[u]=c.

Instead of processing every node separately, let’s process ranges. If we have an unused pair of ranges such that d[i] =+\infin for all i \in [a_k, b_k] , then obviously this pair won’t be used in the current phase of the bfs, otherwise this pair must be used and let’s called this type of pair “active”. So we can gather all the “active” pairs of ranges in this phase and set d[i] = min(c, d[i]) for all i \in [c_k, d_k] (the min operation is necessary because there might be some nodes in that range that have already a distance from a previous phase), and after that, this pair will be remove from the list of unused pairs. This assignment in range can be done with Segment Tree.
Each assignment that we do with a pair ([a_k, b_k], [c_k, d_k]), can also produce some new “active” pairs for the next phase. These new active pairs (that are still unused and are not active in the current phase) ([a_t, b_t], [c_t, d_k]) will be all the ones such that [a_t, b_t] intersect [a_k, b_k]. I simulate this process using Segment Tree again.

Time Complexity O(M log M) with path compression. O(M log N) with Implicit Segment Tree as I did.

Link to my solution

socho · March 17, 2021, 2:12pm

Hey @sawarnik123!

I just wanted to say that I really enjoyed solving this problem, thank you so much for setting it!