CHEFDAG - Editorial

PROBLEM LINK:

Practice
Div-2 Contest
Div-1 Contest

Author: Rahul Dugar
Tester: Suchan Park
Editorialist: William Lin

DIFFICULTY:

Medium

PREREQUISITES:

Graphs, Bipartite Matching

PROBLEM:

Given K permutations of 1 to N, construct a graph with the following properties: 1. Each permutation is a valid topological sort 2. The outdegree of each node is at most 1 3. The number of nodes with indegree 0 is as small as possible.

QUICK EXPLANATION:

The answer will only consist of paths. If we consider the graph with all edges which are allowed by condition 1, we are trying to cover all nodes with the minimum number of paths (with edges from the graph), which can be solved with bipartite matching.

EXPLANATION:

Observation 1. We can model the answer as a forest.

Explanation

Each node has an outdegree of at most 1. If the node has an outdegree of 0, then we will make it the root of a tree. Otherwise, the parent of the node is the node that it points to. Note that the answer is acyclic so we won’t have the case where cycles are formed.

Observation 2. An optimal answer consists of a set of paths.

Proof

We want to minimize the number of nodes with an indegree of 0, which are the leaf nodes. But note that if we have a tree, we can split it into a set of paths as shown, while still having the same number of leaves:

As we only remove edges to convert the tree into a set of paths, conditions 1 (valid topological sort) and 2 (outdegree of at most 1) will still be satisfied.

Each path has exactly one node with 0 indegree, so we want to minimize the number of paths.

Let G contain the set of edges which don’t satisfy condition 1 (all edges except A_{k, j} to A_{k, i} for some k and i<j).

Observation 3. G is acyclic.

Proof

If we only have one permutation A_1, G contains all edges A_{1, i} to A_{1, j} such that i<j, which is acyclic. Adding more permutations only removes edges from G, so G will still be acyclic.

The problem reduces to finding the minimum number of paths to cover G, which is acyclic. This is a well-known problem (https://towardsdatascience.com/solving-minimum-path-cover-on-a-dag-21b16ca11ac0) which can be solved with maximum bipartite matching. The final complexity is O(N^3).

SOLUTIONS:

Setter's Solution
#pragma GCC optimze("Ofast")
#include <bits/stdc++.h>
using namespace std;
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace __gnu_pbds;
#ifndef rd
#define trace(...)
#endif
#define endl '\n'
#define pb push_back
#define fi first
#define se second
#define int long long
typedef long long ll;
typedef long double f80;
#define double long double
#define pii pair<int,int>
#define pll pair<ll,ll>
#define sz(x) ((long long)x.size())
#define fr(a,b,c) for(int a=b; a<=c; a++)
#define rep(a,b,c) for(int a=b; a<c; a++)
#define trav(a,x) for(auto &a:x)
#define all(con) con.begin(),con.end()
const ll infl=0x3f3f3f3f3f3f3f3fLL;
const int infi=0x3f3f3f3f;
// const int mod=998244353;
int mod=1000000007;
typedef vector<int> vi;
typedef vector<ll> vl;
 
 
typedef tree<pii, null_type, less<pii>, rb_tree_tag, tree_order_statistics_node_update> oset;
auto clk=clock();
mt19937_64 rang(10);
//mt19937_64 rang(chrono::high_resolution_clock::now().time_since_epoch().count());
int rng(int lim) {
	uniform_int_distribution<int> uid(0,lim-1);
	return uid(rang);
}
 
int powm(int a, int b) {
	int res=1;
	while(b) {
		if(b&1)
			res=(res*a)%mod;
		a=(a*a)%mod;
		b>>=1;
	}
	return res;
}
struct BipartiteMatcher{
    vector<vector<int>> G;
    vector<int> L, R, Viz;
    void init(int n, int m){
        G.clear(), L.clear(), R.clear(), Viz.clear();
        G.resize(n), L.resize(n, -1), R.resize(m, -1), Viz.resize(n, 0);
    }
    void AddEdge(int a, int b){ G[a].push_back(b); }
    bool Match(int node){
        if(Viz[node]) return 0;
        Viz[node] = 1;
        for(auto vec : G[node]){
            if(R[vec] == -1 || Match(R[vec])) {
                L[node] = vec;
                R[vec] = node;
                return 1;
            }
        }
        return 0;
    }
    int Solve(){
        bool ok = 1;
        while(ok){
            ok = 0;
            fill(Viz.begin(), Viz.end(), 0);
            fr(i, 0, L.size() - 1)
                if(L[i] == -1)
                    ok |= Match(i);
        }
        int ret = 0;
        fr(i, 0, L.size() - 1)
            ret += (L[i] != -1);
        return ret;
    }
} bm;
bool adj[505][505];
int a[505];
void solve() {
	int n,k;
	cin>>n>>k;
	//at-first any ordering is possible.
	fr(i,1,n)
		fr(j,1,n)
			adj[i][j]=1;
	fr(i,1,k) {
		fr(j,1,n)
			cin>>a[j];
		//removing those edges which cannot be added.
		fr(j,1,n)
			fr(l,j,n)
				adj[a[l]][a[j]]=0;
	}
	bm.init(n+1, n+1);
	fr(i,1,n)
		fr(j,1,n)
			if(adj[i][j])
				bm.AddEdge(i,j); //adding edges to the bipartite matcher.
	cout<<n-bm.Solve()<<endl;
	fr(i,1,n) {
		if(~bm.L[i])
			cout<<bm.L[i]<<" \n"[i==n];
		else
			cout<<0<<" \n"[i==n];
	}
}
 
 
signed main() {
	ios_base::sync_with_stdio(0),cin.tie(0),cout.tie(0);
	srand(chrono::high_resolution_clock::now().time_since_epoch().count());
	cout<<fixed<<setprecision(8);
	int t=1;
	cin>>t;
	while(t--)
		solve();
	return 0;
}
Tester's Solution
#include <bits/stdc++.h>
 
const int BUFFER_SIZE = int(1.1e5);
 
char _buf[BUFFER_SIZE + 10];
int _buf_pos, _buf_len;
 
char seekChar() {
    if(_buf_pos >= _buf_len) {
        _buf_len = fread(_buf, 1, BUFFER_SIZE, stdin);
        _buf_pos = 0;
    }
    assert(_buf_pos < _buf_len);
    return _buf[_buf_pos];
}
 
bool seekEof() {
    if(_buf_pos >= _buf_len) {
        _buf_len = fread(_buf, 1, BUFFER_SIZE, stdin);
        _buf_pos = 0;
    }
    return _buf_pos >= _buf_len;
}
 
char readChar() {
    char ret = seekChar();
    _buf_pos++;
    return ret;
}
 
int readInt(int lb, int rb) {
    char c = readChar();
    int mul = 1;
    if(c == '-') {
        c = readChar();
        mul = -1;
    }
    assert(isdigit(c));
 
    long long ret = c - '0';
    int len = 0;
    while(!seekEof() && isdigit(seekChar()) && ++len <= 19) {
        ret = ret * 10 + readChar() - '0';
    }
    ret *= mul;
 
    assert(len <= 18);
    assert(lb <= ret && ret <= rb);
    return (int)ret;
}
 
void readEoln() {
    char c = readChar();
    //assert(c == '\n');
    assert(c == '\n' || (c == '\r' && readChar() == '\n'));
}
 
void readSpace() {
    assert(readChar() == ' ');
}
 
namespace BipartiteMatching {
    const int MAXN = 505, MAXM = 50005;
    std::vector<int> gph[MAXN];
    int dis[MAXN], l[MAXN], r[MAXM], vis[MAXN];
    void clear(){ for(int i=0; i<MAXN; i++) gph[i].clear();	}
    void add_edge(int l, int r){ gph[l].push_back(r); }
    bool bfs(int n){
        std::queue<int> que;
        bool ok = 0;
        memset(dis, 0, sizeof(dis));
        for(int i=0; i<n; i++){
            if(l[i] == -1 && !dis[i]){
                que.push(i);
                dis[i] = 1;
            }
        }
        while(!que.empty()){
            int x = que.front();
            que.pop();
            for(auto &i : gph[x]){
                if(r[i] == -1) ok = 1;
                else if(!dis[r[i]]){
                    dis[r[i]] = dis[x] + 1;
                    que.push(r[i]);
                }
            }
        }
        return ok;
    }
    bool dfs(int x){
        if(vis[x]) return false;
        vis[x] = 1;
        for(auto &i : gph[x]){
            if(r[i] == -1 || (!vis[r[i]] && dis[r[i]] == dis[x] + 1 && dfs(r[i]))){
                l[x] = i; r[i] = x;
                return true;
            }
        }
        return false;
    }
    int match(int n){
        memset(l, -1, sizeof(l));
        memset(r, -1, sizeof(r));
        int ret = 0;
        while(bfs(n)){
            memset(vis, 0, sizeof(vis));
            for(int i=0; i<n; i++) if(l[i] == -1 && dfs(i)) ret++;
        }
        return ret;
    }
}
 
int sumN = 0;
 
void run() {
    int N = readInt(1, 500);
    sumN += N;
    assert(sumN <= 2000);
    readSpace();
 
    int K = readInt(1, 100);
    readEoln();
 
    std::vector<std::vector<int>> rev;
    rev.reserve(K);
    for(int k = 0; k < K; k++) {
        std::vector<int> pos(N, -1);
        for(int i = 0; i < N; i++) {
            int j = readInt(1, N) - 1;
            if(i+1 < N) readSpace(); else readEoln();
            assert(pos[j] == -1);
            pos[j] = i;
        }
        rev.push_back(pos);
    }
 
    BipartiteMatching::clear();
 
    std::vector<std::vector<bool>> gph(N, std::vector<bool>(N));
    for(int i = 0; i < N; i++) {
        for(int j = 0; j < N; j++) {
            bool good = true;
            for(int k = 0; k < K && good; k++) good &= rev[k][i] < rev[k][j];
            if(good) BipartiteMatching::add_edge(i, j);
        }
    }
 
    int ans = N - BipartiteMatching::match(N);
    printf("%d\n", ans);
    for(int i = 0; i < N; i++) {
        printf("%d%c", BipartiteMatching::l[i] + 1, i+1 < N ? ' ' : '\n');
    }
}
 
int main() {
#ifndef ONLINE_JUDGE
    freopen("input.txt", "r", stdin);
#endif
 
    int T = readInt(1, 100);
    readEoln();
 
    while(T--) {
        run();
    }
    return 0;
}
Editorialist's Solution
#include <bits/stdc++.h>
using namespace std;

const int mxN=500;
int n, k, a[mxN], p[mxN], mt[mxN];
bool adj[mxN][mxN], vis[mxN];

bool dfs(int u) {
	vis[u]=1;
	for(int v=0; v<n; ++v) {
		if(adj[u][v]&&(mt[v]<0||!vis[mt[v]]&&dfs(mt[v]))) {
			mt[v]=u;
			return 1;
		}
	}
	return 0;
}

void solve() {
	//input
	cin >> n >> k;
	//all edges are in G initially, except for i->i
	memset(adj, 1, sizeof(adj));
	for(int i=0; i<n; ++i)
		adj[i][i]=0;
	//go through permutations
	while(k--) {
		for(int i=0; i<n; ++i) {
			cin >> a[i], --a[i];
			//remove edges
			for(int j=0; j<i; ++j)
				adj[a[j]][a[i]]=0;
		}
	}

	//the rest is standard "min path cover on DAG"
	//max matching
	memset(mt, -1, 4*n);
	int flow=0;
	for(int i=0; i<n; ++i) {
		memset(vis, 0, n);
		flow+=dfs(i);
	}

	//answer
	cout << n-flow << "\n";
	for(int i=0; i<n; ++i)
		cout << mt[i]+1 << " \n"[i==n-1];
}

int main() {
	ios::sync_with_stdio(0);
	cin.tie(0);

	int t;
	cin >> t;
	while(t--)
		solve();
}

Please give me suggestions if anything is unclear so that I can improve. Thanks :slight_smile:

10 Likes

Why does the solution need to be a collection of paths? I do not see anywhere that the in-degree of each node has to be no more than 1. Suppose we have a test case

8 3
2 4 1 7 3 8 5 6
2 4 1 3 7 8 5 6
2 4 3 1 8 5 7 6

My solution is

3
7 4 5 1 6 0 6 5

corresponding to a graph with edges directed left to right

2 - 4 - 1 - 7 \
        3 - 5 - 6
        8 /

8 and 3 are both linked to 5, and 7 and 5 both linked to 6. What is wrong with this?

It’s not wrong, I only say that there’s an optimal solution consisting of just paths.

Then I wonder why I got ‘wrong answer’. Perhaps I made an unrelated mistake.

Well, I don’t know of a solution which doesn’t assume that the answer is a collection of paths.

Here is my solution, which I believe to be correct, but which gave ‘wrong answer’ https://www.codechef.com/viewsolution/30498490

I created class PackedBits simply to speed up the handling of operations on each array of links to or from a node.

I linked it in the prerequisites section

1 Like

What does RemoveOutLinksBFS do exactly, and why is it correct?

Thank you for looking at my solution. I believe RemoveOutLinksBFS to be correct because I have worked through it for a few examples. This is how it works.

At entry to RemoveOutLinksBFS we have all the links between nodes which are allowed by the various permutations. The links are stored packed in instances of PackedBits,from where we can extract by either of the following two functions

    public int SingleItem()
    // Return single item, = 0 none, = -1 more than one

    public List<int> Items()
    // All items which are set, empty if none

In this function I want to remove links so that each node has no more than one OutLink (OutDegree <= 1) and the number of nodes with InDegree 0 is a minimum.

The algorithm (which I devised myself) is as follows.

Prepare to work through the nodes by BFS.
First build a queue containing those nodes with InDegree 0.
Then work through the queue until it is empty.

If the current item has InDegree more than 1, it has been dealt with already, so go on to the next node in the queue.

Get all the OutLinks from this node. If there are none, do nothing and go on to the next item. If there is more than one outlink, look for any node to which the current node is linked which has InDegree 1. If there is more than one such node, choose the first one encountered, arbitrarily. If there are none, choose the first node to which linked, again arbitrarily.

Remove all the links other than the chosen one. While removing the links, if any of the nodes to which we are removing the link has its InDegree reduced to 0, add this node to the queue.

If the remaining node to which the current node is linked has InDegree of 1, and it is not already in the queue, add it to the queue.

I don’t see why it would be optimal

The requirement is to minimise the number of nodes with InDegree 0, while removing links so that no node has OutDegree more than 1.

When deciding which links to remove, I work from left to right by BFS, starting with the nodes with InDegree 0. At each node I check the InDegree of all nodes to which this node is linked. If I delete any link to a node with InDegree 1 the InDegree of the target node will be reduced to 0, a situation I wish to avoid wherever possible. So on finding such a link I keep it, and delete the others. In doing so I may reduce the InDegree of other target nodes to 0, but this is unavoidable.

I do not see how this method fails to produce the required result, but I may have missed something.

I also tried some greedy algorithms, but there would always be some counterexample that broke everything. I now believe there is no correct greedy solution here, for two reasons:

  1. Meta-reason: This problem can be reduced to bipartite matching (I did this by splitting each node into an “out” part and an “in” part), and it doesn’t seem to be much easier than the general bipartite matching problem. I think if there was a valid greedy algorithm, somebody would’ve discovered and published a similarly greedy approach for bipartite graphs, which does not seem to be the case.
  2. Real reason: Greedy algorithms that only consider degrees would never know which node of indegree 1 to choose. You write:

If there is more than one such node, choose the first one encountered, arbitrarily.

This is not optimal. Choosing a wrong edge can prevent it from picking two correct edges instead. Consider a graph like this (note that it can be represented in the input format):

Screenshot from 2020-03-27 12-04-27

If I read your explanation correctly, the algorithm might happen to start at node 2, and then arbitrarily pick the edge 2–>6, which will make the node 3 useless and produce a suboptimal solution.

6 Likes

Thank you for your explanation, with the example. I had been unable to think of an example which would defeat my method. In writing my (faulty) method, I have never heard of bipartite matching.

1 Like

Thanks @tmwilliamlin for writing simple and easy to understand code :slightly_smiling_face:

1 Like

It’s size n. In memset you have to give the size of the array, and each int is 4 bytes.

1 Like