PROBLEM LINK:
Author: Ildar Gainullin
Tester: Radoslav Dimitrov
Editorialist: Srikkanth
DIFFICULTY:
Easy-Medium
PREREQUISITES:
Minimum Spanning Tree algorithms
PROBLEM:
Given a n points in D dimensional space, find the maximum spanning tree of the complete graph with edge weights as manhattan distance between points.
QUICK EXPLANATION:
In the complete graph there are a lot of edges that will never be a part of any spanning tree.
We can identify 2^d points such that every edge that there will definitely exist a maximum spanning tree such that every edge of the spanning tree contains one of these 2^d points as an end point.
So we need not consider all n^2 graphs, rather only those edges that have atleast one end point in one of the 2^d points. We can then apply any maximum spanning tree algorithm on the smaller graph to obtain the result.
EXPLANATION:
First observation is we can apply any minimum spanning tree algorithm and change min to max and solve this problem.
We can construct the complete graph and easily solve the first subtask.
Recap of Prim’s algorithm
Suppose we have obtained some set of connected edges that we are sure lies in some minimum spanning tree.
Consider the points that belong to end points of edges visited so far, (say A)
Let the other points be B.
Consider all edges from some point in A and some point at B. Among these edges, atleast one edge must belong to the spanning tree so we can greedily choose the minimum (or maximum) to get the minimum (or maximum) spanning tree.
Initially A is any point we want, and we keep repeating the above until B is empty.
Let’s see if we can use any properties of the manhattan distance to speed up the algorithm
We cannot afford to construct all edges (since it’s already \mathcal{O}(n^2)).
However we can remove some redundant edges and still obtain the required Maximum Spanning Tree.
Suppose we are at some stage of Prim’s algorithm, we have the set A of visited nodes and complimentary set B of unvisited nodes.
Let’s look at the edge weight between a pair of points p_1 \in A and p_2 \in B.
d(p_1, p_2) = \sum_{i=1}^{d} |p_{1i} - p_{2i}|
Let’s consider the value |a - b|, we can rewrite it as a - b or b - a according to whether a \geq b or a \leq b
If we didn’t have the absolute value, we can combine the p_1 terms and the p_2 and split them.
Since we only have to find the maximum value of the sum of absolute values, let’s consider all 2^d combinations of each expression and take maximum of all of them. The maximum would definitely contain all d terms to be positive, and would be our actual edge weight.
Consider the 2^d possible expansions of d(p1, p2), i.e. with each summand as p_{1i} - p_{2i} or p_{2i} - p_{1i}.
The removes the absolute values and we can group all p_{1i} terms together and p_{2i} terms together.
We’ll obtain 2^d possible values for every point (according to whether each coordinate is positive or negative), let’s denote that by p_{i, mask}
The maximum among these 2^d terms p_{i, mask} - p_{j,mask} will be equal to d(p_1, p_2).
Let’s rewrite the formula for d(p_1, p_2),
d(p_1, p_2) = \max_{mask=0}^{2^d-1} p_{1, mask} - p_{2, mask}, and equivalently
d(p_1, p_2) = \max_{mask=0}^{2^d-1} p_{1, mask} + p_{2, \overline{mask}} where \overline{mask} is 1’s complement of mask.
We will always want to choose the point p_i that maximises p_{i, mask} and similarly p_{j, \overline{mask}}.
Let the point M_{mask} be the point that has maximum sum for the combination mask.
Consider the points M_{mask} and M_{\overline{mask}}.
- If both lie on the set A, then d(M_{mask}, p_2) \geq d(p_1, p_2)
- If both lie on the set B, then d(p_1, M_{\overline{mask}})\geq d(p_1, p_2)
- If one lies in A and other in B, then d(M_{mask}, M_{\overline{mask}}) \geq d(p_1, p_2)
So we need not consider the edge p_1, p_2 if neither of them are one of the points that maximise mask or \overline{mask}.
There are only 2^d maximum points (one for each point) and we need to consider only those edges that have atleast one end point among these 2^d points, because others will not be candidates for the maximum as per the argument shown above.
We have reduced the number of edges from n^2 to n 2^d, which is enough to solve the problem.
TIME COMPLEXITY:
TIME: \mathcal{O}(n 2^d (d + \log n))
SPACE: \mathcal{O}(n 2^d)
SOLUTIONS:
Setter's Solution
#include <cmath>
#include <functional>
#include <fstream>
#include <iostream>
#include <vector>
#include <algorithm>
#include <string>
#include <set>
#include <map>
#include <list>
#include <time.h>
#include <math.h>
#include <random>
#include <deque>
#include <queue>
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <iomanip>
#include <bitset>
#include <sstream>
#include <chrono>
#include <cstring>
using namespace std;
typedef long long ll;
#ifdef iq
mt19937 rnd(228);
#else
mt19937 rnd(chrono::high_resolution_clock::now().time_since_epoch().count());
#endif
const int N = 2e5 + 7;
int dsu[N];
int get(int v) {
if (v == dsu[v]) return v;
else return dsu[v] = get(dsu[v]);
}
void uni(int u, int v) {
dsu[get(u)] = get(v);
}
int main() {
#ifdef iq
freopen("a.in", "r", stdin);
#endif
ios::sync_with_stdio(0);
cin.tie(0);
int n, d;
cin >> n >> d;
vector <vector <int> > x(n, vector <int> (d));
for (int i = 0; i < n; i++) for (int j = 0; j < d; j++) cin >> x[i][j];
vector <pair <int, pair <int, int> > > e;
auto add_edge = [&] (int u, int v) {
int sum = 0;
for (int i = 0; i < d; i++)
sum += abs(x[u][i] - x[v][i]);
e.push_back({sum, {u, v}});
};
for (int mask = 0; mask < (1 << d); mask++) {
vector <pair <int, int> > e;
for (int i = 0; i < n; i++) {
int sum = 0;
for (int j = 0; j < d; j++) {
if ((mask >> j) & 1) {
sum += x[i][j];
} else {
sum -= x[i][j];
}
}
e.push_back({sum, i});
}
int s = min_element(e.begin(), e.end())->second;
int t = max_element(e.begin(), e.end())->second;
for (int x = 0; x < n; x++) {
add_edge(s, x);
add_edge(t, x);
}
}
sort(e.rbegin(), e.rend());
for (int i = 0; i < n; i++) dsu[i] = i;
ll ans = 0;
for (auto c : e) {
int u = c.second.first, v = c.second.second;
if (get(u) != get(v)) {
uni(u, v);
ans += c.first;
}
}
cout << ans << '\n';
}