GRDSTD - Editorial

PROBLEM LINK:

Practice
Div-1 Contest

Author & Editorialist: Md Sabbir Rahman
Tester: Radoslav Dimitrov

DIFFICULTY:

SUPER-HARD

PREREQUISITES:

Geometry, Dynamic Programming, DAG, Persistent Data Structures

PROBLEM:

We are given n points on the two-dimensional plane. Among all possible convex polygons that can be made by using a subset of the points as vertices, output the areas of the k largest ones.

Quick Explanation

For subtasks 1 and 2 we can use convex hulls and a priority queue to output the areas; see below for the full explanation. For the full constraints, we sort all possible directed segments between the points by angle, then run a dynamic programming approach over those segments that yields the largest convex-polygon area. This DP gives rise to a weighted DAG on which we have to find the k longest paths, which we can do with a persistent data structure and a priority queue.

Explanation

In the following sections, by area we mean twice the area, since that is easier to maintain in integer arithmetic. Also, if fewer than k distinct convex polygons exist, we output -1 for each of the remaining values.

Subtask 1

For subtask 1, k was 2. The largest area is always that of the convex hull of the whole point set. For the second-largest area, we can try removing a single point from the convex hull's boundary and recomputing the hull. Every other convex polygon omits at least one hull vertex and is therefore contained in one of these one-point-removed hulls, so one of them must attain the second-largest area.
The complexity of the solution is O(n^2 \log n).
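
A minimal sketch of this approach is below. All names here (convexHull, area2, secondLargest2) are ours rather than from the official solutions; it assumes the standard monotone-chain hull and shoelace formula, and, as everywhere in this editorial, it works with twice the area.

#include <bits/stdc++.h>
using namespace std;
typedef long long ll;

struct P {
    ll x, y;
    bool operator<(const P &o) const { return make_pair(x, y) < make_pair(o.x, o.y); }
    bool operator==(const P &o) const { return x == o.x && y == o.y; }
};

ll cross(const P &o, const P &a, const P &b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Monotone-chain convex hull, counter-clockwise, collinear points dropped.
vector<P> convexHull(vector<P> pts) {
    sort(pts.begin(), pts.end());
    int n = (int)pts.size(), k = 0;
    if (n <= 2) return pts;
    vector<P> h(2 * n);
    for (int i = 0; i < n; i++) {                  // lower hull
        while (k >= 2 && cross(h[k-2], h[k-1], pts[i]) <= 0) k--;
        h[k++] = pts[i];
    }
    for (int i = n - 2, t = k + 1; i >= 0; i--) {  // upper hull
        while (k >= t && cross(h[k-2], h[k-1], pts[i]) <= 0) k--;
        h[k++] = pts[i];
    }
    h.resize(k - 1);
    return h;
}

// Twice the polygon area via the shoelace formula.
ll area2(const vector<P> &poly) {
    ll s = 0;
    for (size_t i = 0; i < poly.size(); i++) {
        const P &a = poly[i], &b = poly[(i + 1) % poly.size()];
        s += a.x * b.y - a.y * b.x;
    }
    return llabs(s);
}

// Subtask 1: twice the second-largest area, or -1 if none exists.
ll secondLargest2(const vector<P> &pts) {
    ll best = -1;
    for (const P &v : convexHull(pts)) {           // drop one hull vertex
        vector<P> rest;
        for (const P &p : pts) if (!(p == v)) rest.push_back(p);
        vector<P> h = convexHull(rest);
        if (h.size() >= 3 && area2(h) > 0) best = max(best, area2(h));
    }
    return best;
}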

Subtask 2

In subtask 2, k <= 1000. Let's try to extend the idea of subtask 1. We start with the convex hull of the whole set of points and put that set in a priority queue. The priority queue contains sets of points, keyed on the area of their convex hull. At each stage, we pick the set S at the priority queue's top and output its area. After that, we try removing a single point from S, creating S', and compute the convex hull of S'. If the hull is new (which we track by maintaining a set of the convex hulls seen so far), we push S' into the priority queue. Since any polygon obtainable from S' by further removals has area at most that of S', smaller polygons always come later, so we can output the k largest areas by popping the priority queue k times.

After popping the top set from the priority queue, we have to check the subsets obtained by removing one point, and there are O(n) such choices for each set, each requiring a convex hull computation. So the overall complexity of the solution is O(k n^2 \log k).
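
A sketch of this loop follows, reusing struct P and the convexHull/area2 helpers from the subtask-1 sketch above; hulls seen so far are canonicalized by sorting their vertices so they can be stored in a std::set.

// Subtask 2: print twice the k largest areas (with -1 fillers at the end).
void kLargest2(const vector<P> &pts, int k) {
    auto canon = [](vector<P> h) { sort(h.begin(), h.end()); return h; };

    // priority queue of (twice-the-area, point set), largest area on top
    priority_queue<pair<ll, vector<P>>> pq;
    set<vector<P>> seen;                         // canonical hulls seen so far

    seen.insert(canon(convexHull(pts)));
    pq.push({area2(convexHull(pts)), pts});

    while (!pq.empty() && k > 0) {
        vector<P> s = pq.top().second;
        cout << pq.top().first << " ";
        pq.pop();
        k--;

        for (size_t i = 0; i < s.size(); i++) {  // remove one point at a time
            vector<P> t = s;
            t.erase(t.begin() + i);
            vector<P> h = convexHull(t);
            if (h.size() < 3 || area2(h) == 0) continue;  // degenerate
            if (seen.insert(canon(h)).second)    // hull not seen before
                pq.push({area2(h), t});
        }
    }
    while (k-- > 0) cout << -1 << " ";
    cout << "\n";
}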

Subtask 3

For subtask 3, k was 1000000. We can still use a priority queue, but we can no longer afford to spend much time on each of the convex polygons we extract, so the previous method isn't helpful. We have to find another way to look at all possible convex polygons.

Consider a dynamic programming approach that gives us the largest possible convex polygon. We create the O(n^2) directed segments that can be made from the points and sort them by angle. Now we process these segments in order while maintaining a DP table dp[i][j] = largest open area over all convex polylines that start at point i and end at point j. Here open area means the polyline may not be closed yet, but the contribution of each of its edges to the final total area has already been added (shoelace formula). Due to the angular order of the segments, the polylines created are always convex. After the DP is done, we look at all dp[i][i] to find the largest possible area. This takes O(n^3) time. Note that since 3 input points can be collinear, the comparator used to sort the segments must break ties between segments that have the same direction.
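
A minimal sketch of this DP (again reusing struct P from the subtask-1 sketch) that computes only the single largest doubled area; it omits the collinear tie-break for brevity and therefore assumes no three input points are collinear (the setter's comparator below handles the general case).

ll largestConvexArea2(const vector<P> &p) {
    int n = (int)p.size();
    vector<pair<int, int>> segs;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (i != j) segs.push_back({i, j});

    // sort directed segments (a, b) by the angle of p[b] - p[a]
    auto half = [](ll x, ll y) { return y > 0 || (y == 0 && x < 0); };
    sort(segs.begin(), segs.end(), [&](const pair<int,int> &s, const pair<int,int> &t) {
        ll vx = p[s.second].x - p[s.first].x, vy = p[s.second].y - p[s.first].y;
        ll wx = p[t.second].x - p[t.first].x, wy = p[t.second].y - p[t.first].y;
        if (half(vx, vy) != half(wx, wy)) return half(vx, vy) < half(wx, wy);
        return vx * wy - vy * wx > 0;          // v strictly before w
    });

    const ll NEG = -(ll)1e18;
    // dp[i][j]: best open area of a convex polyline from i to j
    vector<vector<ll>> dp(n, vector<ll>(n, NEG));
    for (int i = 0; i < n; i++) dp[i][i] = 0;  // empty polyline at i

    for (auto &s : segs) {
        int a = s.first, b = s.second;
        for (int i = 0; i < n; i++) {
            if (dp[i][a] == NEG) continue;
            // fan triangle (i, a, b); edges touching i contribute 0,
            // so dp[i][i] ends up holding twice the closed polygon's area
            ll w = (p[a].x - p[i].x) * (p[b].y - p[a].y)
                 - (p[a].y - p[i].y) * (p[b].x - p[a].x);
            dp[i][b] = max(dp[i][b], dp[i][a] + w);
        }
    }
    ll best = 0;
    for (int i = 0; i < n; i++) best = max(best, dp[i][i]);
    return best;                               // 0 if no positive-area polygon
}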

But we could have easily done this with a convex hull, so what is the use of this DP? The use is that if we do not always update the DP states optimally, then across the possible orders of updates every convex polygon comes up. In other words, if we build the weighted directed acyclic graph represented by the DP transitions, then any path from an initial dp[i][i] node to a final dp[i][i] node represents a convex polygon built from the points. This DAG might initially seem to be O(n^4) in size, since a node is a triple (i, j, k) representing a polyline that starts at point i, ends at point j, and is made from the first k sorted segments. But the DP performs only O(n^3) transitions (n per segment), and each transition creates one new node with at most three incoming edges, so the DAG can be compressed to O(n^3) size; see the solution codes for details. To make the implementation cleaner we add two special nodes, source and sink, with 0-weight edges from the source to all initial dp[i][i] nodes and from all final dp[i][i] nodes to the sink. Then any convex polygon is represented by some path from the source to the sink.

So the question is now: given a weighted DAG G, find the k largest path costs from the source node to the sink. The DP we did is essentially a longest-path computation on the DAG, so for each node u we know the node par(u) such that the longest path from the source to u arrives through the edge par(u) \rightarrow u. These edges form a tree on the DAG, which we call the optimal tree and whose edges we call optimal edges. Any path from source to sink can be described as following the optimal tree for a while, taking a non-optimal edge, following the optimal tree again, and so on; in particular, before the first non-optimal edge the path can be implicitly taken to lie on the optimal tree. A path is therefore fully determined by its sequence of non-optimal edges. So our goal is to enumerate the sequences of non-optimal edges ending at the sink, sorted by the weights of the paths they represent. If we can do that quickly, we'll solve the problem.

The next obstacle is: given a sequence of non-optimal edges, how do we get the weight of the full path the sequence represents? For every non-optimal edge u \rightarrow v we define a new weight penalty(u, v) = longest(v) - longest(u) - cost(u, v), where longest(x) is the longest path weight from the source to x and cost(u, v) is the original weight of the edge. This new weight is a penalty for using the edge, and adding up the penalties over a sequence of non-optimal edges gives exactly how far the corresponding path falls short of the longest path. So we have to order the sequences by the sum of their edges' new weights.
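
To see why, take any path P: source = v_0 \rightarrow v_1 \rightarrow \dots \rightarrow v_m = sink and telescope its weight, using penalty(u, v) = 0 for optimal edges and longest(source) = 0:

weight(P) = \sum_{t=1}^{m} cost(v_{t-1}, v_t) = \sum_{t=1}^{m} \big( longest(v_t) - longest(v_{t-1}) - penalty(v_{t-1}, v_t) \big) = longest(sink) - \sum_{e \in P} penalty(e)

Hence enumerating sequences by increasing total penalty enumerates the corresponding paths by decreasing weight.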

The overall algorithm is as follows. We start with the empty sequence (which represents the longest path itself) and put it in a priority queue that orders sequences by total penalty; the sequences are built backwards from the sink. When we pop a sequence of non-optimal edges from the priority queue, we print the area of the path it represents and then extend it by prepending a new non-optimal edge at its beginning. For this we need all the non-optimal edges from which the sequence's current first node can be reached using only optimal edges; we maintain this for every node u as heap(u), stored in a persistent segment tree that keeps the edges sorted by new weight, so that we extend towards only one candidate edge at a time. Each popped sequence A spawns at most two new sequences: one that prepends the cheapest available non-optimal edge to A, and one that replaces the edge by which A was created with the next-cheapest edge from the heap of the sequence B that A came from. This is essentially Eppstein's k shortest paths technique, adapted here to longest paths.

Complexity of the solution is O(n^3 \log n + k (\log n + \log k)).

Alternate Solution

Other persistent data structures can be used instead of persistent segment trees; see the tester's solution for a persistent randomized heap. Also, since a polygon has at most n edges, any sequence of non-optimal edges has length at most n, so the persistent data structure can be avoided altogether at the cost of a higher complexity, though such a solution is a little hard to get accepted; with some heuristics it can still be made quite fast.

Setter's Solution
#include <bits/stdc++.h>
using namespace std;
typedef long long int ll;

const int nmax = 125;

struct point{
    ll x, y;
    point(ll _x = 0, ll _y = 0): x(_x), y(_y) {}
    point operator-(const point &q) const{
        return point(x - q.x, y - q.y);
    }
    bool operator<(const point &q) const{
        return x == q.x? (y<q.y) : (x<q.x);
    }
};

inline ll cross(point p, point q){
    return p.x * q.y - p.y * q.x;
}

inline ll dot(point p, point q){
    return p.x * q.x + p.y * q.y;
}

inline bool half(const point &p){
    return p.y > 0 || (p.y == 0 && p.x < 0);
}

ll area(point &a, point &b, point &c){
    return cross(b-a, c-b);
}

int id[nmax][nmax], cnt;

ll memo[nmax*nmax*nmax];
vector<pair<ll, int>> adj[nmax*nmax*nmax];
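/// id[i][j] : index of the current DAG node for state (i, j)
/// memo[v]  : longest path weight from the source (node 0) to node v
/// adj[v]   : incoming edges of v as (weight, predecessor) pairs;
///            the node created last in dp() acts as the sink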

/// angular sort all directed segments
/// and do a dp on state (i, j) ->
/// all polyline starting at point i and ending at point j
/// using segments processed so far
/// Also build the underlying DAG
ll dp(vector<point> &input){
    int n = (int) input.size();

    vector<pair<int, int>> segments;
    for(int i = 0; i<n; i++)
        for(int j = 0; j<n; j++)
            if(i != j) segments.emplace_back(i, j);

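    /// sort directed segments by angle; for two parallel segments the
    /// dot-product term fixes a consistent processing order, which is what
    /// makes inputs with collinear points safe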
    sort(segments.begin(), segments.end(), [&](auto &p, auto &q){
            point v = input[p.second] - input[p.first];
            point w = input[q.second] - input[q.first];
            return make_tuple(half(v), 0ll, dot(v, input[q.second] - input[p.second])) <
                   make_tuple(half(w), cross(v, w), 0ll);
        });

    cnt++;

    for(int i = 0; i<n; i++){
        for(int j = 0; j<n; j++){
            id[i][j] = cnt;
            memo[cnt] = -1e17;
            adj[cnt].emplace_back(-1e17, 0);
            cnt++;
        }
    }

    for(auto &p : segments){
        int a = p.first, b = p.second;
        for(int i = 0; i<n; i++){

            memo[cnt] = -1e17;

            if(i == a){
                adj[cnt].emplace_back(0, 0);
                memo[cnt] = 0;
            }

            int prv = id[i][b];
            adj[cnt].emplace_back(0, prv);
            memo[cnt] = max(memo[cnt], memo[prv]);

            int from = id[i][a];
            ll weight = area(input[i], input[a], input[b]);
            adj[cnt].emplace_back(weight, from);
            memo[cnt] = max(memo[cnt], weight + memo[from]);

            id[i][b] = cnt;
            cnt++;
        }
    }

    for(int i = 0; i<n; i++){
        adj[cnt].emplace_back(0, id[i][i]);
        memo[cnt] = max(memo[cnt], memo[id[i][i]]);
    }
    return memo[cnt];
}


namespace PST{
    struct Node{
        Node *lc, *rc;
        int cnt;
        Node(int _cnt = 0): lc(NULL), rc(NULL), cnt(_cnt) {}
        Node(Node *n){
            cnt = n->cnt, lc = n->lc, rc = n->rc;
        }
    };

    Node *build(int l, int r){
        if(l == r){
            Node *tmp = new Node(0);
            return tmp;
        }
        Node *n = new Node();
        int mid = (l+r)>>1;
        n->lc = build(l, mid);
        n->rc = build(mid+1, r);
        n->cnt = n->lc->cnt + n->rc->cnt;
        return n;
    }

    Node *update(Node *n, int l, int r, int k){
        if(l == r){
            Node *tmp = new Node(n->cnt + 1);
            return tmp;
        }
        Node *t = new Node(n);
        int mid = (l+r)>>1;
        if(k <= mid)
            t->lc = update(n->lc, l, mid, k);
        else
            t->rc = update(n->rc, mid+1, r, k);
        t->cnt = t->lc->cnt + t->rc->cnt;
        return t;
    }

    int query(Node *n, int l, int r, int k){
        if(l == r)
            return l;
        int mid = (l+r)>>1;
        if(n->lc->cnt >= k)
            return query(n->lc, l, mid, k);
        else
            return query(n->rc, mid+1, r, k-n->lc->cnt);
    }
};

PST::Node *heaps[nmax*nmax*nmax];
pair<ll, int> edges[nmax*nmax*nmax];
int E = 0;

/// build the edges with new weights
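/// an edge's new weight is memo[pred] + cost - memo[node] <= 0 (the negated
/// penalty); a 0-weight optimal edge is swapped into slot 0 of each list and
/// the remaining reachable edges are collected as sidetrack candidates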
void buildEdges(){
    for(int i = 1; i<=cnt; i++){
        for(int j = 0; j<adj[i].size(); j++){
            auto &p = adj[i][j];
            p.first = memo[p.second] + p.first - memo[i];
            if(p.first == 0) swap(adj[i][j], adj[i][0]);
        }
        for(int j = 1; j<adj[i].size(); j++){
            if(memo[adj[i][j].second] < 0) continue;
            edges[E++] = adj[i][j];
        }
    }
    edges[E++] = {0, cnt};
    sort(edges, edges+E);
}

/// build heaps (persistent segment trees) on each node
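/// heaps[u] holds the sidetrack edges hanging off the optimal-tree path from
/// the source to u: u inherits its optimal parent's heap for free thanks to
/// persistence, then inserts its own non-optimal incoming edges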
void buildHeaps(){

    heaps[0] = PST::build(0, E-1);

    for(int i = 1; i<=cnt; i++){
        if(memo[i] < 0) continue;
        if(adj[i].empty()) continue;
        heaps[i] = heaps[adj[i][0].second];
        for(int j = 1; j<adj[i].size(); j++){
            if(memo[adj[i][j].second] < 0) continue;
            int ind = int(lower_bound(edges, edges+E, adj[i][j]) - edges);
            heaps[i] = PST::update(heaps[i], 0, E-1, ind);
        }
    }
}

/// use priority queue to find kth longest path on the DAG
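/// pq entries are (path weight, node whose heap supplied the newest sidetrack
/// edge, that edge's index in edges[], its rank inside that heap); stored
/// penalties are <= 0, so the highest rank is always the cheapest sidetrack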
void kthPath(int k){
    priority_queue<tuple<ll, int, int, int>> pq;
    int eid = int(lower_bound(edges, edges+E, make_pair(0ll, cnt)) - edges);
    pq.emplace(memo[cnt], -1, eid, 0);

    bool first = true;
    while(!pq.empty() && k > 0){
        int from, meEdge, ecnt;
        ll d;
        tie(d, from, meEdge, ecnt) = pq.top();
        pq.pop();
        int me = edges[meEdge].second;

        if(d == 0) break;

        if(first) first = false;
        else cout<<" ";

        cout<<d;
        k--;

        if(ecnt > 1){
            eid = PST::query(heaps[from], 0, E-1, ecnt-1);
            pq.emplace(d - edges[meEdge].first + edges[eid].first, from, eid, ecnt-1);
        }

        int tmp = heaps[me]->cnt;
        if(tmp > 0){
            eid = PST::query(heaps[me], 0, E-1, tmp);
            pq.emplace(d + edges[eid].first, me, eid, tmp);
        }
    }

    for(int i = 0; i<k; i++) {
        cout<<" "<<-1;
    }
    cout<<endl;
}




int main(){
    ios::sync_with_stdio(false);
    cin.tie(NULL);
    cout.tie(NULL);

    int n, k;
    cin>>n>>k;

    assert(3 <= n && n <= 120);
    assert(1<= k && k <= 1000000);

    set<pair<int, int>> st;
    vector<point> input(n);
    for(int i = 0; i<n; i++){
        cin>>input[i].x>>input[i].y;

        assert(int(-1e6) <= input[i].x && input[i].x <= int(1e6));
        assert(int(-1e6) <= input[i].y && input[i].y <= int(1e6));

        st.insert({input[i].x, input[i].y});
    }

    assert(st.size() == n);


    dp(input);
    buildEdges();

    buildHeaps();
    kthPath(k);
}
Tester's Solution
#include <bits/stdc++.h>
#define endl '\n'

#define SZ(x) ((int)x.size())
#define ALL(V) V.begin(), V.end()
#define L_B lower_bound
#define U_B upper_bound
#define pb push_back

using namespace std;
template<class T, class T1> int chkmin(T &x, const T1 &y) { return x > y ? x = y, 1 : 0; }
template<class T, class T1> int chkmax(T &x, const T1 &y) { return x < y ? x = y, 1 : 0; }
const int MAXN = (1 << 7);
const int MAX_NODES = 1 << 22;
const int64_t inf = (int64_t)1e18;

mt19937 mt(42);

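// Persistent randomized mergeable heap: nodes are copied before modification
// so that older versions remain valid, `lazy` adds a value to a whole subtree,
// and merge() recurses into a random child to keep the expected depth small.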
struct node {
    int64_t val, lazy;
    node *l, *r;

    node() { l = r = nullptr; val = 0; lazy = 0; }
    node(int64_t _v) {
        val = _v;
        lazy = 0;
        l = r = nullptr;
    }
};

using pnode = node*;

void push(pnode &t) {
    if(!t) return;
    if(t->lazy) {
        t->val += t->lazy;
        if(t->l) {
            t->l = new node(*t->l);
            t->l->lazy += t->lazy;
        }

        if(t->r) {
            t->r = new node(*t->r);
            t->r->lazy += t->lazy;
        }

        t->lazy = 0;
    }
}

pnode merge(pnode a, pnode b) {
    push(a);
    push(b);

    if(!a) return new node(*b);
    if(!b) return new node(*a);

    if(a->val < b->val) swap(a, b);

    pnode ret = new node(*a);
    if(mt() % 2) {
        ret->r = merge(ret->r, b); 
    } else {
        ret->l = merge(ret->l, b);
    }

    return ret;
}

struct PT {
    int x, y;
    PT() { x = y = 0; }
    PT(int _x, int _y) {
        x = _x;
        y = _y;
    }

    PT operator-(PT other) {
        return PT(x - other.x, y - other.y);
    }

    PT operator+(PT other) {
        return PT(x + other.x, y + other.y);
    }

    int64_t operator%(PT other) {
        return x * 1ll * other.y - other.x * 1ll * y;
    }
};

struct state {
    int64_t score;
    pnode t;

    state() {
        score = 0;
        t = nullptr;
    }

    state(int64_t v, pnode _t) {
        score = v;
        t = _t;
    }

    friend bool operator<(const state &current, const state &other) {
        return current.score < other.score;
    }
};

int n, k;
PT p[MAXN];

void read() {
    cin >> n >> k;
    for(int i = 0; i < n; i++) {
        cin >> p[i].x >> p[i].y;
    }
}

int64_t sq(PT p) {
    return p.x * 1ll * p.x + p.y * 1ll * p.y;
}

inline int quad(PT p) {
    if(p.x < 0 and p.y < 0) return 0;
    if(p.x >= 0 and p.y < 0) return 1;
    if(p.x >= 0 and p.y >= 0) return 2;
    if(p.x < 0 and p.y >= 0) return 3;
    assert(false);
}

inline bool half(const PT &p){
    return p.y > 0 || (p.y == 0 && p.x < 0);
}

int64_t dot(PT a, PT b) { return a.x * 1ll * b.x + a.y * 1ll * b.y; }

bool cmp_polar(pair<PT, pair<int, int>> f, pair<PT, pair<int, int>> s) {
    PT v = f.first;
    PT w = s.first;
    return make_tuple(half(v), 0ll, dot(v, p[s.second.second] - p[f.second.second])) <
                   make_tuple(half(w), v % w, 0ll);
}

int main_node[MAXN][MAXN], sink_nodes[MAXN];
vector<pair<int, int64_t>> adj[MAX_NODES];
int64_t dp[MAX_NODES];

int psz;
int new_node() { 
    dp[psz] = -inf;
    return psz++; 
}

bool used[MAX_NODES];
pnode heaps[MAX_NODES];

void dfs(int u) {
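    // Builds heaps[u]: a persistent max-heap whose multiset of values is the
    // total (non-positive) penalty of every path from u to the sink T; child
    // heaps are shared, shifted by each edge's penalty w via the lazy tag.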
    used[u] = true;
    heaps[u] = nullptr;

    for(auto e: adj[u]) {
        int v = e.first;
        int64_t w = dp[u] - dp[v] + e.second;
    
        if(!used[v]) dfs(v);

        if(heaps[v]) {
            pnode vcopy = new node(*heaps[v]);
            vcopy->lazy += w;
            push(vcopy);
            heaps[u] = merge(heaps[u], vcopy);
        }	
    }
}

void solve() {
    vector<pair<PT, pair<int, int>>> segments;
    for(int i = 0; i < n; i++) {
        for(int j = i + 1; j < n; j++) {
            segments.pb({p[j] - p[i], {i, j}});
            segments.pb({p[i] - p[j], {j, i}});
        }
    }

    psz = 0;
    int S = new_node(), T = new_node();
    dp[S] = 0;
    stable_sort(ALL(segments), cmp_polar);

    for(int i = 0; i < n; i++) {
        sink_nodes[i] = new_node();
        adj[sink_nodes[i]].pb({T, 0});
        for(int j = 0; j < n; j++) {
            main_node[i][j] = new_node();
            dp[main_node[i][j]] = (i == j) ? 0 : -inf;
            if(i == j) adj[S].pb({main_node[i][j], 0});
        }
    }

    for(auto it: segments) {
        int i = it.second.first, j = it.second.second;
        int64_t w = p[i] % p[j];
        for(int prv = 0; prv < n; prv++) {
            int nd = main_node[prv][i];
            if(dp[nd] != -inf) {				
                if(prv != j) {
                    int nw = new_node();
                    adj[main_node[prv][j]].pb({nw, 0});
                    adj[nd].pb({nw, w});
                    dp[nw] = max(dp[main_node[prv][j]], dp[nd] + w);
                    main_node[prv][j] = nw;
                } else {
                    chkmax(dp[sink_nodes[j]], dp[nd] + w); 					
                    adj[nd].pb({sink_nodes[j], w});
                }
            }
        }
    }

    used[T] = true;
    heaps[T] = new node(0);
    for(int i = 0; i < n; i++) {
        chkmax(dp[T], dp[sink_nodes[i]]);
    }

    dfs(S);

    priority_queue<state> pq;
    push(heaps[S]);
    pq.push(state(0, heaps[S]));

    while(!pq.empty() && k > 0) {
        int64_t area = pq.top().score + dp[T];
        pnode t = pq.top().t;
        pq.pop();

        if(area == 0) {
            break;
        }

        cout << area << " ";
        k--;

        push(t->l);
        push(t->r);
        if(t->l) pq.push(state(t->l->val, t->l));
        if(t->r) pq.push(state(t->r->val, t->r));
    }

    while(k) {
        k--;
        cout << -1 << " ";
    }

    cout << endl;
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(nullptr);

    read();
    solve();
    return 0;
}

VIDEO EDITORIAL: