NRWP - Editorial


Div-2 Contest
Div-1 Contest

Author & Editorialist: carre
Tester: Smelskiy




Max-flow min-cut theorem


Given N particles located at integers positions (x_i,y_i) of a grid and being E the set of pairs of particles located at adjacent positions, the problem ask us to choose a sign to each of the particles’ value p_i in order to maximize the total grid value given by

V = \sum\limits_{(i,j)\in E} p_ip_j \;+\; \sum\limits_{i=1...N} p_i \, h(x_i,y_i)\qquad\qquad [1]

where h(x_i,y_i) is the value of the cell containing particle i.


The problem can be transformed to a minimum cut problem where each of the connected components defined by the cut corresponds to the set of particles with the same sign assigned.


Let’s see that we can transform the problem into a minimum cut one.

Consider an undirected graph where each node corresponds to a particle and the edges connect particles located at adjacent cells. The value of the edge connecting particles i and j is given by \lvert p_i\, p_j\rvert. The graph we build this way includes so far the “interactions” between particles (first term of expression 1).
We need to take into account also the contribution of the second term. To do that, let’s consider two additional nodes, s and t, the former connected to every node located at position with non-negative h, the latter connected to every node located at position with non-positive h. The value given to each of these new edges is \lvert p_i \, h(x_i,y_i)\rvert.
As you can see, the values of the edges corresponds to the absolute value of the summands of expression that defines V.

If we define A = E \, \cup \, \lbrace s,t\rbrace , that is, the union of the pairs of adjacent particles and the nodes s and t, we can transform V in the following way

V = \sum\limits_{(i,j) \in A} J_i J_j \qquad\qquad \left\lbrace \begin{matrix} J_i = p_i& \text{ if }i \text{ is a particle}\\ J_i = h_j& \text{ if }i \in \lbrace s,t \rbrace\end{matrix} \right.

where h_j should be understood as h(x_j,y_j).

Let’s classify our nodes in two sets:

S = \lbrace \text{\small{nodes that contribute to a positive value }}J_i \rbrace\\ T = \lbrace \text{\small{nodes that contribute to a negative value }}J_i \rbrace

S contains node s and every node to which we assign a positive sign. T contains node t and every node to which we assign a negative sign.

Now V can be expressed as:

V = \sum\limits_{(i,j)\in A} \lvert J_i J_j \rvert \,-\, 2 \sum\limits_{i \in S ;\, j\in T} \lvert J_i J_j\rvert\qquad\qquad[2]

Where we have used the fact that only the terms that combine one element of S with one element of T will contribute with negative product. As we have summed these terms at first, we should subtract them twice.

In order to choose signs of J_i's that maximize V, the first term of last expression is irrelevant because it sums up over all nodes and takes absolute values. So, our goal is to minimize the second term, that depends on which particles belong to S and which ones to T.

As we said, the only contributions to the second term come from interactions between one of the nodes of S with one of the nodes of T, and those interactions are represented in our graph by the values of the edges. So, the problem of minimizing the second term of equation 2 is equivalent to the problem of finding a minimum cut that splits the graph in two components.

The Max Flow - Min Cut Theorem states that the minimum cut is equal to the max flow of the network. So we can arrive to the answer of our problem by using Edmond-Karp or Dinic’s algorithms to get the max flow (min cut) and replace it in equation 2.

Finally, in order to find the sign of each particle, once he have calculated the max flow, we can employ DFS over the residual graph to find the nodes that belong to the S component (or T). More specifically, all the nodes that we should associate positive values of p are reachable from s considering only edges with positive residual capacity.

With some minor modifications, this problem is used to model ferromagnetic (and antiferromagnetic) behaviours in materials. The V function corresponds to -Energy of the system, the p values to magnetic moments of particles and h to the position-dependent value of an external magnetic field. Maximizing V implies minimizing the energy of the system. A large grid with few particles simulates a diluted ferromagnetic material.
You can further read on the Ising model here.


Subtask 1

The value of N is so small that you can try the 2^N combinations of signs and keep the one that maximize V.

Subtask 2

A dynamic programming approach is possible, given the constraints. You can scan the grid from left to right. Every column has 2^P signs combinations, where P is the number of particles in that column (0 \le P \le H).
The dp state dp[i][j] can store the value of V if you consider up to column i and the index j is used to identify each one of the 2^P possible combinations of signs. The transition from column i-1 to column i requires to calculate the value of V for each of the 2^{P_i}*2^{P_{i-1}} combinations of signs.


Setter's Solution
#include <bits/stdc++.h>
using namespace std;
typedef vector<int> vi;
#define dbg(x) cerr << #x"= " << x << endl

const int INF=1E9;
const int GRDSZ=1002; // maximum grid size + 2
const int PARTICLES=201; // maximum particles number + 1
int h[GRDSZ][GRDSZ];  // stores the value's of the cells
int x[PARTICLES]; // stores the columns of the particles
int y[PARTICLES]; // stores the row of the particles
int p[PARTICLES]; // stores the absolute value of the particles
int up[PARTICLES];   // stores the sign given to each particle's value
int H;  // height of the grid
int W; // width of the grid
int N; // number of particles

namespace dinic { // dinic's algorithm to find max flow of a network
    struct edge {int to, rev, f, cap;};

    struct graph {
        int NN,src,dest;
        vi dist, q, work;
        vector<vector<edge> > g;

        graph(int n):NN(n){dist.resize(n);q.resize(n);work.resize(n);g.resize(n);}

        void add_edge(int s, int t, int cap) {
            edge a{t, (int)g[t].size(), 0, cap};
            edge b{s, (int)g[s].size(), 0, cap};

        bool bfs() {
            dist[src] = 0;
            int qsz = 0;
            q[qsz++] = src;
            for (int iq = 0; iq < qsz; ++iq) {
                int u = q[iq];
                for (auto &e : g[u]) {
                    int v =;
                    if (dist[v] < 0 && e.f < e.cap) {
                        dist[v] = dist[u] + 1;
                        q[qsz++] = v;
            return dist[dest] >= 0;

        int dfs(int u, int f) {
            if (u == dest) return f;
            for (int &i = work[u]; i < (int)g[u].size(); ++i) {
                edge &e = g[u][i];
                if (e.cap <= e.f) continue;
                int v =;
                if (dist[v] == dist[u] + 1) {
                    int df = dfs(v, min(f, e.cap - e.f));
                    if (df > 0) {
                        e.f += df;
                        g[v][e.rev].f -= df;
                        return df;
            return 0;

        int mf(int s, int t) {
            src = s; dest = t;
            int ret = 0;
            while (bfs()) {
                while (int delta = dfs(s, INF))
                    ret += delta;
            return ret;

void dfs_up(int node, const dinic::graph &G){
    for(const auto &e : G.g[node]){
        if (e.cap > e.f && up[]==-1)

void solve_case(){
    dinic::graph G(N+2); // network flow graph (source=0; dest = N+1)

    int V=0; // stores de value of the V function as defined in statement
    // we will first sum the contributions of abs(h_i * p_i) and abs(p_i * p_j) and then substract the negative ones

    for(int i=1;i<=N;++i){
        V += abs(h[y[i]][x[i]])*p[i];    // interaction between particle and cell
        if (h[y[i]][x[i]] >= 0) // if the cell has non-negative h, connect particle to source
        if (h[y[i]][x[i]] <= 0) // if the cell has non-positve h, connect particle to dest
        for(int j=i+1;j<=N;++j){ // we scan all other particles to find adjacent (N <= 200, so we can doit in N*N time)
            if (abs(y[i]-y[j]) + abs(x[i]-x[j])==1) { // if particles are adjacent, add an edge between them
                V += p[i]*p[j]; // interaction between particles
    // find max flow (min cut)
    int mincut =,N+1);

    V -= 2*mincut;  // we remove (twice) the nagative contributions (see editorial)

    // let's  find the sign of each particle
    fill(up,up + N + 1,-1); // initially set all of them to -1 (T component, see editorial);
    dfs_up(0,G); // we do dfs starting from source to find all nodes that belong to the S component and assign them sign +1;

    // output the results
    cout << V << '\n';
    for(int i=1;i<=N;++i)
        cout << up[i] << " \n"[i==N];

void read_case(){
    cin >> H >> W >> N;
    for (int i = 1; i <= H; ++i) {
        for (int j = 1; j <= W; ++j)
            cin >> h[i][j];
    for (int i = 1; i <= N; ++i)
        cin >> y[i] >> x[i] >> p[i];
int main() {
    int t;cin >> t;
    while (t--) {
Tester's Solution
#include <iostream>
#include <string>
#include <cassert>
#include <cstdio>
#include <algorithm>
#include <vector>
#include <queue>
using namespace std;
const int inf = 1'000'000'000;
int H, W, n;
int h[1005][1005];
int px[205];
int py[205];
int price[205];
vector<pair<int, int> > v[205];
vector<int> rev[205];
int used[205];
int timer;
int depth[205];
bool positive[205];
void add_edge(int x, int y, int cap) {
    v[x].push_back({y, cap});
    v[y].push_back({x, cap});
    rev[y].push_back((int)rev[x].size() - 1);
int build_graph() {
    int result = 0;
    for (int i = 0; i <= n + 1; ++i) {
        used[i] = 0;
    timer = 0;
    for (int i = 1; i <= n; ++i) {
        int cur = price[i] * h[px[i]][py[i]];
        result += abs(cur);
        if (cur >= 0) add_edge(0, i, cur);
        else add_edge(i, n + 1, -cur);
    for (int i = 1; i <= n; ++i) {
        for (int j = i + 1; j <= n; ++j) {
            if (abs(px[i] - px[j]) + abs(py[i] - py[j]) == 1) {
                int cur = price[i] * price[j];
                add_edge(i, j, cur);
                result += cur;
    return result;
bool make_bfs() {
    for (int i = 0; i <= n + 1; ++i)
        depth[i] = -1;
    depth[0] = 0;
    queue<int> qu;
    while (!qu.empty()) {
        int x = qu.front();
        for (int i = 0; i < v[x].size(); ++i) if (v[x][i].second > 0) {
            int to = v[x][i].first;
            if (depth[to] == -1) {
                depth[to] = depth[x] + 1;
    return (depth[n + 1] > -1);
int push_flow(int x, int flow, int dest) {
    if (flow <= 0 || used[x] == timer)
        return 0;
    if (x == dest) return flow;
    used[x] = timer;
    for (int i = 0; i < v[x].size(); ++i) {
        int to = v[x][i].first;
        if (depth[to] > depth[x] && v[x][i].second > 0) {
            int fl = push_flow(to, min(flow, v[x][i].second), dest);
            if (fl > 0) {
                v[x][i].second -= fl;
                v[to][rev[x][i]].second += fl;
                return fl;
    return 0;
int find_min_cut() {
    int max_flow = 0;
    while (make_bfs()) {
        while (true) {
            int cur_flow = push_flow(0, inf, n + 1);
            if (cur_flow == 0) break;
            max_flow += cur_flow;
    return max_flow;
void find_positive(int x) {
    positive[x] = true;
    for (int i = 0; i < v[x].size(); ++i) {
        int to = v[x][i].first;
        if (v[x][i].second > 0 && !positive[to])
void check_answer(int answer) {
    int result = 0;
    for (int i = 1; i <= n; ++i) {
        int cur = price[i] * h[px[i]][py[i]];
        if (!positive[i]) cur *= -1;
        result += cur;
    for (int i = 1; i <= n; ++i) {
        for (int j = i + 1; j <= n; ++j) {
            if (abs(px[i] - px[j]) + abs(py[i] - py[j]) == 1) {
                int cur = price[i] * price[j];
                if (!positive[i]) cur *= -1;
                if (!positive[j]) cur *= -1;
                result += cur;
    if (result != answer) assert(false);
void solve(int test_id) {
    cin >> H >> W >> n;
    for (int i = 1; i <= H; ++i) {
        for (int j = 1; j <= W; ++j) {
            cin >> h[i][j];
    for (int i = 1; i <= n; ++i) {
        cin >> px[i] >> py[i];
        cin >> price[i];
    int full_price = build_graph();
    full_price -= 2 * find_min_cut();
    cout << full_price << '\n';
    for (int i = 0; i <= n + 1; ++i)
        positive[i] = false;
    for (int i = 1; i <= n; ++i) {
        if (i > 1)
            cout << " ";
        cout << (positive[i] ? 1 : -1);
    cout << '\n';
//    check_answer(full_price);
int main(int argc, const char * argv[]) {
    #ifdef __APPLE__
    ios_base::sync_with_stdio(false); cin.tie(0); cout.tie(0);
    int tests;
    cin >> tests;
    for (int i = 0; i < tests; ++i) {
    return 0;

It is so wonderful to see a competitive programming question as being related to a real world application that I haven’t heard about before.


I’m glad you liked it! The model is well known in physics, but this particular solution is not very widespread, even among physicists as far as i know; and certainly didn’t find any problems related to it in any programming contest.


Great problem, awesome work!!! :ok_hand:

1 Like

Keep bringing such problems, help us grow.

I have written this code for 10 point but it is only passing first two test case can any one help me to find where I am wrong

I didn’t read your code but I remember that the testcase you are failing deals with particles in corners of the grid. Check if you are correctly dealing with the bounds W=1000 and H=1000. If you still can’t find your bug let me know and I will try to find it

Awesome problem!
Though, with given constrains N <= 200 and 2D grid the problem can be solved with dynamic programming approach.

Alright, let’s go with the following dynamic dp[mask][i] – the maximum answer we can get on p[0..i) with mask - signs of previous p_j for j = 0..i-1
There are only two transitions from the current state:

  1. If we assign sign +1 for p_i we update state dp[2^i | mask][i + 1]
  2. If we assign sign -1 for p_i we update state dp[mask][i + 1]

How many states do we have? – O(N 2^N)
How fast we can compute transitions for the given state?

  • O(N) with trivial approach – for the given vertex we check all previous vertexes;
  • O(1) - let’s note that each vertex has no more than 4 adjacent vertexes and let’s precompute them beforehand.

So, now we can solve the problem in O(N 2^N) with DP-approach.
Awesome, right?
Well, wait a minute… Naive brute-force solution has the same complexity.

Here is the trick: we don’t need to store all previous vertexes in the mask.
Let’s find an order of vertexes i_1, i_2, ... , i_N such as K = max |i_a - i_b| is minimal for all pairs of adjacent i_a, i_b.
If we manage to find such order, then we can solve the problem in O(N 2^K) – by storing only last K vertexes for each state, right?

Sorry, I don’t know how to find the proper order.
But here is another observation: if we take all vertexes in BFS order, then K = O(\sqrt{N}).
Which is already sufficient to solve the problem.

You can check the implementation here:


Your solution was fantastic! When I saw it, I blame myself for not taking larger N, the truth is that when chose the constraints I couldn’t imagine an unintended solution that could pass with N = 200… my bad and congratulations to you!

Despite my initial blame to myself, then I appreciated the bright side of constraints that leave the door open to really good and creative alternative approaches.

I saw another approach that was also able to solve given the constraints, but I don’t remember the name of the coder.

This was such a nice problem. Kudos to the setter. I remember an old question which was basically a min-cut problem. It was exactly like this. I spotted the similarity but wasn’t able to cleanly convert it.

1 Like

I also tried to use dp in bfs layers but a lot of problems arise. I don’t know how you managed to make it passed but I couldn’t.

First , seeing your code, you are just runing a bfs from the first root the you find. But then, the maximum size of the queue can be 2 \times \sqrt N. So K can be roughly 27 in worst case.

Imagine something like this:


The number 1 is your root for BFS. The second layer will have 1 node , the third layer will have 3 nodes, then 5 and so on. If the maximum number of nodes is 200, then your last layer will have 27 nodes.

Try this input :

It will make you to use 2^27 * 197 of memory and your code will fail.

I managed to “solve” this situation doing this : Simulate the BFS with every node as a root and pick the one that minimize the maximum size of the BFS queue. I think it can be proof that with this approach the maximum size is at most 19 (or maybe 15), because I think K can be also bigger than the maximum size of the bfs queue.

But even with that I couldn’t make it passed the testcases.

By the way, hats off to you , @carre . Amazing problem and editorial

1 Like

I had a similar solution to korneev. BFS may not always find it, but I’d conjecture there always exists a variable ordering for which you need to keep a history of at most 14. You can find that variable ordering by finding an appropriate vector in R^2, taking the dot product of each variable’s position with that vector, and sorting by the dot product.

My argument isn’t entirely rigorous, but here goes:

Suppose there’s no such vector. Then, for every direction, we can find a line in the plane separating our variables into a “left” half on one side of the line and and “right” half on the other such that 15 or more “left” vertices are adjacent to at least one “right” vertex.

There ought to be an isodiametric inequality of sorts showing that the smallest figure in the lattice with this property is a 15x15 square, which has 225 cells.

Indeed, I had the similar concerns and a way to solve them in my mind.
Though, my initial solution had passed all the tests and I didn’t implement it.

how can I generate such a big testcase?

writing a little code that do it for you. Something like this will generate random values for your h’s:

for (int i=0;i<H;++i){
    for(int j=0;j<W;++j)
        cout << rand()%1000 << " ";
    cout << '\n';

Anyway, if you accept a little advice, in case you are learning how to generate 1E6 inputs may be you should focus in other kind of problems, this one won’t help you with basic steps of coding.

1 Like

thank you so much
but my code giving correct ans for this testcase as well

please, check if you are not changing W and H meaning. I think the error can be due to that…

if (y+1<h){p1=B1[x][y+1].f;}else {p1=0;}

there you compare the second item of B1 with h, but h should be the limit of the first item, according to this line:


If that’s not the case, in a couple of hours I can check it more seriusly

Hey, if you are interesting to learn about stress testing I do have generator / reference solution for this problem.
To try this out you need to:

  1. Clone my repository git clone
  2. Go to the repository root cd competivie-programming (Yep, sorry for the typo in repository name)
  3. Restore the problem from examples ./ restore examples/codechef/MAY20/NRWP
  4. Replace main.cpp in the repository root with your code.
  5. Do stress testing ./ stress

Things might not work as expected since I’m the only user of my setup.
Feel free to ping me in PM on GitHub or codeforces (james007) if you face any trouble.

ok, I checked it. Here is your code with some modifications:

  • as I mentioned before, there’s a swap between w and h when you check x+1 and y+1.
  • out of bounds in the initialization of B
  • you initialize B1.s to 0 but then 0 is a valid particle, you should initialize to a number that can’t be a particle (I used 204)

In case you are interested, the following code implements the brute force approach in a simple way, what often makes the code easier to debug:

thank you
and sorry this is very silly mistake