MAXRECT - Editorial

PROBLEM LINK:

Practice
Contest

Author: Nikhil Garg
Tester: Anton Lunyov
Editorialist: Anton Lunyov

DIFFICULTY:

CHALLENGE

PREREQUISITES:

None

PROBLEM:

You are given a matrix M of the size H × W with integer elements. You need to find some its sub-matrix with large sum. You score depends on the sum you find. In order to get AC your sum should be positive. It is guaranteed that such sub-matrix exists. By sub-matrix we mean some matrix that can be obtained from the given one by removing some of its rows (maybe none, but no all) and removing some of its columns (again maybe none, but not all).

EXPLANATION:

Since sub-matrix with positive sum exists then there exists at least one positive element in the matrix. Hence in order to get AC you just need to find r and c such that M[r][c] > 0 and print
1 1
r c

Since we have some tests with very low number of positive elements we suggest to start any arbitrary smart solution with initializing the answer by this simplest correct sub-matrix.

The good idea how some smart solution can be started is the following. Assume that the set of rows of the sub-matrix is fixed. Than the optimal set of columns will be exactly those columns for which the sum of elements in this column in the selected rows is positive. The same can be applied when the set of columns is fixed.

Now several strategies can be used using this idea:

  1. Choose a random subset of rows (or columns) and find the optimal subset of columns (or rows). Do this multiple times and choose the best. See author’s solution as a reference.

  2. Choose a random subset of rows and find the optimal subset of columns for it. Then find the optimal rows for these columns. Then find the optimal columns for these rows and so on till we stop making the progress. See tester’s solution as a reference.

Also different types of hill climbing are possible. Start with choosing a set of rows and a set of columns. Then at any stage either add new row, delete a row, add new column or delete column so that the sum increases. Keep doing this till time permits. Simulated Annealing or beam search could also be done using this neighborhood criterion.

As usually all contestants are welcome to describe their strategies.

The natural question arises. Why we can’t find optimal answer in reasonable time?
The reason is that this problem is equivalent to Maximum Edge Weight BiClique Problem (MEWBCP), which is known to be NP-complete. What does it mean? Consider a complete bipartite graph G having W vertexes in the first part and H vertexes in the second part. Let edge between r-th vertex from the first part and c-th vertex from the second part has the weight M[r][c]. Then it is clear that any sub-matrix of M corresponds to some biclique (complete bipartite sub-graph) of G and the sum of this sub-matrix is the sum of edges in this biclique.

The problem is NP-complete since we can reduce the classical NP-complete maximum clique problem to it. The reduction is quite simple and can be found here.

Finally, we mention that tabu search can be applied in this problem. You can read about it in this paper. Here authors consider MEWCP (Maximum Edge Weight Clique Problem). But it is clear that MEWBCP is partial case of MEWCP when weights of edges inside each part of the graph are zero. The authors formulate MEWCP as unconstrained binary quadratic program (UQP). In our case it takes the form.

maximize sum(M[r][c] * x[r] * y[c] : 0 ≤ r < W, 0 ≤ y < H)
subject to x[0], x[1], …, x[W − 1], y[0], y[1], …, y[H − 1] are in {0, 1}

Note that the variables here have very simple meaning:
x[r] = 1 if and only if the r-th row is chosen;
y[c] = 1 if and only if the c-th column is chosen.

The authors of the paper wrote that they can solve optimally the instances of MEWCP for graphs with thousands vertices using tabu search for this UQP. This approach described in detail in this paper. You can learn more about tabu search here (home-page of its creator).

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.
But it currently gets WA since the init by 1 × 1 sub-matrix with positive element is not made.

Tester’s solution can be found here.
It gets the score 3084.54887261952. Note, that the winner’s score is 3096.0084244532. So, even such relatively simple solution gets the high score. That is why we have such close scores among the top contestants.

RELATED PROBLEMS:

GARDENSQ
Timus - 1146. Maximum Sum

6 Likes

My solution wasn’t so good as described approach, simply I didn’t know how to check time in C (on CodeChef), but from now I’ll use testers’s solution as reference :slight_smile:

I’ll describe the approach for rows, but in contest I did everything twice - also for columns.

My simple approach was to select max row where max for row is sum of all positive numbers. This got score 74.7646492882 (0.024pts).

Later I improved my solution so, that this max row in fact fixed (or selected) it’s positive values as selected columns and for each row I calculated sum using those fixed columns, if sum for row was positive I added it to result. This got score 2670.6785499514 (0.863pts).

Last improvement I implemented emerged from question: “why to use only max score?”. But how to select the correct one? I simply calculated sum for each row, ordered rows by this value a tried from max… I had constant in which I told how many to use (I wanted to prevent TLE, because this approach is O(N^3) → for N=300 54M operations in 2 seconds, for columns too). Finnaly I found that there is not problem with limit, so my O(N^3) approach got score 2738.06110911312 (0.884pts).

All my submissions - CodeChef: Practical coding for everyone

3 Likes

I used two strategies for every test case and printed the result from that strategy which gave larger sum for that particular test case. I got 0.992 points (execution time 23-24 seconds approx)

Strategy 1) I calculated the maximum contiguous sub-matrix (can be calculated in O(hXhXw) time…removed the rows with negative sum…added the rows with positive sum…removed the columns with negative sum…added the columns with positive sum.

Strategy 2) This was a brute force solution.

sum = 0;
Do for each column
{
     Mark all rows with a positive element in this column;
     Mark all columns with a positive sum in marked rows;
     temp=sum of marked rows and columns;
     sum = max(sum, temp);
}
Print the set of rows and columns with maximum sum.

Repeate this strategy for rows also.

Here is my solution.

2 Likes

I managed to get 3057.02076214532 points with this solution:

read h
read w
m[w]
temp[w]
rows<>
columns<>
prevsum = 0 

for i in 0 to w-1
   read m[i]
   if m[i] > 0
      prevsum += m[i]

if prevsum > 0 
   add 0 to rows

for i in 1 to h-1
{
   totalsum = 0
   rowsum = 0

   for j in 0 to w-1 
      read temp[j]
      if temp[j] > 0
         rowsum += 1
      if m[j] + temp[j] > 0
         totalsum += 1

   if totalsum > prevsum and totalsum > rowsum
      for j in 0 to w-1
         m[j] += temp[j]
      prevsum = totalsum
      add i to rows

   else if rowsum > prevsum
      for j in 0 to w-1
         m[j] = temp[j] 
      prevsum = rowsum
      empty rows
      add i to rows
}

for i in 0 to w-1
   if m[i]>0 
      add i to columns

print rows size
print columns size
print rows
print columns 

This way I could calculate while reading, if you have any suggestions I would highly appreciate them. I didn’t use vectors for saving the rows and columns in my submission, I thought it would be a little expensive. I used boolean arrays instead.

This is my java solution.