SMPAINT - Editorial

shangjingbo · January 1, 2014, 7:10pm

PROBLEM LINK:

Practice
Contest

Author: Tuan Anh Tran Dang
Tester: Gerald Agapov
Editorialist: Jingbo Shang

DIFFICULTY:

Challenge

PREREQUISITES:

Greedy, Dynamic Programming, Heuristic, Mixed Methods

PROBLEM:

Given a N*N matrix with colors ranging from 0 to 100, find the minimum number of painting operations needed to paint a initially all zero matrix to this given matrix. Each painting operation allows you to paint a rectangle area into a color.

EXPLANATION:

This is a challenge problem. The most important thing in solving challenge problem is to find some approximation methods or the plan to use the randomness of the cases, because the challenge problems are always NP-hard or NP-Complete. They are hard to be perfectly solved in polynomial times like other nine problems. Also, it is important to write a local test system to help you find the best strategy (almost all top ranked guys wrote their own test codes).

So let’s look at the test generation plan first. “The test data is generated randomly by simulating the process of making an image in SMPaint using no more than 500 Paint operations”, from this sentence, it is worth noting that the best answer should be no more than 500.

And then, let’s discuss some ideas and methods.

In the output description, there is a limit of output steps. It is easy to fit the constraints of N^2 steps. The most naive one, paints the matrix entry by entry.

Then, the first key idea is that reverse the procedure: choose the last painted rectangle area, and remove it by treating them as any color you want. It is straightforward to get a greedy method then:

while (there is any mismatch)
      find the largest rectangle area with the same color
      // could contains some painted ones
      paint that area in one step

The finding procedure can be done by dynamic programming, similar to the largest empty rectangle problems (e.g. SPOJ 277 CTGAME). It could be done in O(N^2) time. Unfortunately, N is as large as 1,000. Therefore, if the answer is a little large, this greedy algorithm will get TLE. But, we can still apply this algorithm when N is small, such as N <= 600. This greedy algorithm and similar ones were a part of many submissions, such as the 4-th ranked submission from xyz111. It contains a lot of interesting specialized methods for smaller N. From here, we can see that the mixed methods are helpful to the challenge problems.

Since N is a little large, we should find some other efficient good solutions for larger cases. The most intuitive method is to find some good Heuristic Function (not necessary to find some proved/well defined one, just follow your intuition or local experimental results) to define how good the current matrix is. After we defined a such function, we could choose the “best” painting each steps, and finally reach the target matrix.

There are two problems though:

1. Too many choices. Too slow during choosing.
2. The "best" choice looks stupid sometimes.

For the first problem, we can design some simpler rules to maintain a small amount of candidate choices. Therefore, the choosing procedure will be cheaper.

For the second one, we can try some different functions together. For example, we have two functions F and G. We can get two possible solutions by them, and choose the best one. Also, we can randomly choose the trust one function each round (we can try random top-3 “best” choices during each round, too). Repeat this procedure multiple times and output the best solution will yield a good score. In the best submission from Mugurel Ionut Andreica, it looks like maintaining some (not many/all) “good” states and expanding them heuristically.

Of course, further discussions are welcome.

samjay · January 2, 2014, 5:43am

FYI, my approach which became 2nd:

Note that as mentioned in editorial if the paintings are done in reverse order, then each pixel may be colored infinite times, but still only the first time colored remains

So we only paint a rectangle if there are no pixels left that have another color or are unpainted. My algorithm (part 1):

Initially create a list L of “shapes” per color in the painting. (shape defined by U,D,L,R and color and each pixel knows in what shape it is in, so this is for each shape the minimum size to fit all pixels of the same color)
Pick a random shape S of the list L while it’s not empty
mark each other shape that overlaps with S as a dependent shape of S.
In case the other shape already depends on S (can occur when other shape was already handled in a previous iteration) => Split the overlapping shape into new shapes and shrink the existing shape

Split: put all pixels of S with the color of the overlapping shape into a new shape which is added to L

Shrink: remove the pixels of the existing overlapping shape, and try again to minimize its size

Note that the newly created shapes will be the dependent shapes of S (and with some bad luck again causing S to split in a later iteration)
If S has no dependent shapes, it can be added to the list with the shapes that are ready to paint P

The result of this algorithm is a directed acyclic graph

The 2nd part of the algorithm is painting the Shapes from P in reverse order, and each time notifying the shapes that are depending on the drawn shape.

E.g. In this example P = [2, 9, 10]

draw 10 => notify 3, 11 (both still have dependents left)
draw 9 => notify 8, 11 (8 has no dependents left and can be added to P)
draw 2 => notify 11 (11 can be added to P)
…
draw 7

A possible actual result of these shapes would be 7, 5, 11, 3, 8, 2, 9, 10
where each number stands for a paint operation

Remarks:

By picking the next shape random (in the first algorithm), the result can differ each run. So we can do this entire algorithm for as long time permits. (1000x1000 was about 9 times for 1 testcase in my final algorithm)
the Splitting can be done in different ways, but make sure that that this is not going infinite for two exact overlapping shapes (my best solutions use Flood fill)
Probably how my algorithm is explained, could be done recursive in c++. But for clearance and debugging, this way made it a lot easier for me. Also in optimization phase.
At the end I also added check that I don’t do a paint operation without effect as starting with a paint operation with color 0 (which improved much more than I expected)
When I came up with this idea, it looked much easier to implement than it appeared to be =)

mjbpl · January 2, 2014, 1:35pm

My approach, not very good but easy to implment (CodeChef: Practical coding for everyone) was

Repeat for 1 to 100
Find bounding box for this number in matrix
If bounding box contain any smaller number, break the bounding box into 4 parts and retry.

For ex: to get

Step 1: 1 can be drawn as

Because all number we are over-painting are larger, and they would later get corrected.

Step 2: After we encounter a smaller color (already used one) in 2’s bounding box, we break it into smaller boxes. So 2 would be drawn in

Step 3: Similarly 3’s bounding box is contain 2’s in it. Splitting it we get

By the time we finish 100 steps, we would get our final answer.
Unfortunately I could not get time to debug and correct my solution. May be someday I would debug and upload.

mugurelionut · January 3, 2014, 8:36pm

Here are the main ideas of my solution (which got the best score):

1000 x 1000 is too large to be able to try too many approaches, so my first step was to compress the matrix. Basically I found all the colored maximal rectangles and I only “kept” rows and columns which were the top row / leftmost column or the row after the bottom-most row / column after the rightmost column of a maximal rectangle. This way I got to matrices of about 100x100 from 1000x1000 (in the worst case, on the official tests, I got about 125 rows/columns). In the compressed matrix each entry stands for a rectangular block of the original matrix in which all the entries have the same color. Thus, I worked with a matrix which was about 10 times smaller in each dimension => about 100 times smaller overall. Now this was a good place to start trying things
I also considered painting the rectangles from the top to the bottom. In a given state (with some cells already covered and, thus, they can be used as wildcards for any other color) all the relevant candidates are the maximal colored rectangles from that state.
From each state I sorted the candidates according to a “move score” (I used a combination of ratio of the number of colored cells to the total number of colored cells in that connected component, plus some weighted number of “removed” corners)
I also had a score for each state, which consisted in the total number of corners of that state. A corner is a point in the grid between rows and columns which has an odd number of cells colored the same around it (or 2 cells colored the same in opposite directions). In order to find these corners I used the union of maximal colored rectangles (slightly updated, due to the way I computed them). It turns out that this state score was close to 3 times the best number of moves I could get for the matrix, so it was a good guiding score (I also tried other state scores, but they were not too well correlated to the actual number of moves, thus they didn’t improve my solution).
I used some standard state expansion procedure, with pruning candidates and scoring states and always continuing with the most promising state.

I also tried randomized greedy algorithms along the way, but they were not precise enough as the state expansion procedure so they couldn’t find good enough solutions within the time limit. However, had I not been able to compress the matrix initially, using expanding states would have been out of the question, as the algorithm would have been too slow.

shangjingbo · January 2, 2014, 10:17am

glad to hear your great algorithm!

nitinj · January 2, 2014, 11:53am

Amazing!..

mjbpl · January 2, 2014, 1:36pm

Good work.

cyberax · January 2, 2014, 6:16pm

thank you very much !

samjay · January 4, 2014, 8:25am

Was thinking too that there wasn’t much to do for a 1000x1000 matrix. ++ for the compression idea. Would have improved my score a lot too I guess.