BMASTER - Editorial

Author: Roman Rubanenko
Tester: Anton Lunyov
Editorialist: Anton Lunyov

MEDIUM

PROBLEM:

Getting rid of the storyline we can formulate the problem as follows. You are given some starting N × M grid G0 of zeros and ones and some function f on the set of such grids. Namely, f(G) is the grid obtaining from the grid G after the Loda’s move. Create a sequence of grids Gn by the formula Gn = f(Gn − 1) for n ≥ 1. The task is to find CNT(G0) + CNT(G1) + … + CNT(GT) for several values of T, where CNT(G) is the number of ones in the grid G.

QUICK EXPLANATION:

The main thing here is to believe that in the sequence {Gn} repetition occurs relatively early. That is, the first value of p such that Gp = Gq for some q < p will be relatively small so that the simple brute force will allow to find it in a reasonable time. Since Gn = f(Gn − 1) it would mean the periodicity of the sequence {Gn}. Namely we will have

**Gn + p = Gn + q** for **n ≥ 0**. It allows to answer the queries in **O(1)** time after precomputing the sums **CNT(G0) + CNT(G1) + ... + CNT(Gn)** for **n ≤ p**.

EXPLANATION:

For those who don’t really get this a bit advanced formulation of the problem, let’s consider an example from the problem statement. The sequence {Gn} in this case starts as follows:

 G0 = 100 000 000 ; G1 = 111 000 000 ; G2 = 101 111 111 ; G3 = 011 101 010 .

I hope it should be clear now what this math thing Gn = f(Gn − 1) means.

Strictly speaking we don’t have a formal proof why the first repetition occurs relatively early. The intuitive reasoning is that the function f(G) transforms the grid in some random fashion, so we can approximately consider the sequence Gn as a sequence of random grids of zeros and ones. There are S = 2N * M such grids in all. According to the birthday paradox among 3.5 * sqrt(S) random grids repetition occurs with probability about 99.8%. For sizes 2 × N, N × 2, 1 × N and N × 1 sequence Gn usually has lower index of the first repetition. So the toughest sizes under the given constraints are 3 × 11 and 11 × 3. Trying a lot of initial data (Sx, Sy, P1, P2) for these sizes with different random distributions the hardest test we were able to create has the first repetition at the position near 324000. Note that S = 233 in this case, so 3.5 * sqrt(S) is also near 324000. It means that our intuitive reasoning seems to be correct.

Now let’s discuss the implementation of this solution.

For convenience we will numerate rows and columns of the grid starting from 0. We can represent grids as numbers having at most N * M bits in binary representation. Namely, if the cell (i, j) of the grid G is G[i][j] where 0 ≤ i < N and 0 ≤ j < M, then the corresponding number is the sum of G[i][j] * 2M * i + j over all i and j. So we can now consider sequence of grids {Gn} as a sequence of numbers. We call such numbers bit-masks.

To find the first repetition in this sequence we need some data structure that allows to check quickly after calculating each Gn whether we have already such number in the sequence and in what position. The most efficient in this situation will be Hash Table that allows to answer this question in O(1) time in average with high probability. It is implemented in both author’s and tester’s solutions. Alternatively you can use any fast associative array data structure in the language of your choice. For example in C++ you can use STL container map. It allows to answer this question in O(log n) where n is the index of the current element of the sequence.

The time limit was moderate so to calculate f(G) for some grid G represented as a number you can simply restore the grid by the number, make directly all calculations described in the problem statement and calculate the number for the new grid. There is only one thing you should additionally notice. The new grid should be created in the new variable since otherwise you can’t make appearing of new images for all cells simultaneously. All other bits of making move can be performed directly as described in the problem statement. In this way f(G) can be calculated in O(N * M * (N + M)) with quite large constant. Probably not all implementations of f(G) could pass the time limit since in our intended solution this step is made in O(N * M) time. But only a few tries to optimize the direct implementation will give you a solution that can pass the time limit. Well, in fact we would like all such solutions to fail but the slowness of Pyramid (archaic SPOJ server where all your solutions are run) not allow us to set higher constraints.

To get the O(N * M) algorithm for calculating f(G) you need to do just three things.

1. For each cell (i, j) and the number r from 0 to 5 precompute the bit-masks of cells which appear from the cell (i, j) if F(i, j) mod 6 = r.

2. For each cell (i, j) precompute the bit-mask of the cells in the cross of the cell (i, j).

3. Precompute bitCount (the number of ones in binary) for numbers up to 217. This allows to find bitCount for numbers up to 234 as a sum of two stored values for the first half of the number and for second half of the number.

Now the move can be made in a simple O(N * M) loop. See tester’s solution as a reference. Namely, F(i, j) is just bitCount(grid `and` cross[i][j]). So to make move we need to iterate over all cells check whether we have image in this cell and `xor` the resulting grid with corresponding bit-mask of new cells which appear from this cell.

Another speed up that can be made is the precomputing residues modulo 6 for values up to 34. It seems ridiculous at the first sight but the modulo operation even modulo 6 considerably slower than array look-up. So this hack speeds up the solution in about 15%. It is used in author’s solution.

Now assume that we find values p and q such that Gp = Gq and 0 ≤ q < p. As was mentioned in the QUICK EXPLANATION section let’s store the sums A[0] + A[1] + … + A[n] for 0 ≤ n ≤ p in some array sum[], where we set for convenience A[k] = CNT(Gk).

Now assume that we have some value of T (possibly large) and we need to calculate the sum Sum(T) = A[0] + A[1] + … + A[T]. Clearly, if T ≤ p we can simply use the stored value sum[T]. If T > p we will use the periodicity of the sequence {A[n]} which follows from the periodicity of the sequence {Gn}. For this we introduce the period w = p − q and write T as T = r + w + … + w, where q ≤ r < p. Let there be L numbers w in this representation. The sum of the first r + 1 summands in Sum(T) is equal to the stored value sum[r]. The sum of each of the following groups of length w is the same due to the periodicity and equals to the sum A[q + 1] + … + A[p] = sum[p] − sum[q]. Hence

**Sum(T) = sum[r] + L * (sum[p] − sum[q])**.

In fact, L equals to floor((T − q) / w) and r = T − L * w.

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.
Tester’s solution can be found here.

RELATED PROBLEMS:

SPOJ - 3831. Lubenica - LUBEN

3 Likes

I read the quick explanatory and did the same thing as said in the problem … Still I got WA. Can someone look into my code and point out the bug. The code has very good pneumonics… So I am sure you wont have troubles reading and understanding it . CodeChef: Practical coding for everyone

Try this test
2 17 2
2 7
1 2
2 1
1 1
2 2
2 1
2 2
1
2
15
38
18
44
So you make incorrect transformation even at the first move.
Try to check again the correctness of hard-coded parts corresponding to D1, D2.

1 Like

Thank You . Just a silly bug. Forgot to multiply P1[] and P2[] values to the directions. Feeling embarrassed about it .Wished I could have spotted in during the contest, so I could have thought more about it when I got a TLE. Anyways O(n*m) ? because my O(n*m*max(n,m)) fails in time

Well, maybe for Java it is harder to cram the O(NM(N+M)) solution. But in C++ I saw a lot of AC. Check it out for example this very straightforward and clear one
http://www.codechef.com/viewsolution/1405920

I’ve also submitted my cleaner solution than mentioned in the editorial. Maybe you can learn better from it how to do things in O(N*M) time:
http://www.codechef.com/viewsolution/1479283

Once I finish remaining editorials I will add explanatory comments to it.

1 Like

i implemented the solution using O(mn(m+n)) grid generation in java, but it gets TLE. How can i optimise it more? Here’s my solution:
http://www.codechef.com/viewsolution/1479508

i implemented using same algorithm but still tle admin please take a look…
http://www.codechef.com/viewsolution/1479688

I tried to implement the storing of different grid patterns using `bitset` in C++. For every pattern i find the equivalent number and store it in a `set`. Since a `set` stores unique values, whenever the count before insertion is same as count after insertion, I stop computing further and then solve each test case. But the problem is I still get a TLE… PLz check my sol. here and tell me where should i optimize?? thanq.

Hi,

I have been banging my head against the wall (WA for blade master from October long contest ). Can anyone please help me find the case where this fails.http://www.codechef.com/viewsolution/1483356.

Regards
cptCharisma

@cptcharisma
Here is your solution that get AC with only one character changed:
http://www.codechef.com/viewsolution/1483926
It will be a good exercise for you to find this character
When you find it you will understand that the way you calculate save array is bad.
Copy-paste is very bad practice during writing the code.
See tester’s of author’s solution for a good way.

1 Like

TO ALL: The editorial was updated and contains now a detailed explanation of how to make move in O(N * M) time. So please don’t ask how to optimize your TLEd O(N * M * (N+M)) solution. You can always optimize it to O(N*M) solution and get AC

@anton_lunyov:
Thank you so much.
I cant believe I let this bug through. Thanx a lot again for pointing it out. diff helped me find that char btw.

@anton_lunyov- i used the same trick as given in Quick Explanation but its giving Wrong Answer… my solution is CodeChef: Practical coding for everyone. Can u tell me whats the error???

@red_coder
You should use `1LL<<n` instead of `1<<n`. Since `1<<n` equals `0` for `n>=31`.
Fixing this in two places I get AC with your code:
http://www.codechef.com/viewsolution/1492246

Also why you are using so many copy-paste? It is really bad practice.
I rewrote your code a bit to show you how it would better to write it:
http://www.codechef.com/viewsolution/1492259

@anton_lunyov- thanks dude… specially for that copy-paste thing… i would avoid that from next time