BUILDIT - Editorial

Problem Link

Practice

Contest

Author: Teja Vardhan Reddy

Tester: Misha Chorniy

Editorialist: Bhuvnesh Jain

Difficulty

MEDIUM-HARD

Prerequisites

Matrix Multiplication, Recurrences, Linearity of Expectation, Probabilities

Problem

You are given a circular which is equally divided into H parts. There are N building placed on the circle at some points, not necessarily distinct. You can start from any point and start shooting within a range of X. All building within the range collapse. You need to find the expected number of buildings which collapses when the probability of starting from any point is given by the following probability distribution:

a_i \text{is given for} 1 ≤ i ≤ K
a_i = \sum_{j=1}^{j=K} c_j\cdot a_{i-j} \quad \forall\,i:\;K \lt i \le h\,.

Explanation

Subtask 1: N ≤ 1000, H ≤ 10000, K ≤ 10

A simple brute force solution which first finds the probability of starting at each point and then simply iterates through each point and finds the number of buildings which are collapsed will work within the required constraints.

The time complexity of the above approach will be O(H * K + N * H) as X = H in the worst case. The first part is for the pre-computation and the next part if for finding the desired number of buildings which are collapsed. The space complexity is O(H).

Subtask 2: N ≤ 50000, H ≤ 1000000, K ≤ 10

All the further subtasks require the knowledge of linearity of expectation and indicator random variables.

Let us define the indicator random variable Y_i. It equals 1 if there is a building at position i else 0. Using this definition, the expected number of building which collapse is:

\text{Expected buildings} = \sum_{i=1}^{i=N} {a_i * \sum_{j=i}^{j=i+X} Y_j}

where the second sum is taken in a circular manner.

Using linearity of expectation (or rearrangement of terms), we can rewrite the above expression as:

\text{Expected buildings} = \sum_{i=1}^{i=N} {Y_i * \sum_{j=i-X}^{j=i} a[i]}

where the second sum is again taken in a circular manner.

Since Y_i is defined to be 1 for all points where buildings are present, we can see that outer sum runs for O(N) times in worst case. For the inner sum, we can just maintain a prefix sum for array a and thus find the required sum in O(1).

Using this approach, the time complexity becomes O(H * K + H + N) which is enough to pass this subtask. The space complexity is O(H).

You can see editorialist approach for {1}^{st} two subtasks below.

Full Solution Approaches

Author/Tester Solution: Matrix multiplcation for recurrence solving

The last approach was bad as it required us to calculate the probability of starting at each point which is not possible as H is very large. If carefully observed, the way future a_i are generated is given by a recurrence relation. Don’t confuse by seeing the convolution form in it. :frowning:

In case you don’t know how to solve recurrence using matrix multiplication, I suggest you go through this blog before.

In the last approach, we needed to calculate the prefix sum of some given range, for the recurrence relation, in an efficient manner. We can extend our matrix from K * K for normal recurrence to include one extra row which will calculate the prefix sum. Below the idea with a recurrence containing 3 terms. We can generalise it later.

Let recurrence be a_m = c_1 * a_{m-1} + c_2 * a_{m-2} + c_3 * a_{m-3}. As per blog the matrix is given by

\begin{bmatrix} c_1 & c_2 & c_3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} a_{m-1} \\ a_{m-2} \\ a_{m-3} \end{bmatrix} = \begin{bmatrix} a_m \\ a_{m-1} \\ a_{m-2} \end{bmatrix}

To extend it to contain prefix sum as well, the matrix will look like:

\begin{bmatrix} c_1 & c_2 & c_3 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ c1 & c2 & c3 & 1 \end{bmatrix} \begin{bmatrix} a_{m-1} \\ a_{m-2} \\ a_{m-3} \\ \sum_{i=1}^{i=m-1} a_i \end{bmatrix} = \begin{bmatrix} a_m \\ a_{m-1} \\ a_{m-2} \\ \sum_{i=1}^{i=m} a_i \end{bmatrix}

Got the idea, we basically store the initial prefix sum as the last value and add the current value to the prefix sum to extend it further. This can be easily extended to higher order recurrences as well.

Using the above idea naively, we can solve the problem in O(N * K^3 * \log{H}), where for every building we use matrix multiplication to calculate the prefix sums. This is enough to pass the {3}^{rd} subtask but not the last one.

To solve the last subtask, we need to look at one important detail of matrix multiplication. Given matrices of sizes (a, b) and (b, c), the complexity for their multiplication is O(a * b * c). We are used to using square size matrices, so we say the complexity is always cubic. In recurrences, the last step involves multiplying the recurrent matrix, (K, K) with the base matrix, (K, 1) which takes O(K^2) complexity instead of O(K^3). This gives us a neat optimisation as follows:

{R}^{n} * B = R^{i_1} * ( \cdots * ({R}^{i_{\log{H}}} * B))

where R is the recurrent matrix ((K+1, K+1) matrix which is described above), B is the base matrix ((K+1, 1) matrix which is described above) and n = 2^{i_1} + 2^{i_2} + \cdots + 2^{i_{\log{H}}}.

An example of above equation is:

{R}^{11} * B = R^1 * (R^2 * (R^8 * B))

Note that the above step now needs O(K^2 * \log{H}) complexity, reducing a factor K from previous one. But we need to precompute the 2^w powers of the recurrent matrix. This can be done in O(K^3 * \log{H}).

Using the above ideas of precomputation and matrix multiplication, the complexity is O(K^3 \log{H} + N * K^2 * \log{H}). This will easily pass all the subtasks.

For more details, you can refer to the author’s or tester’s solution below.

Editorialist Solution: GP Sum of a matrix (Bad constant factor)

The ideas of precomputation and matrix multiplication described above hold in the editorialist solution too.

The solution uses the following idea for finding the prefix sums or recurrence:

a_1 + a_2 + \cdots + a_m = R^0 * B + R^1 * B + \cdots + R^{(m-1)} * B = (R^0 + R^1 + \cdots + R^{(m-1)}) * B

So, we need to find the GP (Geometrix progression) sum of a matrix. This is a known problem. If you don’t know about it, you can read similar problem here. But the only problem with this approach is the large constant factor involved. The matrix used to calculate the GP sum of a matrix is twice the size of given matrix, i.e. the matrix used will have size (2k, 2k). Hence, a constant factor of 2^2 = 4 is added to the complexity of the solution which is very hard to deal with. But with some neat observations like some parts of the matrix always retaining particular values, we can reduce the constant factor in the solution too.

I understand the above is a very brief idea of the solution, but in case you have any problem, you can read through the solution and ask any doubts you have in the comment section below.

The time complexity of the approach will be O(4 * K^3 * \log{H} + 2 * N * K^2 * \log{H}). The space complexity will be O(4 * K^2).

Feel free to share your approach, if it was somewhat different.

Extra Tip: Reduce constant factor in modular matrix multiplication

Below is the general code used to calculate matrix multiplication in the modular field:


  for i in [1, n]:
    for j in [1, n]:
      for k in [1, n]:
        c[i][j] = (c[i][j] + a[i][k] * b[k][j]) % mod

It can be easily seen that above code uses O(N^3) mod operations. It should be remembered that mod operations are costly operations as compared to simple operations like addition, subtraction, multiplication etc. We observe the following identity:

X \% \text{mod} = (X \% mod^2) \% \text{mod}

Above can be easily proved using X = q * mod + r. Using this fact, we can reduce the number of mod operations to O(N^2) in matrix multiplication. Below is the pseudo-code for it:


  mod2 = mod * mod
  for i in [1, n]:
    for j in [1, n]:
      c[i][j] = 0
      for k in [1, n]:
        c[i][j] += a[i][k] * b[k][j]    # Take care of overflows in interger multiplication.
        if c[i][j] >= mod2:
          c[i][j] -= mod2
      c[i][j] %= mod

Note that above trick reduces the running time by approximately 0.8-0.9 seconds. Though it is not required for the complete solution, knowing it can always help and might be useful somewhere else.

Time Complexity

O(K^3 \log{H} + N * K^2 * \log{H})

Space Complexity

O(K^3)

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution can be found here.

Editorialist’s solution for subtask 1 and 2 can be found here.

Editorialist’s solution can be found here.

1 Like

The Blog for matrix Multiplication is showing “Linearity of Expectation” @admin Please fix this.

We can also do this using generating functions.

If anyone wants to know my solution,tell me,i’ll be happy to share it.

Edit:

My solution was basically just about reducing this problem to rng,nothing more;

Basically we have to find \Sigma_{j{\epsilon}p} \Sigma_{j-x+1\leq{i}\leq{j}}a[i]

Now if we could find \Sigma_{j-x+1\leq{i}\leq{j}}a[i] fast enough ,it would solve our problem.

So lets consider a more generalized version,finding \Sigma_{p\leq{i}\leq{q}}a[i]

Now \Sigma_{p\leq{i}\leq{q}}a[i] = \Sigma_{p\leq{i}\leq{q}}\Sigma_{1\leq{j}\leq{k}}a[i-j]*c[j]

After solving this ,we get

\Sigma_{p\leq{i}\leq{q}}a[i]=\Sigma_{1\leq{j}\leq{k}}c[j]*(\Sigma_{p\leq{i}\leq{q}}a[i] + \Sigma_{p-j\leq{i}\leq{p-1}}a[i] -\Sigma_{q-j+1\leq{i}\leq{q}}a[i])

Solving this further ,we get

\Sigma_{p\leq{i}\leq{q}}a[i]=\frac{\Sigma_{1\leq{j}\leq{k}}c[j]*(\Sigma_{p-j\leq{i}\leq{p-1}}a[i] -\Sigma_{q-j+1\leq{i}\leq{q}}a[i])}{1-\Sigma_{1\leq{j}\leq{k}}c[j]}

lets say f(n)=\Sigma_{1\leq{j}\leq{k}}c[j]*{\Sigma_{n-j\leq{i}\leq{n-1}}a[i]}

Solving it a bit further ,we get

f(n)=\Sigma_{1\leq{t}\leq{k}}c[t]*(\Sigma_{1\leq{j}\leq{k}}c[j]*{\Sigma_{n-j-t\leq{i}\leq{n-1-t}}a[i]})

So indirectly we get,

f(n)=\Sigma_{1\leq{t}\leq{k}}c[t]*f(n-t)

So ,now we can find f(n) in O(k*logn*logk),as mentioned in RNG editorial.(first calculate the values for f(1) to f(k).

RNG editorial link: RNG - Editorial - tutorial - CodeChef Discuss

My solution link: CodeChef: Practical coding for everyone

2 Likes

The links open the Two Flower question and not this one…plzz fx this.

Solution using generating functions:

If you are new to generating function do read up the editorial for RNG Problem - CodeChef and also some online material on generating functions. If you can derive the generating function of fibonacci by yourself you are good to go. :slight_smile: Now any linear recurrence of form f(n)=c_1*f(n-1)+c_2*f(n-2)+...+c_k*f(n-k) will have a generating function of form \frac{P(x)}{Q(x)} where Q(x) is always 1-c_1*x-c_2*{x}^{2}-....-c_k*{x}^{k}.

You can even derive this. If you want the proof ask me and i will post. Now suppose you have a function f(x)=a_0+a_1*x+a_2*x^2+.... and you multiply with it a g(x)=1+x+x^2+... .

What do you get? You get some h(x)=a_0+(a_0+a_1)*x+(a_0+a_1+a_2)*x^2+....

So basically nth coefficient of h(x) is sum of first N terms of f(x). I will explain you later why you would need this if you havent figured it out yet. Also a generating function is distinct to a linear recurrence. And you can easily see from above that you can get back a linear recurrence using a generating function. And you know the generating function of our linear recurrence.

Also you know \frac{1}{(1-x)}= 1+x+{x}^{2}+...... (infinite GP series).

Now we also know the coefficients of the recurrence followed by the summation of terms of our recurrence. How? Just multiply the denominator Q(x) with (1-x) and then you get the denominator for the generating function of the sum. And you can easily generate the base cases. (Note you need 1 extra base case as the recurrence now has order k+1. ) Now having our tools ready lets see how can we reduce the problem

Okay as many of you have already realised taking out the denominator in common , what we want now is for some index i in the circle , the number of buildings from i to i+x-1 multiplied by a_i. Now see say we have x buildings. We can write x as 1+1+1+....+1 (x times).

Means each building that is present in this range contributes a_i to the sum. So what does each building contribute in total? Yes its just summation of all terms from a_{pi-x+1} to a_{pi}.

So we have reduced our problem to for a given n find summation of a_1+a_2+..+a_n. This is exactly what i get from our modified recurrence. Now for finding the nth term fast you can read up RNG editorial, Kevin has beautifully explained the process there. :smiley:

P.S- Please feel free thank @vijju123 for editing

6 Likes

I used the formula of GP Sum to calculate the answer:

(I - R) S_n = I - R^n

S_n = (I - R)^{-1} (I - R^n)

R is the first matrix in the editorial.

There are no countercase that (I - R) is singular, so I got accepted :stuck_out_tongue:

(got 100 points with precomputation of R^{2^i})

1 Like

I think that problem constraints was chosen in such way that (N * mod * mod) always fits in signed 64 bit integer. So mod2 comparison in matrix multiplication code could be eliminated.

1 Like

@stommydx can you explain how to get inverse of the matrix (I-R).

Updated. Thanks for info.

1 Like

Yeah I used the same. RNG came to rescue :stuck_out_tongue:

Yup… :slight_smile:

If anyone wants to know my solution,tell me,i'll be happy to share it.

He will stalk it without telling you here :p. I dont see why anyone will have problems with sharing solutions :slight_smile:

Because i am in bus currently,so cant write now ,hehe :slight_smile:

yes, if possible please explain the approach. I looked at your code but didn’t understand anything :frowning:

You can share the link xD

thanx @vijju123 :smiley:

1 Like

Anytime bro :slight_smile:

1 Like

Hi fellow duck! It is actually possible to calculate S_n in log(n) matrix multiplications using this. I used it in my solution.

1 Like

Yes, you are right. But just gave the general idea so that it works for all possible cases.

Main idea is too look at matrix inversion problem as n times solving Ax = b_{i} using Gaussian Elimination where b_{i} columns of identity matrix. Which gives O(n^4) complexity if you solve this subtasks independently, but if you concatenate A and identity matrix and solve these n tasks at once you reach O(n^3) complexity.