CLTNGL - Editorial

meooow · April 16, 2018, 12:58am

PROBLEM LINK:

Practice
Contest

Author and Editorialist: Soumik Sarkar
Tester: Avijit Agarwal

DIFFICULTY:

HARD

PREREQUISITES:

Mobius inversion

PROBLEM:

There is a grid with N equally spaced horizontal lines and M equally spaced vertical lines. What is the expected number of grid points on the perimeter of a randomly chosen non-degenerate triangle with vertices at grid points.

EXPLANATION:

The problem asks for the expected number of grid points on the perimeter of a randomly chosen triangle which has non-zero area. Let’s call these good triangles. As each good triangle is equally likely, the answer is equal to the sum of grid points on the perimeter of all good triangles divided by the number of good triangles.

ans = \frac{good_{sum}}{good_{cnt}}

The number of grid points on the line segment joining p_1=(x_1, y_1) and p_2=(x_2, y_2) (including endpoints) minus one is given by \gcd(|x_2-x_1|, |y_2 -y_1|). Let’s denote this as c(p_1, p_2) and assume we can get c(p_1, p_2) for any pair of points in constant time.
Clearly the number of grid points on the perimeter of a triangle with vertices at p_1, p_2 and p_3 is given by c(p_1, p_2) + c(p_2, p_3) + c(p_3, p_1).

We can iterate over all triplets of points (p_1, p_2, p_3), and for each triplet increment good_{sum} and good_{cnt} appropriately if the triplet forms a good triangle. This approach has complexity \mathcal{O}(n^3m^3).

Improvement 1

To calculate the sum and count of good triangles, we can instead subtract proper values of bad triangles from that of all triangles.

ans = \frac{good_{sum}}{good_{cnt}} = \frac{all_{sum} - bad_{sum}}{all_{cnt} - bad_{cnt}}

Click to view

Considering all possible triplets of points as vertices of triangles, we can say

\begin{aligned} all_{sum} &= \sum_{p_1} \sum_{p_2} \sum_{p_3} (c(p_1, p_2) + c(p_2, p_3) + c(p_3, p_1)) \\ &= 3mn \sum_{p_1} \sum_{p_2} c(p_1, p_2) \\ all_{cnt} &= \sum_{p_1} \sum_{p_2} \sum_{p_3} 1 = n^3m^3 \end{aligned}

Note that some triangles are counted more than once here, and we will have to take care to remove the effect of the correct number of bad triangles.

All bad triangles have zero-area, but it will help to consider 3 cases of bad triangles separately. badk triangles are triangles that have k distinct vertices.

\begin{aligned} bad1_{sum} &= 0 \\ bad1_{cnt} &= nm \end{aligned}

For bad1, p_1 = p_2 = p_3. There are exactly mn of these triangles, and there will be zero grid points on their perimeter, so we have the above results.

\begin{aligned} bad2_{sum} &= 3 \sum_{p_1} \sum_{p_2} 2 c(p_1, p_2) \\ bad2_{cnt} &= 3 (n^2m^2 - nm) \end{aligned}

In the above expression consider $p_1$as the point where 2 vertices coincide and p_2 as the 3rd vertex. The number of grid points on the perimeter of this triangle using our summation will equal 2c(p_1, p_2). In bad2_{cnt} the -nm term arises because n^2m^2 includes cases where p_1=p_2.
The terms above are multiplied by 3 because each such triangle was counted thrice in all (when p_1 = p_2 \ne p_3 or p_2 = p_3 \ne p_1 or p_1 = p_3 \ne p_2) .

\begin{aligned} bad3_{sum} &= 3 \sum_{p_1} \sum_{p_2} 2 c(p_1, p_2) (c(p_1, p_2) - 1)\\ bad3_{cnt} &= 3 \left(\sum_{p_1} \sum_{p_2} (c(p_1, p_2) - 1) + nm\right) \end{aligned}

Consider a line (p_1, p_2). Let the end points of this line be 2 vertices of a triangles. Then for every possible 3rd vertex on this line between p_1 and p_2 we have a bad triangle, and this number is (c(p_1, p_2) - 1). However when p_1 = p_2 this value is -1 and to adjust that we have a +nm term in bad3_{cnt}.
Each triangle under bad3 is counted 6 times in all (as 3! permuations of (p_1, p_2, p_3)), and here we count each triangle twice (as 2! permuations of (p_1, p_2)) so again we multiply by 3 to make up the amount.

If we add them up as

\begin{aligned} bad_{sum} &= bad1_{sum} + bad2_{sum} + bad3_{sum}\\ bad_{cnt} &= bad1_{cnt} + bad2_{cnt} + bad3_{cnt}\\ \end{aligned}

Then we have the answer as

ans = \frac{\displaystyle 3mn \sum_{p_1} \sum_{p_2} c(p_1, p_2) - 6 \sum_{p_1} \sum_{p_2} c(p_1, p_2)^2}{\displaystyle n^3m^3 - 3 \sum_{p_1} \sum_{p_2} c(p_1, p_2) - nm}

Now \displaystyle\sum_{p_1} \sum_{p_2} c(p_1, p_2) and \displaystyle\sum_{p_1} \sum_{p_2} c(p_1, p_2)^2 can be calculated by iterating over all pairs of points, so we are down to \mathcal{O}(n^2m^2).

Improvement 2

Let’s expand \displaystyle\sum_{p_1} \sum_{p_2} c(p_1, p_2) as \displaystyle\sum_{x_1=1}^n \sum_{y_1=1}^m \sum_{x_2=1}^n \sum_{y_2=1}^m \gcd(|x_2-x_1|, |y_2 -y_1|).
Considering the unique values of |x_2-x_1|=i and |y_2-y_1|=j, we can say

\begin{aligned} \sum_{p_1} \sum_{p_2} c(p_1, p_2) &=nm\gcd(0,0) + 2m\sum_{i=1}^{n-1}(n-i)\gcd(i, 0) + 2n\sum_{j=1}^{m-1}(m-j)\gcd(0, j) \\ &+ 4\sum_{i=1}^{n-1} \sum_{j=1}^{m-1} (n-i) (m-j)\gcd(i, j) \end{aligned}

A similar thing can be said for \displaystyle\sum_{p_1} \sum_{p_2} c(p_1, p_2)^2.

The first term in the expansion corresponds to cases of p_1=p_2 and can be calculated in constant time. The second and third terms deal with p_1 and p_2 on vertical and horizontal lines can be calculated in \mathcal{O}(n + m), or even in constant time if one uses standard summation formulae. The last term requires \mathcal{O}(nm).

Improvement 3

Here we focus on calculating \displaystyle f_1(n, m) = \sum_{i=1}^n \sum_{j=1}^m (u-i) (v-j)\gcd(i, j) and \displaystyle f_2(n, m) = \sum_{i=1}^n \sum_{j=1}^m (u-i) (v-j)\gcd(i, j)^2.

Before proceeding further, I request you to go through this blog and this blog on multiplicative functions and Mobius inversion by Nisiyama_Suzune. I will be using identical notation as in the blog and the process used here is the same as described in the blog.

Click to view

Assume without loss of generality n \le m, thus the gcd term never exceeds n.

\begin{aligned} f_1(n, m) &= \sum_{i=1}^n \sum_{j=1}^m (uv - uj - vi + ij)\gcd(i, j) \\ &= \sum_{g=1}^n g \sum_{i=1}^n \sum_{j=1}^m (uv -uj-vi + ij)[\gcd(i, j) = g] \\ \end{aligned}

Substituting i = ag and j = bg,

f_1(n, m) = \sum_{g=1}^n g \sum_{a=1}^{\lfloor n/g \rfloor} \sum_{b=1}^{\lfloor m/g \rfloor} (uv - ubg - vag + abg^2)[\gcd(a, b) = 1]

Applying Mobius inversion on [\gcd(a, b) = 1],

\begin{aligned} f_1(n, m) &= \sum_{g=1}^n g \sum_{a=1}^{\lfloor n/g \rfloor} \sum_{b=1}^{\lfloor m/g \rfloor} (uv - ubg - vag + abg^2) \sum_{d|\gcd(a,b)} \mu(d) \\ &= \sum_{g=1}^n g \sum_{a=1}^{\lfloor n/g \rfloor} \sum_{b=1}^{\lfloor m/g \rfloor} (uv - ubg - vag + abg^2) \sum_{d=1}^{\lfloor n/g \rfloor} [d|a][d|b] \mu(d) \\ &= \sum_{g=1}^n g \sum_{d=1}^{\lfloor n/g \rfloor} \mu(d) \left( uv \sum_{a=1}^{\lfloor n/g \rfloor} [d|a]\sum_{b=1}^{\lfloor m/g \rfloor} [d|b]- ug\sum_{a=1}^{\lfloor n/g \rfloor} [d|a]\sum_{b=1}^{\lfloor m/g \rfloor}[d|b]b - vg\sum_{a=1}^{\lfloor n/g \rfloor} [d|a]a\sum_{b=1}^{\lfloor m/g \rfloor}[d|b] + g^2\sum_{a=1}^{\lfloor n/g \rfloor}[d|a]a \sum_{b=1}^{\lfloor m/g \rfloor} [d|b]b \right) \\ \end{aligned}

Now \sum_{i=1}^n [d|i] = \lfloor\frac{n}{d}\rfloor and \sum_{i=1}^n [d|i]i = ds(\lfloor\frac{n}{d}\rfloor) where s(n) = \frac{n(n+1)}{2}.

f_1(n, m) = \sum_{g=1}^n g \sum_{d=1}^{\lfloor n/g \rfloor} \mu(d) \left(uv \lfloor \frac{n}{gd}\rfloor \lfloor \frac{m}{gd}\rfloor -ugd \lfloor \frac{n}{gd}\rfloor s(\lfloor \frac{m}{gd}\rfloor)-vgd\ s( \lfloor \frac{n}{gd}\rfloor) \lfloor \frac{m}{gd}\rfloor + g^2d^2s(\lfloor \frac{n}{gd}\rfloor)s(\lfloor \frac{m}{gd}\rfloor) \right)

Let k = gd, then,

f_1(n, m) = \sum_{k=1}^n \left(uv \lfloor \frac{n}{k}\rfloor \lfloor \frac{m}{k}\rfloor -uk \lfloor \frac{n}{k}\rfloor s(\lfloor \frac{m}{k}\rfloor)-vk\ s( \lfloor \frac{n}{k}\rfloor) \lfloor \frac{m}{k}\rfloor + k^2s(\lfloor \frac{n}{k}\rfloor)s(\lfloor \frac{m}{k}\rfloor) \right) \sum_{d|k}\mu(d) \frac{k}{d}

Similarly we can say,

f_2(n, m) = \sum_{k=1}^n \left(uv \lfloor \frac{n}{k}\rfloor \lfloor \frac{m}{k}\rfloor -uk \lfloor \frac{n}{k}\rfloor s(\lfloor \frac{m}{k}\rfloor)-vk\ s( \lfloor \frac{n}{k}\rfloor) \lfloor \frac{m}{k}\rfloor + k^2s(\lfloor \frac{n}{k}\rfloor)s(\lfloor \frac{m}{k}\rfloor) \right) \sum_{d|k}\mu(d) \frac{k^2}{d^2}

The function \sum_{d|k}\mu(d) \frac{k}{d} is a multiplicative function which can be recognized as Euler’s totient function \varphi. For a prime p, \varphi(p^k) = p^k - p^{k-1}.
\sum_{d|k}\mu(d) \frac{k^2}{d^2} is another multiplicative function, let’s denote it by \varphi_2. Clearly \varphi_2(p^k) = p^{2k} - p^{2k-2}.
Both functions can be computed using a linear sieve as described in the above mentioned blog post.

The complexity of the solution is now \mathcal{O}(n + m) which fits in the given time limit.

AUTHOR’S AND TESTER’S SOLUTION:

Author’s solution can be found here
Tester’s solution can be found here.