EXGCD - Editorial

cook29
easy-medium
editorial
exgcd
number-theory

#1

Problem Link:

Practice
Contest

Difficulty:

Easy-Medium

Pre-requisites:

Number Theory: Principle of Inclusion-Exclusion, Moebius Function, Euler’s Totient Function, Modular Arithmetic

Problem:

You are given K intervals [A0, B0], [A1, B1], …, [AK-1, BK-1]. From each interval i, you randomly pick a number Xi. What is the expected value of the GCD of the picked numbers (X0, X1, …, XK-1)? If the answer is expressed as a fraction P/Q, then output an integer N in [0, MOD-1] (MOD = 10^9+7) such that P + Q*N is divisible by MOD.

Preliminary Definitions / Lemmas etc:

You may skip whichever you’re familiar with.

Euler’s phi function. Phi(N) = |{n: n <= N and n co-prime to N, i.e. gcd(n, N) = 1}|

Formula for phi(N):
if N = ∏(piai) (after prime factorization), then
phi(N) = ∏ piai-1*(pi-1)

Lemma: ∑d|N (phi(d)) = N for all N.

A simple proof of the above:

Consider N fractions : 1/N, 2/N, …, N/N.

Rewrite these in reduced form: 1/N, …, 1/1.

Now, for each d|N, there will be phi(d) of the above fractions having denominator = d.

Hence the lemma follows, by counting in two ways, the same N fractions.

Moebius function mu(N):

mu(N) = 0 if p2 | N for some prime p (i.e. if N is not “square-free”)

else (-1)^(#primes dividing N) otherwise.

Hence, both phi and mu can be calculated efficiently using the Sieve of Eratosthenes.

Lemma: Moebius inversion formula:

d|N (d * mu(N/d)) = phi(N).

Lemma: gcd(x1, x2, x3, …, xn) = g <=> gcd(x1/g, x2/g, x3/g, …, xn/g) = 1.

This can be seen by taking prime factorizations of xi’s and comparing LHS and RHS of both equations.

Inverse modulo prime:

Given a prime number ‘M’, to every number x in the range [1, M-1], you have a unique number y, denoted by x-1, such that x * y = 1 (mod M).

Given prime M, and value x, x-1 can be efficiently found using either of the following methods:

  1. Euclidean algorithm: given x and M, find y and n such that x * y + M * n = 1. Now, by taking modulo M on both sides, we get that x-1 = y (mod M)
  2. Fermat’s little theorem: for any x not divisible by prime M, xM-1 = 1 (mod M). This shows that xM-2 = x-1 (mod M). Further xM-2 is calculated efficiently by repeated squaring in log(M) time.

Quick Explanation:

We try to find P = sum of gcd(x0, x1, …, xK-1) over all possible values of xi.

Also find Q = number of tuples (x0, x1, …, xK-1) = product(Bi - (Ai - 1)).

With such P and Q, output N = -P * Q-1 (mod 10^9+7).

The question boils down to finding P. Using principle of inclusion-exclusion (and concept of moebius function), get that P = ∑R, m, mR <= min(Bi)(R * mu(m) * ∏0<=i < K (floor(B/(m * R)) - floor((A*-1)/(m * R))

This is equivalent (using moebius inversion formula) to P = ∑Y <= min(B_i) (phi(Y) * ∏0<=i < K (floor(B*/Y) - floor((A*-1)/Y)).

Detailed Explanation:

Consider a K-dimensional array (lets call it GCD) where the value in the (x0, x1, x2, …, xK-1)'th cell is gcd(x0, x1, x2, …, xK-1). We are interested in finding the expected value of GCD[x0][x1]…[xK-1] over all possible values of x0, x1, …, xK-1. Since all xi’s are picked randomly, the expected value is equal to the arithmetic mean. Finally, this question is equivalent to finding the sum of the values in this K-dimensional GCD array, since the mean is then just dividing this sum by the number of elements.

In fact, looking at P/Q, we can get P = sum of elements, and Q = number of elements. Let MOD = 10^9 + 7. We need to find N such that P + Q * N is divisible by MOD. In other words, P + Q * N = 0 (mod MOD). Since MOD is prime, we can find “Q-1” modulo MOD, which gives us that N = -P * Q-1 (mod MOD). Hence, finding P = sum of elements in GCD array modulo MOD, and Q = number of elements = ∏ (Bi - (Ai-1)) (modulo MOD), we would have enough information to get N. Here, Q can be calculated in O(K) time per test-case.

P is calculated in the following manner.
Firstly, let us try to solve the following intermediate question:
Given any arbitrary set of K intervals (not necessarily the Ai, Bi as desired in the question), denoted by (Ui, Vi] (intervals exclusive of Ui, inclusive of Vi), let us ask: How many points are there in the corresponding K-dim space : i’th coordinate lies in (Ui, Vi], whose gcd(point’s coordinates) is 1. Let us call this set of points as C(U, V). The answer to our intermediate question is now |C(U, V)|, where U and V specify the intervals’ endpoints we are interested in.

This helps us because

the number of points in (Ai - 1, Bi] intervals whose coordinates’ gcd is R (for arbitrary R) =

the number of points in (floor((Ai-1)/R), floor(Bi/R)] whose coordinates’ gcd is 1.

The above follows from the fact that gcd(x0, x1, x2, …, xK-1) = R <=> gcd(x0/R, x1/R, x2/R, …, xK-1/R) = 1. And the fact that

Ai - 1 < xi <= Bi <=> floor((Ai - 1)/R) < xi/R <= floor(Bi/R)

Thus, P = required sum = ∑ R > 0 (R * |C(floor((Ai-1)/R), floor(Bi/R))|)

We now proceed to calculate |C(U, V)|. Qualitatively speaking, from the universal set of points ARR(U, V), we remove those whose coordinates’ gcd > 1, to get C(U, V).

Let ARR(U, V) = {(x0, x1, …, xK-1) : U* < x_i <= V*}.

Let G(U, V, D) = {(x0, x1, …, xK-1) : D | gcd(x0, x1, …, xK-1), U* < xi <= V*}

From the above 2 definitions, it is easily seen that

C(U, V) = ARR(U, V) \ UNIOND > 1 G(U, V, D)
: from the set of all possible points, remove points whose gcd is divisible by any D > 1.

In fact, we need consider only D as being prime.

Indeed, UNION D > 1 G(U, V, D) = UNION prime D G(U, V, D) since if the gcd of a point is D (> 1), then some prime p|D and that point belongs to the set G(U, V, p). Similarly if p|gcd(point’s coordinates), then that point belongs to the set G(U, V, p). Also, we only consider prime D <= min(V) since for prime D > min(V), G(U, V, D) is ∅.

Now, using the fact that if co-prime m and n are such that m|x and n|x, it is equivalent to (m * n)|x, we get that
given co-prime D1, and D2: G(U, V, D1) ∩ G(U, V, D2) = G(U, V, D1*D2).

The good thing about the set G(U, V, D) is that it is easy to count its size. In fact, |G(U, V, D)| = ∏ (floor(V*/D) - floor(U*/D)).

This is because (x0, x1, …, xK-1) ∈ G(U, V, D)
<=> D|xi, and Ui < xi <= Vi.

|{x : (D|x and x <= n)}| = floor(n/D), which when applied above gives

|{xi : D|xi and Ui < xi <= Vi}| = |{xi : D|xi and xi <= Vi} \ {xi: D|xi and xi <= Ui}| = |{xi : D|xi and xi <= Vi}| - |{xi: D|xi and xi <= Ui}| = floor(Vi/D) - floor (Ui/D).
Since xi’s are chosen independently of other xi’s, |G(U, V, D)| = ∏(floor(Vi/D) - floor(Ui/D)).

And now we are well equipped to apply Principle of Inclusion-Exclusion which gives us:
|C(U, V)| = ∑M = set of distinct primes (-1)^|M| * |G(U, V, ∏ p ∈ M p)|)
= ∑M = set of distinct primes (-1)^|M| * (floor(Vi/∏ p ∈ M p) - floor(Ui/∏ p ∈ M p))

This gives us our first possible algorithm.

P = 0
For R from 1 to N(= min(B0, B1, …, BK-1)
Vi = Bi/R, Ui = (Ai-1)/R
for all m = product of distinct primes s.t. m <= min(floor(N/R))
P += R * (-1)^(number of distinct primes in m) * ∏(floor(Vi/m) - floor(Ui/m))

The above algorithm takes time O(∑(floor(N/j)) * K) per test-case, which is O(N * logN * K) per testcase. This is too slow however. It can be sped up by noticing that product(floor(Vi/m) - floor(Ui/m)) = product (floor(Bi/(m * R)) - floor((Ai-1)/(m * R))). This product can be calculated once for all m * R product pairs in O(NK) time. This now gives us an algorithm which is O(N * logN + N * K) per test-case. This is fast enough to pass.

Also, note that in the above, (-1)^(number of distinct primes in m) = mu(m) where m = product of distinct primes. Also, since mu(m) = 0 if m is not the product distinct primes, the above can also be written as
P = ∑ m, R s.t. m * R <= N (R * mu(m) * ∏(floor(Bi/(m * R)) - floor((Ai-1)/(m * R)))

mu(m) can be precomputed for all m <= 2*10^5 using Sieve.

Also, if you rewrite m * R = Y, then you can write ∑(R * mu(Y/R)) as coeff(Y), giving P = sum ( coeff(Y) * ∏(floor(Bi]/Y) - floor((Ai-1)/Y))).

Using the moebius inversion formula, you get that coeff(Y) is actually phi(Y). So using sieve, we can precompute in time O(N * log log N) the values of phi(i) for all i <= 2*10^5. This gives us a O(N * log log N + T * N * K) time algorithm.

Alternate approach:

This approach gives us directly P = ∑ ( phi(Y) * ∏(floor(Bi/Y) - floor((Ai-1)/Y))) using some clever symbolic manipulation.
Here, we use that n = ∑ d|n (phi(d)).

sum(over Ai <= xi <= Bi) GCD(x0, x1, x2, …, xK-1)

= sum(over Ai <= xi <= Bi) of (sum(over d | GCD(x0, x1, x2, …, xK-1)) phi(d)

= sum(over Ai <= xi <= Bi) of (sum(over d | all of x0, x1, x2, …, xK-1)) phi(d)

= sum(over d <= min(Bi)) of(sum (over d | x0, and d | x1, and d | x2, …, and d | xK-1 and Ai <= xi <= Bi)) phi(d)

= sum(over d <= min(Bi)) of ∏(floor (Bi/d) - floor((Ai-1)/d)) * phi(d)

Author’s Solution:

Can be found here

Tester’s Solution:

Can be found here


#2

plz explain this “another approach” in detail


#3

can anyone explain this line :

|C(U, V)| = ∑M = set of distinct primes (-1)^|M| * |G(U, V, ∏ p ∈ M p)|) = ∑M = set of distinct primes (-1)^|M| * (floor(Vi/∏ p ∈ M p) - floor(Ui/∏ p ∈ M p))

I have spent more than 2 days to understand this editorial , any help would be highly appreciable.
Thank you.


#4

Alternate approach is really smart. Also I think the fact that q is not divisible by MOD is implicitly used in solution, if MOD was less than 200000 BigIntegers would be probably needed.


#5

Yes, that is implicitly used. However, also note that if MOD was less than 200000, then there is a chance that we would not be able to find any N having the P + Q*N divisible by MOD.


#6

Gassa gave another solution as follows:

"First calculate the number of tuples where each element (and therefore GCD) is divisible by 1,2,3,…,N = 200,000.

After that, the number of tuples with GCD being exactly 1,2,3,…,N can be found in an NLogN pass."


#7

@anujkaliaiitd In fact, almost all contestants come up with this “another solution”. If nobody will describe it I will probably do this when I have time. In a few words the phrase “can be found in an NLogN pass” means dp of the form
dp[k] = cnt[k] - dp[2 * k] - dp[3 * k] - ... - dp[[N/k] * k]
where cnt[k] is the number of tuples from the first sentence.


#8

The alternate approach can be optimized by going only through values where value of B* / d or (A* - 1) / d changes.

Getting complexity O(sqrt(min B_i) * K^2) per test case.


#9

got it now


#10

Took me some time, but finally got how the alternative approach works. I saw the lemma before but this is the first time I saw its application. Really beautiful method to make solution faster by ln(n).