FUZZYLIN - Editorial

anand20 · September 12, 2019, 12:58pm

PROBLEM LINKS :

Contest : Division 1

Contest : Division 2

Practice

Setter : Raj Khandor

Tester : Istvan Nagy

Editorialist : Anand Jaisingh

DIFFICULTY :

Medium

PREREQUISITES :

Bezout’s Identity, GCD properties, Sieve of Eratosthenes.

PROBLEM :

Given an integer array A of size N, you need to answer many queries over this array using an integer K. In a single query, you need to state the number of sub arrays of the given array whose GCD is a divisor of the integer K.

QUICK EXPLANATION :

We can use Bezout’s Identity to prove that for any given sequence b_1,b_2,...,b_m there exists another sequence c_1,c_2,....,c_m, such that \sum_{i=1}^{m} b_i \cdot c_i = K , if and only if GCD(b_1,b_2,...,b_m) is a divisor of the number K. Additionally, we can prove that if we fix the end point of a subarray (Let it be i) and vary it’s beginning points, then there can be at max O(\log(A[i])) distinct values among GCD's of all these subarrays. We can find all these GCD values in \approx \log^2(A[i]) time, leading to an overall runtime of O(N \cdot \log^2(A[i])) time. Finally, we use Sieve of Eratosthenes, so that queries can be easily answered in O(1).

EXPLANATION :

The first part of the problem is identifying for a given integer X, and a set of integers \{b_1,b_2,.....b_m \} , how to know if there exists another set of integers \{ c_1,c_2,....,c_m \}
such that \sum_{i=1}^{m} b_i \cdot c_i = X

The answer my friends is quite well known. Let’s consider the general case, when m=2. We are given 2 integers b_1,b_2 and an integer X. Can we find some c_1,c_2 satisfying our requirements ?

Bezout’s identity tells us that there exist such c_1,c_2 if and only if GCD(b_1,b_2) divides X. If it does, then in fact here exist infinitely pairs (c_1,c_2).

It’s not too difficult to see that we can in fact extend m, and the fact that there exists such (c_1,c_2,....,c_m) exists only if GCD(b_1,b_2,....,b_m) divides X holds true. You can prove this on your own, or read it here

So, our original task now converts to finding for some K, the number of subarrays whose GCD is a divisor of K.

Now, let’s proceed to part 2 :

Claim : For a fixed index i, there exists at most O(log(A[i])) distinct values of subarray GCD’s among all subarrays ending at index i.

Proof :

It’s easy to see that the GCD function is a decreasing function. That is, if i < j <k , then GCD(a[i],a[i+1],,,a[j]) \ge GCD(a[i],a[i+1],....a[k]). And when we transition from j to k, if GCD(a[i],a[i+1],....a[j]) \ne GCD(a[i],a[i+1],....a[k]), then GCD(a[i],a[i+1],....,a[j]) is divided by some positive integer z > 1 to get GCD(a[i],a[i+1],.....,a[k]).

Just consider, if all the numbers (a[i],a[i+1],...a[k]) share a common factor then the numbers (a[i],a[i+1],...a[j]) also share this factor (The GCD) but they can possibly share even more factors even after division by the GCD.

And since the maximum number of times a number a[i] can be divided by a number z > 1 is \log(a[i]), hence proved.

So, if we know how many subarrays ending at index i have GCD equal to X for each possible X (Remember there are only at max O(log(A[i])) ones), then we can find in O(log(a[i+1]) ) time easily, for each Y, the number of subarrays ending at index (i+1) having GCD equal to Y.

The values of Y can only be among the values of GCD(X,a[i]). It easy to see, that if many subarrays all have GCD equal to X and ending at index i, then when we append index i+1 to the end of all the subarrays, then their GCD will be GCD(X,a[i+1]).

In this process, we can know for each possible integer X, the number of subarrays having GCD equal to X over the whole array. Let’s call this val[x].

Now, we just do a simple Sieve, and for some fixed x, add to all multiples of x, \hspace{0.2cm} val[x] . I.e. something of the kind :

\forall \hspace{0.2cm} i, \hspace{0.2cm} \forall \hspace{0.2cm} j \ge 2, val[ i \cdot j] += val[i]

Later, when we need to answer queries, we can answer each of them easily in O(1).

That’s it, thank you !

Your comments are welcome !

COMPLEXITY ANALYSIS :

Time Complexity : O(N \cdot \log^2 A[i] + Q + N \cdot \log N )

Space Complexity : O(N \cdot \log N )

SOLUTION LINKS :

Setter

#include <bits/stdc++.h>
using namespace std;
#define maxK 1000001
//#define DEBUG
 
typedef long long int ll;
ll sieve[maxK];
 
int main()
{
    #ifdef DEBUG
        ifstream cin("4.in");
        ofstream cout("4.out");
    #endif
 
    ll n,g;
    cin>>n;
    ll a[n];
    for(int i=0;i<n;i++)
        cin>>a[i];
    map<ll,ll> m;
    map<ll,ll> gcds;
    for(int i=0;i<n;i++)
    {
        // m contains gcds of all the subarrays ending at index i-1
        // m1 contains gcds of all the subarrays ending at i
        map<ll,ll> m1;
        for(auto itr = m.begin(); itr!= m.end(); itr++)
        {
            g = __gcd(a[i], itr->first);
            m1[g] += itr->second;
            gcds[g] += itr->second;
        }
        m1[a[i]]++;
        gcds[a[i]]++;
        m = m1;
    }
 
    for(int i=0;i<maxK;i++)
        sieve[i] = 0;
    ll i,val;
    ll maxi = INT_MIN;
    for(auto itr = gcds.begin(); itr!=gcds.end(); itr++)
    {
        i = itr->first;
        val = itr->second;
        for(int j=i;j<maxK;j+=i)
            sieve[j] += val;
    }
    for(int i=0;i<maxK;i++)
        maxi = max(sieve[i]*1ll, maxi);
    int q,x;
    cin>>q;
    while(q--)
    {
        cin>>x;
        cout<<sieve[x]<<"\n";
    }
    return 0;
}

Tester

#include <bits/stdc++.h>
 
#define all(x) (x).begin(), (x).end()
#define rall(x) (x).rbegin(), (x).rend()
#define forn(i, n) for (int i = 0; i < (int)(n); ++i)
#define for1(i, n) for (int i = 1; i <= (int)(n); ++i)
#define ford(i, n) for (int i = (int)(n) - 1; i >= 0; --i)
#define fore(i, a, b) for (int i = (int)(a); i <= (int)(b); ++i)
 
template<class T> bool umin(T &a, T b) { return a > b ? (a = b, true) : false; }
template<class T> bool umax(T &a, T b) { return a < b ? (a = b, true) : false; }
 
using namespace std;
 
int gcd(int a, int b)
{
	return b ? gcd(b, a%b) : a;
}
 
int main(int argc, char** argv)
{
#ifdef HOME
	if (IsDebuggerPresent())
	{
		freopen("../in.txt", "rb", stdin);
		freopen("../out.txt", "wb", stdout);
	}
#endif
	int N, Q, K, query;
	scanf("%d", &N);
	const int MAXK = 1e6 + 2;
	const int MAXJ = 20;
	vector<int> A(N);
	vector<int64_t> B(MAXK);
	vector<vector<int> > AA(MAXJ, vector<int>(N));
	for (uint32_t i = 0; i < N; ++i)
	{
		scanf("%d", &A[i]);
	}
 
	AA[0] = A;
 
	for (uint32_t j = 1; (1 << j) <= N; ++j)
	{
		auto& actA = AA[j];
		for (uint32_t i = 0; i < N; ++i) if(i + (1<<j) <= N)
		{
			actA[i] = gcd(AA[j - 1][i], AA[j - 1][i + (1 << (j-1))]);
		}
	}
 
	for (uint32_t i = 0; i < N; ++i)
	{
		int act = A[i];
		int actP = i, lastP = i;
		++actP;
		while (actP < N && act > 1)
		{
			int newP = actP;
			for (uint32_t j = 0; actP + (1 << j) <= N; ++j)
			{
				if (gcd(act, AA[j][actP]) != act)
				{
					break;
				}
				else
				{
					newP = actP + (1 << j);
				}
			}
			if (newP == actP)
			{
				if (act < MAXK) B[act]+=(actP-lastP);
				act = gcd(act, A[actP]);
				lastP = actP;
				++actP;
			}
			else
			{
				actP = newP;
			}
		}
		if (act < MAXK) B[act] += (N - lastP);
	}
	
	scanf("%d", &Q);
	for (uint32_t q = 0; q < Q; ++q)
	{
		scanf("%d", &query);
		int64_t ans = 0;
		for (int i = 1; i * i <= query; ++i) if(query % i == 0)
		{
			ans += B[i];
			if(i*i != query)
				ans += B[query/i];
		}
		printf("%d\n", ans);
	}
	return 0;
}

Editorialist

//#pragma GCC optimize("Ofast,no-stack-protector")
//#pragma GCC target("sse,sse2,sse3,ssse3,sse4,popcnt,abm,mmx")
//#pragma GCC target("avx,tune=native")
// Anand Jaisingh

#include<bits/stdc++.h>

using namespace std;

typedef complex<double> base;
typedef long double ld;
typedef long long ll;

#define pb push_back
#define pii pair<int,int>
#define pll pair< ll , ll >
#define vi vector<int>
#define vvi vector< vi >
const int maxn=(int)(2e5+5);
const ll mod=(ll)(1e9+7);
int a[maxn];
ll res[1000006];

int main()
{
    ios_base::sync_with_stdio(0);cin.tie(0);

    int n;cin>>n;

    for(int i=0;i<n;i++)
    {
        cin>>a[i];
    }

    map<int,ll> m1;

    for(int i=0;i<n;i++)
    {
        map<int,ll> m2;

        for(auto x:m1)
        {
            int now=__gcd(x.first,a[i]);

            m2[now]+=m1[x.first];
        }

        m2[a[i]]++;

        for(auto x:m2)
        {
            if(x.first<=1000000)
            {
                res[x.first]+=x.second;
            }
        }

        m1=m2;
    }

    /*

    for(int i=1;i<=10;i++)
    {
        cout<<res[i]<<" ";
    }

     */

    cout<<endl;

    for(int i=1000000;i>=1;i--)
    {
        for(int j=i+i;j<=1000000;j+=i)
        {
            res[j]+=res[i];
        }
    }

    int q;cin>>q;

    while(q>0)
    {
        int x;cin>>x;

        cout<<res[x]<<endl;

        q--;
    }

    return 0;
}

newtech66 · September 17, 2019, 3:21am

I had a slightly different approach to getting the GCD values, using a segment tree to get range GCD and using binary search to find all points where the GCD changed for every position. From there on my solution is about the same as editorial.

s5960r · September 17, 2019, 5:34am

Yup! My solution is also using Sparse Table and binary search

https://www.codechef.com/viewsolution/26520870

Btw some people also got AC using optimized O(n²)

ashforever007 · September 17, 2019, 6:04am

Why does this give TLE
https://www.codechef.com/viewsolution/26351498

newtech66 · September 17, 2019, 6:11am

Well maybe you need to do some optimisations. I used the iterative implementation of segtree. Also maybe change some long longs to ints, that might help.

bazzyadb123 · September 17, 2019, 6:46am

Hey @s5960r your code gives 13 as output
for
5
1 2 3 4 1
1
2
I expect output to be 14.Can you correct me if i am wrong

ssjgz · September 17, 2019, 7:23am

I also get 13.

bazzyadb123 · September 17, 2019, 7:32am

Can you explain why According to me all subarrays are valid except 1 which contains 3 only.Can you explain what i am missing?
@ssjgz valid subarray means the subarray which can be used to generate answer to the given equation

ssjgz · September 17, 2019, 7:38am

What do you mean by a subarray is “valid”?

aert · September 17, 2019, 7:43am

GCD of subarray with only 4 is 4. 2 is not divisible by 4.

s5960r · September 17, 2019, 8:40am

[1]
[1,2]
[1,2,3]
[1,2,3,4]
[1,2,3,4,1]
[2]
[2,3]
[2,3,4]
[2,3,4,1]
[3,4]
[3,4,1]
[4,1]
[1]

These are all “valid” subrrays, i generated them by bruteforce. and they are 13

diegohdmgz · September 17, 2019, 3:34pm

Isn’t it O(N \times log^{2}(N) \times log^2(A[i]) ) ? (only the construction without the sieve)

For each of the N elements of the array you run a binary search O(log A[i]) times. Each binary search runs in O(log N \times |st|) , where |st| is the Segment tree query complexity.

The segment tree query is usually O(logN) , but also you have the complexity of the gcd function, so the total complexity of the query is O(log N \times logA[i] ).

If you use Sparse Table, complexity will be O(N \times log(N) \times log^2(A[i]) ) .

I remember using the same idea (with Sparse Table) in this problem of GYM CF : C - Collatz Conjecture and I got TLE.

Maybe there weren’t strong testcases.

anand20 · September 17, 2019, 4:40pm

The solution is O(N \cdot \log^3 N), and passes due to weak tests.

newtech66 · September 17, 2019, 4:50pm

I didn’t analyse the time complexity well. Nice analysis. RIP for me I guess.

ssrivastava990 · September 17, 2019, 7:40pm

can anyone help me with this :
https://www.codechef.com/viewsolution/26614356
I know my solution is very bad , but i still didn’t understand editorial :( ,
so just tell me if the question is that we have to map the gcd of all subarrays , then how we do?
ex: A = [2,4]
subarrays -
2 → gcd(2) = 2 mp[2]—>1
(2,4)->gcd(2,4) = 2 mp[2]–>2
4–>gcd(4)=4 mp[4]—>1

so map is
2–>2 times
4–>1 times

@aert @ssjgz @s5960r @L_returns @bazzyadb123

l_returns · September 18, 2019, 1:30am

After finding gcd instead of iterating all divisors at runtime, use sieve to precompute answer for each valid K.
for(i=1;i<max;i++)
for(j=I;j<max;j+=i)
ans[j]+=freq[i]

ssrivastava990 · September 18, 2019, 5:28am

Means my above code to store gcd is good right…

s5960r · September 18, 2019, 6:03am

@ssrivastava990 Bro i think your GCD calculation time complexity is O(n^2) , that’s what giving you TLE

I’ll tell you my method, you basically need to reduce that O(n^2)

The question can be divided into two parts

Task 1. For all possible subarrays, find their GCD and no of times the occur i.e frequency

Task 2. like @l_returns said, use sieve method for queries. I’ll explain my answer in detail.

My Solution which i’m refering

Line No 162 to 165 => Variable Declaration

Line No 166 to 169 => I build my sparse table, that can give me GCD of any range in log(n) time

Line No 170=> I create a map which would map a GCD with its frequency

Line No 171 to 182 => THIS IS WHERE I CALCULATE GCD WITH THEIR FREQUENCY

Basically GCD if calculated from a index, will either decrease or stay constant. Hence it’s monotonic in nature, so i can use binary search.

Line No 142 to 154 => MY Binary Search function that returns the index of the position where GCD decreases

I start with any index and call it my original GCD. Now I binary search the index where it decreases. Now i use my original index and that index to calculate the increase in the frequency of that GCD, by using the difference in index.

So in nlog(n) time i calculated all possible GCD with their frequency.

So Task 1 Complete Yay!

Now coming to task #2, it’s easy

Line No 186=> I call a function, fillDp();

Line No 197 to 202 => I’m using the sieve method here, i take any GCD (say X) and i add its frequency to all multiples of that GCD, simple! You can learn more about Sieve method from internet

So for all query ‘q’ , we simply outout the dp[q].

Task 2 accomplished ! and time for AC

Let me know if any part of solution is unclear?

malpani1199 · September 18, 2019, 6:06am

Can someone please help me understand this
“(Remember there are only at max O(log(A[i]))” this phrase. I had thought this idea thoroughly during the contest but i was thinking it as in start i will have 1 subarray, then 2 subarrays, then 3 subarrays and so on till n-1 subarrays.
Some of them will have same gcd values but how this sum reduces to nlogn from n^2
For making the gcd frequency array how we won’t take order n^2 time?

s5960r · September 18, 2019, 6:09am

It’ means that If you have a array say
[n1,n2,n3,n4.....nn]

and you start to take total GCD from a number say n2 , then the total GCD will only decrease
and it can decrease to a maximum of log(n2) distinct values.

Now you understand?