MAXIMGCD - Editorial

iceknight1093 · October 27, 2021, 11:59am

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3

Author: Bhavya Girotra
Tester: Radostin Chonev
Editorialist: Nishank Suresh

DIFFICULTY:

Medium

PREREQUISITES:

Observation, familiarity with gcd

PROBLEM:

There is an array of N positive integers.
You can perform at most one move, where you replace two elements of this array by their product.
Find the maximum possible gcd of the resulting array.

QUICK EXPLANATION:

It is enough to consider pairs (i, j) such that i < j and the prefix gcd of the first i elements differs from the prefix gcd of the first i-1.
There are only \mathcal{O}(\log 10^{18}) such positions, giving a solution with \mathcal{O}(N\log 10^{18}) gcd computations, which is already fast enough.
This can be further optimized to \mathcal{O}(N + \log^2 {10^{18}}) gcd computations (though is not required to get AC).

EXPLANATION:

We’ll build the solution up by going through the subtasks.
One important property of gcd to keep in mind is that it is associative, i.e, \gcd(a, b, c) = \gcd(\gcd(a, b), c).

SUBTASK 1

N\leq 2000, so a solution which runs in \mathcal{O}(N^2) is good enough. This means we can check all possible pairs of elements.

Fix a pair of indices i < j which we want to multiply. If we multiply them, what is the gcd of the resulting array?
We can divide the resulting array into 4 parts:

The prefix of length i-1, whose gcd is \gcd(A_1, A_2, \dots, A_{i-1}).
The subarray from i+1 and j-1, with gcd \gcd(A_{i+1}, A_{i+2}, \dots, A_{j-1}).
The suffix of length N - j, with gcd \gcd(A_{j+1}, A_{j+2}, \dots, A_N).
The element A_iA_j.

The overall gcd is simply the gcd of these 4 parts.
To actually compute them, note that (apart from A_iA_j) we only care about the gcd on a contiguous segment of the array.
Thus, simply precomputing all such gcds using dynamic programming suffices to pass this subtask.
To do so, define dp(i, j) = \gcd(A_i, A_{i+1}, \dots, A_j). Then,

dp(i, i) = A_i
dp(i, j) = \gcd(dp(i, j-1), A_j) for i < j.

With this dp table in hand, subtask 1 is solved as described above.

SUBTASK 2

N\leq 200000, so the first solution won’t quite work.
Instead, we will use a different idea.

Look at element A_1. In an optimal solution, either A_1 is multiplied with some other element, or it is not.
Let’s compute the answers for each of these cases separately and take the maximum answer between them.

Case 1: A_1 is multiplied

There are only N-1 possibilities for what to multiply it by here - let’s check all of them.

Suppose we fix an index i > 1. We would like to compute the gcd of A_1A_i, the suffix starting at position i+1, and the subarray between 2 and i-1.
Dealing with the suffix is easy: precompute suffix gcds as suf_i = \gcd(A_i, suf_{i+1}) and this can just be looked up when we need it.

As for the subarray in the middle, we maintain its gcd in a variable as we iterate i from 2 to N, say g. Initially, g = 0.
After computing the required value for some i, set g = \gcd(g, A_i).

This way the value at each index is computed using a constant number of \gcd calls.

Case 2: A_1 is not multiplied

In this case, note that the final answer must be a factor of A_1.
Since A_1 \leq 10^9, we can find all the factors of A_1 in \mathcal{O}(\sqrt{N}) and test for each of them whether it can possibly be the final gcd.

Fix a factor d of A_1.
For any i > 1, if d\mid A_i there is no issue. We only need to worry about the case when d\nmid A_i.

Let S = \left\{i : 2\leq i\leq N \wedge d\nmid A_i\right\}.

If S is empty, d divides the whole array already so the answer is at least d.
If |S|\geq 3, d cannot divide the array no matter what we do, because multiplying two any elements will leave at least one which is not a multiple of d.
If |S| = 2, multiply the elements at those 2 indices and check if d divides them.
If |S| = 1, multiply that element by any other and then d divides the whole array.

Any integer \leq 10^9 has at most 1440 factors, so this works fast enough.

A similar but alternate solution is also possible here - by the pigeonhole principle, one of the first 3 elements will not be multiplied - and hence the final answer must be a factor of one of the first 3 elements.
Now run the above Case 2 solution for each of the first 3 elements, and we are done.
This is slightly slower (in terms of constant factor) than the solution outlined above, but will likely pass in time (at least in C++).

SUBTASK 3

Extending subtask 2 to subtask 3 is unfortunately not possible - numbers as large as 2\cdot 10^{18} can have more than 10^5 factors, and even finding them all quickly enough requires non-trivial algorithms. Finally, even if that was done, testing each one against the whole array would be way too slow, although it might be possible to pass anyway using some heuristics.

Instead we look back at how we solved subtask 1, and improve upon that.

Subtask 1 was solved by testing every pair and multiplying. Is there really a need to test every pair?
Claim: Let P_i = \gcd(A_1, A_2, \dots, A_i), with P_0 = 0. It suffices to test those pairs (i, j) where i is an index such that P_i \neq P_{i-1}.

Proof

Consider some index i such that P_i = P_{i-1}.
The final gcd is definitely going to be a factor of P_{i-1}.

However, we know that P_i = P_{i-1}, i.e, A_i is a multiple of P_{i-1}.
So, irrespective of which index j > i we choose, \gcd(P_{i-1}, A_iA_j) = P_{i-1} = P_i = \gcd(P_{i-1}, A_i); essentially, A_j contributes nothing to the answer.

Thus, we might as well have checked the pair (i-1, j) - the prefix doesn’t lose any factors, the suffix stays exactly the same, and the middle segment has A_i included in it, which does not decrease the answer compared to pair (i, j) because as we showed earlier, \gcd(P_{i-1}, A_iA_j) = \gcd(P_{i-1}, A_i), in other words the answer for (i, j) was already a factor of A_i.

How does this help us? It turns out that there are \leq \log(10^{18}) positions where the prefix gcd changes, because at each step the prefix gcd either remains the same or decreases by at least half.

Thus, we only need to check \mathcal{O}(N\log(10^{18})) pairs of indices, and although each one involves several gcd computataions, it should still fit within the time limit.

Given a pair of indices (i, j) we require the appropriate prefix and suffix gcds, the gcd of the middle segment, and the product A_i A_j.
The prefix and suffix can just be precomputed, and the gcd of the middle segment can be maintained in a variable while iterating j.
The product is a problem though - if A_i and A_j are upto 2\cdot 10^{18}, their product doesn’t even fit into a 64-bit integer!

We use a small trick to overcome this.
Note that we would like to compute \gcd(a, bc) for some values of a, b, c.
We do that by using the following formula:
Let g = \gcd(a, b). Then,

\gcd(a, bc) = g\cdot \gcd(a/g, c)

Intuitively, this just takes all of b's contribution out of a, and then finds the contribution of c to the remaining part.

This much is likely enough to get AC on the third subtask, especially with some early break heuristics.

However, there exists an even faster solution which requires no heuristics whatsoever.

SPEEDING UP SUBTASK 3

We showed that only these indices where the prefix gcd changes matter to be chosen as i - but the exact same logic tells us that the only j which matter are those where the suffix gcd changes.
So, find all indices where either the prefix gcd or the suffix gcd changes, and store them in a set S.
Now it is enough to only consider all pairs of indices from S.

Note that S has at most 2\log{10^{18}} \sim 128 elements, so we only check around 10^4 pairs in total in the worst case.
To make checking these pairs simple, we can do the following:

First, take the \gcd of all elements not in S. Let this be g
Then, run subtask 1 on elements of S, except the initial gcd value for a pair is g instead of 0.

This brings the complexity down to \mathcal{O}(N + \log^2 10^{18}) \gcd computations.

TIME COMPLEXITY:

\mathcal{O}(N\log 10^{18}) or \mathcal{O}(N + \log^2 10^{18}) gcd computations.

CODE:

Setter, Full (C++)

#include <bits/stdc++.h>
using namespace std;
 
#define ll long long
#define mod 1000000007
#define ffor(i,a,b) for(int i=a;i<b;i++)
#define bfor(i,a,b) for(int i=a-1;i>=b;i--)
#define mp make_pair
#define pb push_back
#define ff first
#define ss second
#define mem(x,y) memset(x,y,sizeof(x))
#define all(x) x.begin(),x.end()
#define SP(x) setprecision(x)
#define sz(x) (int)x.size()
#define fast  ios_base::sync_with_stdio(false); cin.tie(NULL); cout.tie(NULL)
#define PI 3.14159265358979323846
#define lb lower_bound
#define ub upper_bound
#define bs binary_search
#define endl '\n'
#define int long long
 
int tryAllPairs(vector<int> v, int currGCD)
{
    int ans = 0;
    int n = v.size() - 2;
 
    int dp[n+2][n+2];
    memset(dp, 0, sizeof(dp));
 
    for(int i=0; i<n+2; i++)
    {
        int gcd=0;
        for(int j=i; j<n+2; j++)
        {
            gcd = __gcd(gcd, v[j]);
            dp[i][j] = gcd;
        }
    }
 
    for(int i=1; i<=n ; i++)
    {
        for(int j=i+1; j<=n; j++)
        {
            int tempGCD = __gcd(currGCD, __gcd(dp[i+1][j-1], __gcd(dp[0][i-1], dp[j+1][n+1])));
 
 
            int gcd = __gcd(tempGCD, v[i]);
            tempGCD /= gcd;
            gcd *= __gcd(tempGCD, v[j]);
 
            ans = max(ans, gcd);
        }
    }
 
    return ans;
}
 
int solve()
{
    int n;
    cin>>n;
 
    int a[n];
    for(int i=0; i<n; i++) cin>>a[i];
 
    if(n==2) return a[0]*a[1];
 
    int vis[n] = {0};
    vector<int> v;
 
    v.push_back(0);
    for(int _=0; _<2; _++)
    {
        int gcd = 0;
        for(int i=0; i<n; i++)
        {
            if(vis[i]) continue;
            int newGCD = __gcd(gcd, a[i]);
            if(newGCD != gcd)
            {
                v.push_back(a[i]);
                vis[i] = 1;
            }
            gcd = newGCD;
        }
    }
    v.push_back(0);
 
    int currGCD = 0;
    for(int i=0; i<n; i++) if(!vis[i]) currGCD = __gcd(currGCD, a[i]);
 
    return tryAllPairs(v, currGCD);
}
 
signed main()
{
    #ifdef LOCAL
    freopen("input.txt", "r", stdin);
    freopen("output.txt", "w", stdout);
    #endif
    
    fast;
 
    int t;
    cin>>t;
    while(t--)
    {
        cout<<solve()<<endl;
    }
 
    return 0;
}

Tester, Full (C++)

#include<bits/stdc++.h>
using namespace std ;

#define MAXN 1000007

int n ;
long long a[ MAXN ] ;
long long pref[ MAXN ] , suff[ MAXN ] ;

vector < int > v ;

long long f ( long long x , long long y ) {
    if ( x < y ) { swap ( x , y ) ; }
    if ( y == 0 ) { return x ; }
    return f ( y , ( x % y ) ) ;
}


void input ( ) {
    scanf ( "%d" , &n ) ;
    for ( int i = 1 ; i <= n ; ++ i ) {
        scanf ( "%lld" , &a[ i ] ) ;
    }
    pref[ 0 ] = suff[ n + 1 ] = 0 ;
    for ( int i = 1 ; i <= n ; ++ i ) {
        pref[ i ] = f ( pref[ i - 1 ] , a[ i ] ) ;
    }
    for ( int i = n ; i >= 1 ; -- i ) {
        suff[ i ] = f ( suff[ i + 1 ] , a[ i ] ) ;
    }
}

long long get_prod ( long long x , long long y , long long aux ) {
    long long ret = f ( x , aux ) ;
    aux /= ret ;
    return ret * f ( y , aux ) ;
}

void solve ( ) {
    v.clear ( ) ;
    for ( int i = 1 ; i <= n ; ++ i ) {
        if ( pref[ i ] != pref[ i - 1 ] ) {
            v.push_back ( i ) ;
        }
    }
    int sz = v.size ( ) ;
    long long ans = 0 ;
    for ( int i = 0 ; i < sz ; ++ i ) {
        long long mid = 0 ;
        for ( int j = v[ i ] + 1 ; j <= n ; ++ j ) {
            if ( suff[ j + 1 ] < ans && suff[ j + 1 ] > 0 ) { break ; }
            if ( mid < ans && mid > 0 ) { break ; }
            long long aux = f ( pref[ v[ i ] - 1 ] , suff[ j + 1 ] ) ;
            aux = f ( aux , mid ) ;
            aux = get_prod ( a[ v[ i ] ] , a[ j ] , aux ) ;
            ans = max ( ans , aux ) ;
            mid = f ( mid , a[ j ] ) ;
        }
        mid = 0 ;
        for ( int j = v[ i ] - 1 ; j >= 1 ; -- j ) {
            if ( suff[ v[ i ] + 1 ] < ans && suff[ v[ i ] + 1 ] > 0 ) { break ; }
            if ( mid < ans && mid > 0 ) { break ; }
            long long aux = f ( pref[ j - 1 ] , suff[ v[ i ] + 1 ] ) ;
            aux = f ( aux , mid ) ;
            aux = get_prod ( a[ v[ i ] ] , a[ j ] , aux ) ;
            ans = max ( ans , aux ) ;
            mid = f ( mid , a[ j ] ) ;
        }
    }
    printf ( "%lld\n" , ans ) ;
}

int main ( ) {
    ios_base :: sync_with_stdio ( false ) ;
    cin.tie ( NULL ) ;
    int t ;
    /// t = 1 ;
    scanf ( "%d" , &t ) ;
    /// cin >> t ;
    while ( t -- ) {
        input ( ) ;
        solve ( ) ;
    }
    return 0 ;
}

Editorialist, Full (C++)

#include "bits/stdc++.h"
// #pragma GCC optimize("O3,unroll-loops")
// #pragma GCC target("sse,sse2,sse3,ssse3,sse4,popcnt,mmx,avx,avx2")
using namespace std;
using ll = long long int;
mt19937_64 rng(chrono::high_resolution_clock::now().time_since_epoch().count());

int main()
{
    ios::sync_with_stdio(0); cin.tie(0);

    int t; cin >> t;
    while (t--) {
        int n; cin >> n;
        vector<ll> v(n+1), pref_gcd(n+1), suf_gcd(n+2);
        for (int i = 1; i <= n; ++i) {
            cin >> v[i];
            pref_gcd[i] = suf_gcd[i] = v[i];
        }
        for (int i = 2; i <= n; ++i)
            pref_gcd[i] = gcd(pref_gcd[i-1], v[i]);
        for (int i = n-1; i >= 1; --i)
            suf_gcd[i] = gcd(suf_gcd[i+1], v[i]);

        auto prod_gcd = [] (ll a, ll b, ll c) {
            // Compute gcd(a, bc)
            ll g = gcd(a, b);
            return g * gcd(a/g, c);
        };
        
        ll ans = 0;
        for (int i = 1; i <= n; ++i) {
            if (pref_gcd[i] == pref_gcd[i-1])
                continue;
            if (pref_gcd[i-1] > 0 and pref_gcd[i-1] <= ans)
                continue;

            ll mid_gcd = pref_gcd[i-1];
            for (int j = i+1; j <= n; mid_gcd = gcd(mid_gcd, v[j++])) {
                if (suf_gcd[j+1] > 0 and suf_gcd[j+1] <= ans) continue;
                ll cur = gcd(suf_gcd[j+1], mid_gcd);
                if (cur > 0 and cur <= ans) continue;
                cur = prod_gcd(cur, v[i], v[j]);
                ans = max(ans, cur);

                if (mid_gcd > 0 and mid_gcd <= ans) break;
            }
        }
        cout << ans << '\n';
    }
}

Editorialist, Subtask 1 (Python)

# Subtask 1

import sys
from math import gcd
input = sys.stdin.readline
for _ in range(int(input())):
    n = int(input())
    a = list(map(int, input().split()))
    pref_gcd = [0]*(n+2)
    suf_gcd = [0]*(n+2)
    for i in range(n):
        pref_gcd[i+1] = suf_gcd[i+1] = a[i]
    for i in range(1, n+1):
        pref_gcd[i] = gcd(pref_gcd[i-1], pref_gcd[i])
        suf_gcd[n+1-i] = gcd(suf_gcd[n+1-i], suf_gcd[n+2-i])
    ans = 0
    for i in range(1, n+1):
        gcd1 = pref_gcd[i-1]
        for j in range(i+1, n+1):
            cur = gcd(gcd1, suf_gcd[j+1])
            cur = gcd(cur, a[i-1] * a[j-1])
            ans = max(ans, cur)

            gcd1 = gcd(gcd1, a[j-1])
    print(ans)

Editorialist, Subtask 2 (C++)

// Subtask 2

#include "bits/stdc++.h"
// #pragma GCC optimize("O3,unroll-loops")
// #pragma GCC target("sse,sse2,sse3,ssse3,sse4,popcnt,mmx,avx,avx2")
using namespace std;
using ll = long long int;
mt19937_64 rng(chrono::high_resolution_clock::now().time_since_epoch().count());

int main()
{
    ios::sync_with_stdio(0); cin.tie(0);

    int t; cin >> t;
    while (t--) {
        int n; cin >> n;
        vector<ll> v(n);
        for (ll &x : v)
            cin >> x;
        sort(begin(v), end(v));
        vector<ll> suf_gcd(n+1);
        for (int i = n-1; i >= 0; --i) {
            suf_gcd[i] = gcd(v[i], suf_gcd[i+1]);
        }
        ll ans = 0, mid_gcd = 0;
        for (int i = 1; i < n; ++i) {
            ll cur = gcd(mid_gcd, suf_gcd[i+1]);
            cur = gcd(cur, v[0] * v[i]);
            ans = max(ans, cur);
            mid_gcd = gcd(mid_gcd, v[i]);
        }
        auto check = [&] (int d) {
            // Can the gcd be a multiple of d?
            bool good = true;
            ll cur = -1, cur2 = -1;
            for (int i = 1; i < n; ++i) {
                if (v[i]%d == 0) continue;
                if (cur == -1) cur = v[i];
                else if (cur2 == -1) cur2 = v[i];
                else good = false;
            }
            if (cur != -1) {
                if (cur2 == -1 or (cur * cur2)%d != 0) good = false;
            }
            return good;
        };
        for (ll d = 1; d*d <= v[0]; ++d) {
            if (v[0]%d) continue;
            if (check(d)) ans = max(ans, d);
            if (check(v[0]/d)) ans = max(ans, v[0]/d);
        }
        cout << ans << '\n';
    }
}

shubu9211 · October 27, 2021, 8:09pm

There is a typo i think. You have written Case 1 : A_i … it should be Case 1 : A₁

iceknight1093 · October 28, 2021, 12:42am

Thanks for noticing, fixed.

nav1285 · October 28, 2021, 3:19am

Any integer <= 10^9 has at most 1440 factors.
Can you explain please how is this result true.
Thanks.

iceknight1093 · October 28, 2021, 3:26am

Look at the sequence of Highly composite numbers.
Sequence A002182 tells us that 1102701600 > 10^9 is highly composite, and it has 1440 factors.

pribic · October 28, 2021, 7:51am

For substak 1, we can also use segment tree to quickly calculate gcd of all the 3 ranges right?

iceknight1093 · October 28, 2021, 8:16am

Yes, you can use a segment tree or sparse table (or any other data structure which works fast enough) to compute the gcd of a range.

That also applies to the other two subtasks.

arpit92_8 · October 29, 2021, 7:39am

i am trying to implement subtask 1 but it is not working can anyone please tell me where i am making the mistake

https://www.codechef.com/viewsolution/53281948

iceknight1093 · October 29, 2021, 7:44am

ans=max(ans,gcd(a,gcd(b,gcd(c,d))));
should be inside the inner loop.

arpit92_8 · October 29, 2021, 7:49am

Thank you very much for the help

anon46960839 · October 29, 2021, 12:41pm

i used a segment tree for the first subtask, but it gave tle for one test case. can someone please help? CodeChef: Practical coding for everyone

iceknight1093 · October 29, 2021, 1:16pm

Using a segment tree adds an extra \mathcal{O}(\log N) factor, which could make things too slow, especially if it’s recursive. An iterative segment tree should be a fair bit faster, and using a sparse table to query ranges in \mathcal{O}(1) should be even faster (though you would need to modify your approach slightly because sparse tables don’t support updates).

In this specific case, reducing the number of segment tree update/query operations manages to get the code to run in just under the time limit (53287374, notice that you did 4 updates and one query per pair, I changed it to two updates and one query).

anon46960839 · October 29, 2021, 5:42pm

Yeah i got it now! Thanks for the reply btw

hackapie · November 9, 2021, 4:13am

Why does the setter’s code runs the outer loop two times while forming the v vector in the solve function? I mean won’t we get the required elements in first run only ? why run it again?