CAMOG - Editorial

iceknight1093 · June 21, 2023, 4:30pm

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: astthecoder
Tester: satyam_343
Editorialist: iceknight1093

DIFFICULTY:

2889

PREREQUISITES:

Dynamic programming, expected value, the inclusion-exclusion principle

PROBLEM:

You’re given an integer N.
For an integer 1 \leq x \leq N, consider the following process:

Pick a random integer y between 1 and N
Replace x by \gcd(x, y).

Let \mathbb{E}[x] be the expected number of moves this process takes till x reaches 1.
For each 1 \leq x \leq N, compute \mathbb{E}[x].

EXPLANATION:

Writing out the expected value for a single x, we see that:

\mathbb{E}[x] = \sum_{y=1}^N \frac{1}{N}\cdot \left(1 + \mathbb{E}[\gcd(x, y)] \right) = 1 + \frac{1}{N}\sum_{y=1}^N \mathbb{E}[\gcd(x, y)]

This lends itself to a rather natural (if slow) recursive approach, because \gcd(x, y) \leq x.
The only issue is those y for which \gcd(x, y) = x, since they lead to ‘loops’.

However, such y lead to an \mathbb{E}[x] term anyway, so they can be taken care of with a bit of algebra.
Suppose there are k such y. Then,

\mathbb{E}[x] = 1 + \frac{k}{N}\mathbb{E}[x] + \frac{1}{N}\sum_{\gcd(x, y) \neq x} \mathbb{E}[\gcd(x, y)] \\ \mathbb{E}[x] \cdot \left(1 - \frac{k}{N}\right) = 1 + \frac{1}{N}\sum_{\gcd(x, y) \neq x} \mathbb{E}[\gcd(x, y)]

The right-hand side here now depends only on values strictly less than x, so a dynamic programing approach can compute it.
Once the RHS is known, \mathbb{E}[x] is easily found.
However, it’s still \mathcal{O}(N^2), so we need to speed it up.

Let’s look at the expression we want to compute.
The expression we want to compute is \sum_{\gcd(x, y) \neq x} \mathbb{E}[\gcd(x, y)].

Let’s fix g = \gcd(x, y) and try to find the number of 1 \leq y \leq N such that \gcd(x, y) = g.
If we know this, we can add this count multiplied by \mathbb{E}[\gcd(x, y)] to the overall sum.
g has to be a divisor of x, which greatly limits the number of g we need to consider.

Writing x = g\cdot s and y = g\cdot t, we have \gcd(g\cdot s, g\cdot t) = g, which means \gcd(s, t) = 1.
Further, g\cdot t \leq N, meaning t \leq \frac{N}{g}

So, we really just want to compute the number of integers in the range [1, \frac{N}{g}] that are coprime to s = \frac{x}{g}.

That can be done using inclusion-exclusion on the prime factors of s.

Details

Let M = \frac{N}{g} be our upper bound.
Let the distinct prime factors of s be p_1, p_2, \ldots, p_k.

Then, we have the following:

Initially, pretend everything in [1, M] is coprime to s.
Anything not coprime to s must have some p_i as a factor.
So, subtract all multiples of p_1, p_2, \ldots, p_k from the count.
These are \left \lfloor \frac{M}{p_1} \right\rfloor, \left \lfloor \frac{M}{p_2} \right\rfloor, \ldots, \left \lfloor \frac{M}{p_k} \right\rfloor respectively.
However, now multiples of, for example, p_1p_2 were counted twice.
So, for each pairwise product of primes, add back in the count of its multiples.
Then, you’ll need to subtract the count of multiples of all triplet products, and so on.

In general, we go through each of the 2^k possibilities of products of prime factors, and either add or subtract the count of its multiples depending on how many primes are involved in the product.

A slightly more detailed explanation can be found here.

Notice that if s has k distinct prime factors, this approach takes 2^k or 2^k \cdot k steps, depending on implementation.
Since N \leq 5\cdot 10^5, any integer \leq N has at most 6 distinct prime factors, so k \leq 6 here, which is pretty small.

We do the 2^k operation for each pair of (number, divisor) below N.
It’s well-known that there are \mathcal{O}(N\log N) such pairs, so a simple upper bound on complexity is \mathcal{O}(2^6 \cdot N\log N).

However, in practice most integers have far fewer than 6 prime factors, so the constant factor is a lot better and your code should run fast.
In particular, even with a 2^k \cdot k algorithm that goes through bitmasks, the number of operations is about 10^8 which is not an issue for C++.

A 2^k algorithm for each pair uses less than 4\cdot 10^7 operations.

TIME COMPLEXITY

\mathcal{O}(2^6 \cdot N\log N) per testcase, but with fairly low constant factor.

CODE:

Author's code (C++)

#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace std;
using namespace __gnu_pbds;

#define ll long long int
#define pb push_back
#define all(x) x.begin(),x.end()
#define Max 10000000000000000

template <typename T>
using ordered_set = tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update>;
template <typename T>
using min_heap=priority_queue<T, vector<T>, greater<T>>;

bool chk[500001];
vector<ll> pc[500001];
ll n,dp[500001],mod=1000000007;

ll bigmod(ll v,ll p){
    if(!p) return 1;
    if(p%2) return (v*bigmod(v,p-1))%mod;
    ll temp=bigmod(v,p/2);
    return (temp*temp)%mod;
}

int main()
{
    //freopen("out/k.in","r",stdin);
    //freopen("out/t.out","w",stdout);
    scanf("%lli",&n);

    for(ll i=1;i<=n;i++) pc[i].pb(1);
    
    for(ll i=2;i<=n;i++){
        if(chk[i]) continue;
        for(ll j=i;j<=n;j+=i){
            chk[j]=1;
            ll temp=pc[j].size();
            for(ll k=0;k<temp;k++){
                pc[j].pb(-i*pc[j][k]);
            }
        }
    }
    
    for(ll i=1;i<=n;i++){
        if(i!=1){
            dp[i]=(dp[i]+n)%mod;
            dp[i]=(dp[i]*bigmod(n-n/i,mod-2))%mod;
        }
        for(ll j=2*i;j<=n;j+=i){
            ll d=n/i;
            ll cp=0;
            for(ll k=0;k<pc[j/i].size();k++){
                cp+=d/pc[j/i][k];
            }
            //cout<<j<<" "<<n<<" "<<i<<" "<<cp<<endl;
            dp[j]=(dp[j]+cp*dp[i])%mod;
        }
        printf("%lli ",dp[i]);
    }
    printf("\n");

    return 0;
}

Tester's code (C++)


 
 
#include <bits/stdc++.h>   
#include <ext/pb_ds/tree_policy.hpp>
#include <ext/pb_ds/assoc_container.hpp>
using namespace __gnu_pbds;   
using namespace std;  
#define ll long long
const ll INF_MUL=1e13;
const ll INF_ADD=2e18; 
#define pb push_back                   
#define mp make_pair          
#define nline "\n"                           
#define f first                                          
#define s second                                             
#define pll pair<ll,ll> 
#define all(x) x.begin(),x.end()     
#define vl vector<ll>             
#define vvl vector<vector<ll>>    
#define vvvl vector<vector<vector<ll>>>          
#ifndef ONLINE_JUDGE    
#define debug(x) cerr<<#x<<" "; _print(x); cerr<<nline;
#else
#define debug(x);    
#endif       
void _print(ll x){cerr<<x;}  
void _print(char x){cerr<<x;}     
void _print(string x){cerr<<x;}    
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());   
template<class T,class V> void _print(pair<T,V> p) {cerr<<"{"; _print(p.first);cerr<<","; _print(p.second);cerr<<"}";}
template<class T>void _print(vector<T> v) {cerr<<" [ "; for (T i:v){_print(i);cerr<<" ";}cerr<<"]";}
template<class T>void _print(set<T> v) {cerr<<" [ "; for (T i:v){_print(i); cerr<<" ";}cerr<<"]";}
template<class T>void _print(multiset<T> v) {cerr<< " [ "; for (T i:v){_print(i);cerr<<" ";}cerr<<"]";}
template<class T,class V>void _print(map<T, V> v) {cerr<<" [ "; for(auto i:v) {_print(i);cerr<<" ";} cerr<<"]";} 
typedef tree<ll, null_type, less<ll>, rb_tree_tag, tree_order_statistics_node_update> ordered_set;
typedef tree<ll, null_type, less_equal<ll>, rb_tree_tag, tree_order_statistics_node_update> ordered_multiset;
typedef tree<pair<ll,ll>, null_type, less<pair<ll,ll>>, rb_tree_tag, tree_order_statistics_node_update> ordered_pset;
//--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
const ll MOD=1e9+7;     
const ll MAX=500500; 
ll binpow(ll a,ll b,ll MOD){
    ll ans=1;
    a%=MOD;
    while(b){
        if(b&1)
            ans=(ans*a)%MOD;
        b/=2;
        a=(a*a)%MOD;
    }
    return ans;
}
ll inverse(ll a,ll MOD){
    return binpow(a,MOD-2,MOD);
} 
void solve(){     
    ll n; cin>>n;
    vector<ll> dp(n+5,0),spf(n+5);
    iota(all(spf),0);
    for(ll i=2;i<=n;i++){
        for(ll j=i;j<=n;j+=i){
            spf[j]=min(spf[j],i);
        }
    }
    auto getv=[&](ll n,ll p,ll g){
        if(p%g){
            return 0ll;
        }
        n/=g;
        p/=g;
        vector<ll> factors;
        while(p!=1){
            ll x=spf[p];
            factors.push_back(x);
            while((p%x)==0){
                p/=x;
            }
        }
        ll len=factors.size(),now=0;
        for(ll mask=0;mask<(1<<len);mask++){
            ll div=1;
            for(ll j=0;j<len;j++){
                if(mask&(1<<j)){
                    div*=(-factors[j]);
                }  
            }
            now+=n/div;
        }
        return now;
    }; 
    for(ll i=1;i<=n;i++){
        debug(mp(i,dp[i])); 
        if(i!=1){
            dp[i]+=n;
            dp[i]=(dp[i]*inverse(n-getv(n,i,i),MOD))%MOD;
        }
        for(ll j=i+i;j<=n;j+=i){
            dp[j]=(dp[j]+dp[i]*getv(n,j,i))%MOD;
        }
        cout<<dp[i]<<" \n"[i==n];
    }
    return;                                
}                                                  
int main()                                                                                                 
{            
    ios_base::sync_with_stdio(false);                             
    cin.tie(NULL);        
    #ifndef ONLINE_JUDGE                     
    freopen("input.txt", "r", stdin);                                                    
    freopen("output.txt", "w", stdout);  
    freopen("error.txt", "w", stderr);                          
    #endif                          
    ll test_cases=1;               
    //cin>>test_cases;
    while(test_cases--){
        solve();
    } 
    cerr<<"Time:"<<1000*((double)clock())/(double)CLOCKS_PER_SEC<<"ms\n"; 
}

Editorialist's code (Python)

mod = 10**9 + 7
n = int(input())

masks = [ [1] for _ in range(n+1)]
invs = [pow(i, mod-2, mod) for i in range(n+1)]

for p in range(2, n+1):
    if len(masks[p]) > 1: continue
    for x in range(p, n+1, p):
        sz = len(masks[x])
        for i in range(sz): masks[x].append(-masks[x][i] * p)

def coprime(a, N): # The count of integers in [1, n] that are coprime to a
    ret = 0
    for x in masks[a]:
        if x > 0: ret += N // x
        else: ret -= N // -x    
    return ret

ops = 0

dp = [0]*(n+1)
for d in range(1, n+1):
    if d > 1:
        dp[d] += 1
        dp[d] = dp[d] * invs[n - n//d] % mod * n % mod

    for x in range(2*d, n+1, d):
        dp[x] += coprime(x//d, n//d) * invs[n] % mod * dp[d] % mod
        if dp[x] >= mod: dp[x] -= mod
print(*dp[1:])