CAMOG - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: astthecoder
Tester: satyam_343
Editorialist: iceknight1093

DIFFICULTY:

2889

PREREQUISITES:

Dynamic programming, expected value, the inclusion-exclusion principle

PROBLEM:

You’re given an integer N.
For an integer 1 \leq x \leq N, consider the following process:

  • Pick a random integer y between 1 and N
  • Replace x by \gcd(x, y).

Let \mathbb{E}[x] be the expected number of moves this process takes till x reaches 1.
For each 1 \leq x \leq N, compute \mathbb{E}[x].

EXPLANATION:

Writing out the expected value for a single x, we see that:

\mathbb{E}[x] = \sum_{y=1}^N \frac{1}{N}\cdot \left(1 + \mathbb{E}[\gcd(x, y)] \right) = 1 + \frac{1}{N}\sum_{y=1}^N \mathbb{E}[\gcd(x, y)]

This lends itself to a rather natural (if slow) recursive approach, because \gcd(x, y) \leq x.
The only issue is those y for which \gcd(x, y) = x, since they lead to ‘loops’.

However, such y lead to an \mathbb{E}[x] term anyway, so they can be taken care of with a bit of algebra.
Suppose there are k such y. Then,

\mathbb{E}[x] = 1 + \frac{k}{N}\mathbb{E}[x] + \frac{1}{N}\sum_{\gcd(x, y) \neq x} \mathbb{E}[\gcd(x, y)] \\ \mathbb{E}[x] \cdot \left(1 - \frac{k}{N}\right) = 1 + \frac{1}{N}\sum_{\gcd(x, y) \neq x} \mathbb{E}[\gcd(x, y)]

The right-hand side here now depends only on values strictly less than x, so a dynamic programing approach can compute it.
Once the RHS is known, \mathbb{E}[x] is easily found.
However, it’s still \mathcal{O}(N^2), so we need to speed it up.

Let’s look at the expression we want to compute.
The expression we want to compute is \sum_{\gcd(x, y) \neq x} \mathbb{E}[\gcd(x, y)].

Let’s fix g = \gcd(x, y) and try to find the number of 1 \leq y \leq N such that \gcd(x, y) = g.
If we know this, we can add this count multiplied by \mathbb{E}[\gcd(x, y)] to the overall sum.
g has to be a divisor of x, which greatly limits the number of g we need to consider.

Writing x = g\cdot s and y = g\cdot t, we have \gcd(g\cdot s, g\cdot t) = g, which means \gcd(s, t) = 1.
Further, g\cdot t \leq N, meaning t \leq \frac{N}{g}

So, we really just want to compute the number of integers in the range [1, \frac{N}{g}] that are coprime to s = \frac{x}{g}.

That can be done using inclusion-exclusion on the prime factors of s.

Details

Let M = \frac{N}{g} be our upper bound.
Let the distinct prime factors of s be p_1, p_2, \ldots, p_k.

Then, we have the following:

  • Initially, pretend everything in [1, M] is coprime to s.
  • Anything not coprime to s must have some p_i as a factor.
    So, subtract all multiples of p_1, p_2, \ldots, p_k from the count.
    These are \left \lfloor \frac{M}{p_1} \right\rfloor, \left \lfloor \frac{M}{p_2} \right\rfloor, \ldots, \left \lfloor \frac{M}{p_k} \right\rfloor respectively.
  • However, now multiples of, for example, p_1p_2 were counted twice.
    So, for each pairwise product of primes, add back in the count of its multiples.
  • Then, you’ll need to subtract the count of multiples of all triplet products, and so on.

In general, we go through each of the 2^k possibilities of products of prime factors, and either add or subtract the count of its multiples depending on how many primes are involved in the product.

A slightly more detailed explanation can be found here.

Notice that if s has k distinct prime factors, this approach takes 2^k or 2^k \cdot k steps, depending on implementation.
Since N \leq 5\cdot 10^5, any integer \leq N has at most 6 distinct prime factors, so k \leq 6 here, which is pretty small.


We do the 2^k operation for each pair of (number, divisor) below N.
It’s well-known that there are \mathcal{O}(N\log N) such pairs, so a simple upper bound on complexity is \mathcal{O}(2^6 \cdot N\log N).

However, in practice most integers have far fewer than 6 prime factors, so the constant factor is a lot better and your code should run fast.
In particular, even with a 2^k \cdot k algorithm that goes through bitmasks, the number of operations is about 10^8 which is not an issue for C++.

A 2^k algorithm for each pair uses less than 4\cdot 10^7 operations.

TIME COMPLEXITY

\mathcal{O}(2^6 \cdot N\log N) per testcase, but with fairly low constant factor.

CODE:

Author's code (C++)
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace std;
using namespace __gnu_pbds;

#define ll long long int
#define pb push_back
#define all(x) x.begin(),x.end()
#define Max 10000000000000000

template <typename T>
using ordered_set = tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update>;
template <typename T>
using min_heap=priority_queue<T, vector<T>, greater<T>>;

bool chk[500001];
vector<ll> pc[500001];
ll n,dp[500001],mod=1000000007;

ll bigmod(ll v,ll p){
    if(!p) return 1;
    if(p%2) return (v*bigmod(v,p-1))%mod;
    ll temp=bigmod(v,p/2);
    return (temp*temp)%mod;
}

int main()
{
    //freopen("out/k.in","r",stdin);
    //freopen("out/t.out","w",stdout);
    scanf("%lli",&n);

    for(ll i=1;i<=n;i++) pc[i].pb(1);
    
    for(ll i=2;i<=n;i++){
        if(chk[i]) continue;
        for(ll j=i;j<=n;j+=i){
            chk[j]=1;
            ll temp=pc[j].size();
            for(ll k=0;k<temp;k++){
                pc[j].pb(-i*pc[j][k]);
            }
        }
    }
    
    for(ll i=1;i<=n;i++){
        if(i!=1){
            dp[i]=(dp[i]+n)%mod;
            dp[i]=(dp[i]*bigmod(n-n/i,mod-2))%mod;
        }
        for(ll j=2*i;j<=n;j+=i){
            ll d=n/i;
            ll cp=0;
            for(ll k=0;k<pc[j/i].size();k++){
                cp+=d/pc[j/i][k];
            }
            //cout<<j<<" "<<n<<" "<<i<<" "<<cp<<endl;
            dp[j]=(dp[j]+cp*dp[i])%mod;
        }
        printf("%lli ",dp[i]);
    }
    printf("\n");

    return 0;
}
Tester's code (C++)

 
 
#include <bits/stdc++.h>   
#include <ext/pb_ds/tree_policy.hpp>
#include <ext/pb_ds/assoc_container.hpp>
using namespace __gnu_pbds;   
using namespace std;  
#define ll long long
const ll INF_MUL=1e13;
const ll INF_ADD=2e18; 
#define pb push_back                   
#define mp make_pair          
#define nline "\n"                           
#define f first                                          
#define s second                                             
#define pll pair<ll,ll> 
#define all(x) x.begin(),x.end()     
#define vl vector<ll>             
#define vvl vector<vector<ll>>    
#define vvvl vector<vector<vector<ll>>>          
#ifndef ONLINE_JUDGE    
#define debug(x) cerr<<#x<<" "; _print(x); cerr<<nline;
#else
#define debug(x);    
#endif       
void _print(ll x){cerr<<x;}  
void _print(char x){cerr<<x;}     
void _print(string x){cerr<<x;}    
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());   
template<class T,class V> void _print(pair<T,V> p) {cerr<<"{"; _print(p.first);cerr<<","; _print(p.second);cerr<<"}";}
template<class T>void _print(vector<T> v) {cerr<<" [ "; for (T i:v){_print(i);cerr<<" ";}cerr<<"]";}
template<class T>void _print(set<T> v) {cerr<<" [ "; for (T i:v){_print(i); cerr<<" ";}cerr<<"]";}
template<class T>void _print(multiset<T> v) {cerr<< " [ "; for (T i:v){_print(i);cerr<<" ";}cerr<<"]";}
template<class T,class V>void _print(map<T, V> v) {cerr<<" [ "; for(auto i:v) {_print(i);cerr<<" ";} cerr<<"]";} 
typedef tree<ll, null_type, less<ll>, rb_tree_tag, tree_order_statistics_node_update> ordered_set;
typedef tree<ll, null_type, less_equal<ll>, rb_tree_tag, tree_order_statistics_node_update> ordered_multiset;
typedef tree<pair<ll,ll>, null_type, less<pair<ll,ll>>, rb_tree_tag, tree_order_statistics_node_update> ordered_pset;
//--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
const ll MOD=1e9+7;     
const ll MAX=500500; 
ll binpow(ll a,ll b,ll MOD){
    ll ans=1;
    a%=MOD;
    while(b){
        if(b&1)
            ans=(ans*a)%MOD;
        b/=2;
        a=(a*a)%MOD;
    }
    return ans;
}
ll inverse(ll a,ll MOD){
    return binpow(a,MOD-2,MOD);
} 
void solve(){     
    ll n; cin>>n;
    vector<ll> dp(n+5,0),spf(n+5);
    iota(all(spf),0);
    for(ll i=2;i<=n;i++){
        for(ll j=i;j<=n;j+=i){
            spf[j]=min(spf[j],i);
        }
    }
    auto getv=[&](ll n,ll p,ll g){
        if(p%g){
            return 0ll;
        }
        n/=g;
        p/=g;
        vector<ll> factors;
        while(p!=1){
            ll x=spf[p];
            factors.push_back(x);
            while((p%x)==0){
                p/=x;
            }
        }
        ll len=factors.size(),now=0;
        for(ll mask=0;mask<(1<<len);mask++){
            ll div=1;
            for(ll j=0;j<len;j++){
                if(mask&(1<<j)){
                    div*=(-factors[j]);
                }  
            }
            now+=n/div;
        }
        return now;
    }; 
    for(ll i=1;i<=n;i++){
        debug(mp(i,dp[i])); 
        if(i!=1){
            dp[i]+=n;
            dp[i]=(dp[i]*inverse(n-getv(n,i,i),MOD))%MOD;
        }
        for(ll j=i+i;j<=n;j+=i){
            dp[j]=(dp[j]+dp[i]*getv(n,j,i))%MOD;
        }
        cout<<dp[i]<<" \n"[i==n];
    }
    return;                                
}                                                  
int main()                                                                                                 
{            
    ios_base::sync_with_stdio(false);                             
    cin.tie(NULL);        
    #ifndef ONLINE_JUDGE                     
    freopen("input.txt", "r", stdin);                                                    
    freopen("output.txt", "w", stdout);  
    freopen("error.txt", "w", stderr);                          
    #endif                          
    ll test_cases=1;               
    //cin>>test_cases;
    while(test_cases--){
        solve();
    } 
    cerr<<"Time:"<<1000*((double)clock())/(double)CLOCKS_PER_SEC<<"ms\n"; 
}   
Editorialist's code (Python)
mod = 10**9 + 7
n = int(input())

masks = [ [1] for _ in range(n+1)]
invs = [pow(i, mod-2, mod) for i in range(n+1)]

for p in range(2, n+1):
    if len(masks[p]) > 1: continue
    for x in range(p, n+1, p):
        sz = len(masks[x])
        for i in range(sz): masks[x].append(-masks[x][i] * p)

def coprime(a, N): # The count of integers in [1, n] that are coprime to a
    ret = 0
    for x in masks[a]:
        if x > 0: ret += N // x
        else: ret -= N // -x    
    return ret

ops = 0

dp = [0]*(n+1)
for d in range(1, n+1):
    if d > 1:
        dp[d] += 1
        dp[d] = dp[d] * invs[n - n//d] % mod * n % mod

    for x in range(2*d, n+1, d):
        dp[x] += coprime(x//d, n//d) * invs[n] % mod * dp[d] % mod
        if dp[x] >= mod: dp[x] -= mod
print(*dp[1:])
1 Like