TREEDIV - Editorial

PROBLEM LINK:

Contest
Practice
Setter: iceknight1093,, jainmilind
Testers: tabr
Editorialist: aryanag_adm

DIFFICULTY:

2782

PREREQUISITES:

Factorisation, Small to Large

PROBLEM:

You are given a rooted tree T with N vertices. Each vertex v has an integer A_v associated with it. For each v, you need to output the number of distinct divisors of \prod_{u \in \text{subtree of }v} A_u.

EXPLANATION:

We can precompute the prime factorisation of all numbers between 1 and 10^6 in 10^6 \times \log 10^6 steps, which is fast enough. We will assume we have these prime factorisations available.

Now, if x = \prod_{i=1}^k p_i^{y_i} where p_i are prime and y_i are positive integers, then the number of distinct divisors of x is \prod_{i=1}^k (y_i + 1).

Now, let’s say we are given the prime factorisation (modulo 10^9 + 7) and number of divisors d of a large number X. We want to compute the prime factorisation and number of divisors d' of X \cdot x, where x \leq 10^6. Let’s say X = \prod_{p \in P} p^{y_p}, x = \prod_{p \in P'} p^{z_p}. Then d' = d \cdot (\prod_{p \in P' \setminus P} {z_p + 1}) \cdot (\prod_{p \in P \cap P'} \frac{x_p + y_p + 1}{y_p + 1}). The complete prime factorisation can be similarly computed. Note that there are at most P' algebraic operations here, and P' \leq \log 10^6 < 20. Therefore, we can do this fast.

Now our algorithm is easy. If v is a leaf, the answer can be computed directly using prime factorisation of A_v. For any non leaf node v, we can use small-to-large along with the merging algorithm we saw above to compute the answer fast.

TIME COMPLEXITY:

Roughly, the time complexity is O(1e6 \log 1e6 + 20 N \log^3 N ).

SOLUTION:

Editorialist's Solution
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
//#include <sys/resource.h>
#define int long long
#define double long double
#define initrand mt19937 mt_rand(time(0));
#define rand mt_rand()
#define MOD 1000000007
#define INF 1000000000
#define mid(l, u) ((l+u)/2)
#define rchild(i) (i*2 + 2)
#define lchild(i) (i*2 + 1)
#define mp(a, b) make_pair(a, b)
#define lz lazup(l, u, i);
#define ordered_set tree<pair<int, int>, null_type,less<pair<int, int>>, rb_tree_tag,tree_order_statistics_node_update>
using namespace std;
using namespace __gnu_pbds;
const int maxn = 3e4 + 5;
int ans[maxn], A[maxn], sz[maxn], st[maxn], ft[maxn], ansd[(int)1e6 + 1], tt = 0, currdiv = 1;;
vector<int> adj[maxn], ver;
vector<pair<int, int>> divisors[(int)1e6 + 1];
bool notPrime[(int)1e6 + 1];
map<int, int> currfac;
int ex(int a, int b){
    if(b==0) return 1;
    int c = ex(a, b/2);
    c *= c;
    c %= MOD;
    if(b%2) c *= a;
    c %= MOD;
    return c;
}
int inverse(int x){
    return ex(x, MOD - 2);
}
int d(int a, int b){
    return (a*inverse(b))%MOD;
}
int m(int a, int b){
    return (a*b)%MOD;
}
void getsz(int v, int p){
    st[v] = tt++;
    ver.push_back(v);
    sz[v] = 1;  // every vertex has itself in its subtree
    for(auto u : adj[v])
        if(u != p){
            getsz(u, v);
            sz[v] += sz[u]; // add size of child u to its parent(v)
        }
    ft[v] = tt;
}
void dfs(int v, int p, bool keep){
    int mx = -1, bigChild = -1;
    for(auto u : adj[v])
        if(u != p && sz[u] > mx)
            mx = sz[u], bigChild = u;
    for(auto u : adj[v])
        if(u != p && u != bigChild)
            dfs(u, v, 0);  // run a dfs on small childs and clear them from cnt
    if(bigChild != -1)
        dfs(bigChild, v, 1);  // bigChild marked as big and not cleared from cnt
    for(auto u : adj[v])
        if(u != p && u != bigChild)
            for(int p = st[u]; p < ft[u]; p++)
                for(pair<int, int> j: divisors[A[ver[p]]]){
                    currdiv = d(currdiv, currfac[j.first] + 1);
                    currfac[j.first] += j.second;
                    currdiv = m(currdiv, currfac[j.first] + 1);
                }
    for(pair<int, int> j: divisors[A[v]]){
        currdiv = d(currdiv, currfac[j.first] + 1);
        currfac[j.first] += j.second;
        currdiv = m(currdiv, currfac[j.first] + 1);
    }
    ans[v] = currdiv;
    //now cnt[c] is the number of vertices in subtree of vertex v that has color c. You can answer the queries easily.
    if(keep == 0) {
        currdiv = 1;
        currfac.clear();
    }
}
signed main(){
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    cout.tie(NULL);
    for(int i = 1;i<=1e6;i++) ansd[i] = 1;
    for(int i = 2;i<=1e6;i++){
        if(notPrime[i]) continue;
        for(int j = i;j<=1e6;j+=i){
            notPrime[j] = true;
            int mult = 0;
            int temp = j;
            while((temp%i) == 0){
                temp /= i;
                mult++;
            }
            ansd[j] *= (mult + 1);
            ansd[j] %= MOD;
            divisors[j].push_back({i, mult});
        }
    }
    int t;
    cin>>t;
    while(t--){
        int n;
        cin>>n;
        ver.clear();
        currfac.clear();
        tt = 0;
        for(int i = 0;i<n;i++) adj[i].clear();
        for(int i = 0;i<n;i++) cin>>A[i];
        for(int i = 1;i<n;i++){
            int u, v;
            cin>>u>>v;
            u--;
            v--;
            adj[u].push_back(v);
            adj[v].push_back(u);
        }
        tt = 0;
        getsz(0, 0);
        currdiv = 1;
        dfs(0, 0, 9);
        for(int i = 0;i<n;i++){
            cout<<ans[i]<<" ";
        }
        cout<<endl;
    }
}

Where do the three logs in the complexity come from? I guess one is from small to large merging and another from the map. What am I missing?

I guess the time complexity of binary search is also included

I have a O(N\log N+10^6 \log 10^6) solution.

1 Like

One comes from small to large merging, one comes from the map, and the final one comes from finding inverses modulo 1e9 + 7.

Although finding inverses actually takes \log 10^9 + 7 time, not \log N. The map also, I believe, takes at most \log 10^6 (number of elements in the map) time to access, not \log N. All of these are \leq 30 so I just wrote \log N for simplicity.

1 Like

Can you share your approach? Or you just optimized the same approach with precomputation and unordered_maps?

Shouldn’t x be equal to multiplication of p_i ^ y_i. Instead of summation? Thanks in advance.

Yeah, I too think so
Maybe it’s a typo

Yup, fixed. Thanks for pointing it out.

I get how the merging works if one of the nodes is a leaf node. But when both the nodes are not leaf nodes, how is the number of operations limited to 20?

Here’s an analysis of a successful Java solution I found in the public submissions:

I had TLE. The 2 optimizations tricks from the submission should help in similar situations in the future where you need to merge large Hash tables to compute the root node value.