CHEFDIVISION - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: ziad_el_gafy
Tester: jay_1048576
Editorialist: iceknight1093

DIFFICULTY:

3025

PREREQUISITES:

Subtree updates/queries on a tree; segment trees with lazy propagation

PROBLEM:

You’re given a tree with N vertices; the i-th has value v_i.
Support the following updates and queries on it:

  1. Given u and p, multiply the values of all values in the subtree of u, by v_i.
  2. Given u and p, fully divide the values of all values in the subtree of u, by v.
    Here, fully divide means “while x is divisible by p, divide it by p”.
  3. Given u, count the number of primes that divide all values in the subtree of u.

it is known that p, v_i \leq 700.

EXPLANATION:

The third query suggests that looking at the prime factorizations of the values is helpful.
Looking at the how the updates affect values in terms of their factorizations:

  • The first update means “all values in the subtree of u now have p as a prime factor”.
  • The second update means “all values in the subtree of u no longer have p as a prime factor”.

This means the prime factorizations of each element are rather boolean in nature: for each element, we only need to know which primes exist; we don’t care about their powers.

Since p, v_i \leq 700, there are only 125 primes we care about in total.
So, let’s represent the prime factorization of each element by a 125-bit bitmask.
Let M_i denote the mask for vertex i.
This bitmask can be represented using two 64-bit integers in most languages (or, in C++, a bitset<125>).

Translating our updates and queries to this bitmask:

  • Update 1 means “set a certain bit to 1 for all masks in the subtree of u”.
    This can be represented by a bitwise OR operation.
  • Update 2 means “set a certain bit to 0 for all masks in the subtree of u”.
    This can be represented by a bitwise AND operation.
  • Update 3 means “find the number of bits that are set in all masks in the subtree of u”.
    This can be thought of as “find the bitwise AND of all masks in the subtree of u, then count the number of set bits”.

We now need to perform these efficiently.


First off, as always we can transform subtree operations into subarray operations by flattening the tree via an Euler tour.
Now, we need to support the following. We have an array M of length N. Then,

  • Given (L, R, x), set M_i \to M_i \mid x for all L \leq x \leq R.
  • Given (L, R, x), set M_i \to M_i \ \&\ x for all L \leq x \leq R.
  • Given (L, R), compute the bitcount of (M_L \ \& \ M_{L+1} \ \& \ \ldots \ \& \ M_R).

Since we’re performing lazy propagation, we need to be able to combine updates; and this is somewhat non-trivial because there are two completely different types of updates.

However, it is indeed possible to do so.
For each bit, there are three possible updates to it:

  • Keep it the same
  • Set it to 1
  • Set it to 0

So, each update is a ternary state.
Combining these ternary states, however, can be done using just binary operations as follows:

  • Each update type consists of two bitmasks x and y.
    By default, x has all its bits set and y = 0.
  • Then, to push an update (x_1, y_1) from a parent node in the segment tree to its child whose update is currently (x_2, y_2), do the following:
    • Set x_2 \to (x_2 \ \& \ x_1) \mid y_1
    • Set y_2 \to (y_2 \ \& \ x_1)\mid y_1

This process of performing the AND update before the OR one is always correct, and can be verified manually against all 3\times 3 possibilities of combining updates for a single bit.

Since we’re using purely bitwise operations, this is now just standard lazy propagation on a segment tree whose time complexity is \mathcal{O}(Q\log N), which is fast enough.

TIME COMPLEXITY

\mathcal{O}(N + Q\log N) per testcase.

CODE:

Tester's code (C++)
/*...................................................................*
 *............___..................___.....____...______......___....*
 *.../|....../...\........./|...../...\...|.............|..../...\...*
 *../.|...../.....\......./.|....|.....|..|.............|.../........*
 *....|....|.......|...../..|....|.....|..|............/...|.........*
 *....|....|.......|..../...|.....\___/...|___......../....|..___....*
 *....|....|.......|.../....|...../...\.......\....../.....|./...\...*
 *....|....|.......|../_____|__..|.....|.......|..../......|/.....\..*
 *....|.....\...../.........|....|.....|.......|.../........\...../..*
 *..__|__....\___/..........|.....\___/...\___/.../..........\___/...*
 *...................................................................*
 */
 
#include <bits/stdc++.h>
using namespace std;
#define int long long
#define INF 1000000000000000000
#define MOD 1000000007

const int N = 300005;
int spf[N];
vector<int> primes;
int idx[702];
bitset<125> t[4*N];
bitset<125> lazy[4*N][2];
bitset<125> all,none;
vector<int> euler;
int in[N],out[N];

void init()
{
    all.set();
    none.reset();
    for(int i=0;i<N;i++)
        spf[i]=i;
    for(int i=2;i<N;i++)
        if(spf[i]==i)
            for(int j=2*i;j<N;j+=i)
                if(spf[j]==j)
                    spf[j]=i;
    for(int i=2;i<=700;i++)
        if(spf[i]==i)
            primes.push_back(i);
    for(int i=0;i<primes.size();i++)
        idx[primes[i]]=i;
}

void dfs(vector<int> adj[],int u,int p)
{
    in[u]=euler.size();
    euler.push_back(u);
    for(auto v:adj[u])
        if(v!=p)
            dfs(adj,v,u);
    out[u]=euler.size()-1;
}

void build(int v,int tl,int tr,bitset<125> a[])
{
    lazy[v][0] = all;
    lazy[v][1] = none;
    if(tl==tr)
    {
        t[v] = a[tl];
        return;
    }
    int tm = (tl+tr)/2;
    build(v*2,tl,tm,a);
    build(v*2+1,tm+1,tr,a);
    t[v] = (t[v*2] & t[v*2+1]);
}

void push(int v) 
{
    t[v*2] &= lazy[v][0];
    lazy[v*2][0] &= lazy[v][0];
    lazy[v*2][1] &= lazy[v][0];
    t[v*2] |= lazy[v][1];
    lazy[v*2][0] |= lazy[v][1];
    lazy[v*2][1] |= lazy[v][1];
    t[v*2+1] &= lazy[v][0];
    lazy[v*2+1][0] &= lazy[v][0];
    lazy[v*2+1][1] &= lazy[v][0];
    t[v*2+1] |= lazy[v][1];
    lazy[v*2+1][0] |= lazy[v][1];
    lazy[v*2+1][1] |= lazy[v][1];
    lazy[v][0] = all;
    lazy[v][1] = none;
}

void bitwise_and_update(int v,int tl,int tr,int l,int r,bitset<125> a) 
{
    if(l>r) 
        return;
    if(l==tl && tr==r) 
    {
        t[v] &= a;
        lazy[v][0] &= a;
        lazy[v][1] &= a;
    }
    else
    {
        push(v);
        int tm=(tl+tr)/2;
        bitwise_and_update(v*2,tl,tm,l,min(r,tm),a);
        bitwise_and_update(v*2+1,tm+1,tr,max(l,tm+1),r,a);
        t[v] = (t[v*2] & t[v*2+1]);
    }
}

void bitwise_or_update(int v,int tl,int tr,int l,int r,bitset<125> a) 
{
    if(l>r) 
        return;
    if(l==tl && tr==r) 
    {
        t[v] |= a;
        lazy[v][0] |= a;
        lazy[v][1] |= a;
    }
    else
    {
        push(v);
        int tm=(tl+tr)/2;
        bitwise_or_update(v*2,tl,tm,l,min(r,tm),a);
        bitwise_or_update(v*2+1,tm+1,tr,max(l,tm+1),r,a);
        t[v] = (t[v*2] & t[v*2+1]);
    }
}

bitset<125> bitwise_and_query(int v,int tl,int tr,int l,int r) 
{
    if(l>r)
        return all;
    if(l==tl && tr==r)
        return t[v];
    push(v);
    int tm=(tl+tr)/2;
    return (bitwise_and_query(v*2,tl,tm,l,min(r,tm)) & 
            bitwise_and_query(v*2+1,tm+1,tr,max(l,tm+1),r));
}

void solve(int tc)
{
    int n,q;
    cin >> n >> q;
    int a[n];
    for(int i=0;i<n;i++)
        cin >> a[i];
    vector<int> adj[n];
    for(int i=0;i<n-1;i++)
    {
        int u,v;
        cin >> u >> v;
        u--,v--;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }
    init();
    dfs(adj,0,0);
    bitset<125> b[n];
    for(int i=0;i<n;i++)
    {
        for(int j=0;j<125;j++)
        {
            if(a[i]%primes[j]==0)
                b[in[i]].set(j,1);
            else
                b[in[i]].set(j,0);
        }
    }
    build(1,0,n-1,b);
    while(q--)
    {
        int type;
        cin >> type;
        if(type==1)
        {
            int u,p;
            cin >> u >> p;
            u--;
            bitset<125> temp = none;
            temp.set(idx[p],1);
            bitwise_or_update(1,0,n-1,in[u],out[u],temp);
        }
        else if(type==2)
        {
            int u,p;
            cin >> u >> p;
            u--;
            bitset<125> temp = all;
            temp.set(idx[p],0);
            bitwise_and_update(1,0,n-1,in[u],out[u],temp);
        }
        else
        {
            int u;
            cin >> u;
            u--;
            bitset<125> temp = bitwise_and_query(1,0,n-1,in[u],out[u]);
            cout << temp.count() << '\n';
        }
    }
}

int32_t main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    cout.tie(NULL);
    int tc=1;
    // cin >> tc;
    for(int ttc=1;ttc<=tc;ttc++)
        solve(ttc);
    return 0;
}

Each update type consists of two bitmasks x and y.
What are x and y ?
I used a single bitmask and was travelling the entire lazy bitmask of size 125 each time in the push operation and got TLE because it runs in O(125*(n + q*log(n))). How do you avoid this using just bit operations ? I understood the part that each update query has to either set or unset ith bit of all elements in [L, R].

Hey, bitsets have a constant factor of at least 1/32 or 1/64 (cuz internally, each 4 bytes of array can be turned to a single integer and then it’s a matter of doing integer1 & integer2 in O(1))

Yeah, I am aware of that. I think you misunderstood my question. What I am asking is what exactly are the bitwise operations doing in the editorial that are helping in not travelling the entire lazy array ? I want to understand that part and prove it for all the 3 × 3 cases, whatever they are (I suppose 3 cases for x per case of y).

Oh, I suppose I should’ve provided some intuition for that.

As mentioned in the editorial, the true update state is ternary for each bit (keep it the same, set it to 1, or set it to 0).
Representing three states requires two bits of information, which is what x and y are.
In particular, for a fixed bit:

  • x = 1 and y = 0 means “keep this bit the same” (notice that this is the default state as well)
  • x = 1 and y = 1 means “set this bit to 1
  • x = 0 and y = 0 means “set this bit to 0
  • x = 0 and y = 1 would also mean “set this bit to 1”, but also isn’t a state we reach naturally in this problem and so can be ignored.

If you have a bit b and update (x, y) the “apply update” step is b \to (b \ \& \ x) \mid y, and it’s easy to verify that the three relevant cases above are correct for it (i.e they operate on b as claimed).

The weird-looking “perform bitwise AND before bitwise OR” combine step keeps these meanings the same — you can try all 3\times 3 possibilities of combining updates and see that they end up as expected too.

How would this be solved if you ALSO had Type 4-6 which is exactly type 1-3 but on paths.
I’m asking if HLD can be combined with Euler tour in terms of updates.

It’s possible, yes.
See here for example.

Thanks for the clarification. I have finally proved it for all the 3×3 cases and got AC. This is a really good segment tree problem and a good editorial. The bitwise updates seem to be a very hard observation to come up with and I find it very non intuitive. Or do I have skill issue and one is supposed to come up with them on their own ? Are there some other good problems like this that I can practice?

Nice