PROBLEM LINK:
Practice
Contest: Division 1
Contest: Division 2
Setter: Aleksa Plavsic
Tester: Hussain
Editorialist: Taranpreet Singh
DIFFICULTY:
Medium
PREREQUISITES:
Precomputation, Bitmask, Dynamic Programming, Combinatorics.
PROBLEM:
Given a tree of N nodes, Each node can have either 0 or 1 coin. Find number of ways to select nodes to put coins such that GCD(X, Y), where X, Y represent number of coins in subtree of any two disjoint subtrees.
QUICK EXPLANATION (maybe quick but cannot be complete)

BitMasking on prime numbers up to $N/2$ is done for building DP states.

Dynamic Programming state is represented as ($u$, $X$, $M$), representing Number of possible ways to place $X$ coins in subtree of node u, such that mask all the subtrees are divisible by primes represented by mask $M$.

DP state is calculated from leaf to root, Every combination of mask is checked and number of ways are added accordingly (involving a number of tricks as explained below).

Final answer is represented as sum of all possible combinations of Number of coins and bitmasks corresponding to root.
EXPLANATION
This is going to be a long editorial, so, be prepared. Here we go!!
First of all, to handle GCD constraint, we need to keep information of all the primes, which have been used in some subtree. We can keep this information in terms of primes which divide any of the subtree.
There are 19 primes between 1 to 70, but we ask ourselves, do we need all this information?? Where would we use the information that a subtree is divisible by 37, when there cannot exist another disjoint subtree having size divisible by 37, as it would require atleast 75 nodes in tree (not 74).
So, we need only primes upto N/2, and there are 11 primes. Let’s create a bitmask over all these primes.
If ith bit of mask is on, it implies that current subtree or subtree of any descendent of current node has number of coins divisible by ith prime.
Now, If i say a mask of subtree of node, then it should be understood that it means that subtree of node, or its descentants’ subtrees are divisible only by primes in this mask. This definition would be used in editorial.
Now, move towards Dynamic Programming.
I’ll Discuss the slow approach first and then woudld see how to optimize it to fit within Time Limit. Time complexity analysis has been given at the end.
Let us define two 3D lookup tables, ANS[i][j][mask] and DP[i][j][mask], defined as follows.
ANS[X][j][mask] denote number of ways to put excatly j coins in subtree of X including X, such that the mask of subtree of is mask. This table stores final answer for every combination of (Node, coinCount, mask).
If node X is leaf, we can either put the coin in subtree or keep it empty. But ways are represented as ANS[X][0][0] and ANS[X][7][0] respectively. Assign these two to 1 and rest will remain 0. (0 or 1 cannot be divisible by any prime).
If node X is a nonleaf node, we use Dynamic programming to find out number of ways to put j coins in subtree of X excluding X. The DP state can be represented as
DP[i][j][mask] is defined for a node X, the Number of ways to put j coins in subtree of First i children of X such that the mask of nodes is mask.
DP[0][0][0] = 1 Because we can put 0 coins in first 0 children of node X such that their mask is 0. i.e. BitWise_AND(mask1, mask2) == 0
Transition in DP table would be represented by moving from ith child to (i+1)th child as follows.
DP[i+1][coins+add][mask1mask2] += DP[i][coins][mask1]*ANS[child][add][mask2] if mask1 and mask2 do not share any prime.
For all combination of (i, coins, mask1, add, mask2) for each child of X. i is just Child number in 1based indexing and coins is sum of subtree sizes of previous i children.
Using this DP, we will get final states in DP[outDegree(X)][j][mask] representing Number of ways to put j coins in subtree of X exculding X such that their mask is mask.
After above DP, we need to update ANS table with answer for node X for every mask, handling cases whether Coin is put at node X or not separately by iterating over each combination of (number of coins, mask) covered in DP[outDegreee(X)] and updating answer and mask carefully.
This solution will get TLE result. Time complexity of this solution has been discussed at the end.
Optimizing above solution
First of all, if, For any combination of (i, coins, mask1) if DP[i][coins][mask1] is zero, we can directly skip these without iterating over (add, mask2)
Second thing, In case we have a combination where add is divisible by any prime with bit 1 in mask1, it would definitely voilate the GCD constraint, so we have to skip it.
The most important thing is, the BitWise_AND(mask1, mask2) == 0 constraint. At present we are iterating over 2^{11} * 2^{11} combinations, which dominates the solution complexity and is the main cause of TLE.
Can we think about any way to directly iterate over only such combinations satisfying above constraint so that to reduce above time.
Thankfully, there exist a method by means of which we can directly iterate over submasks of a mask. We can utilize the fact that all values of mask2 satisfying BitWise_AND(mask1, mask2) == 0 are submasks of XOR(2^{11}, mask1). This results in O(N^2 * 3^{11}) time complexity which shall easily pass within time limit.
Click to view
Let mask1 and mask2 be the two given masks.Above technique is applied as
supermask = ((2^11)1)^mask1
for(int mask2 = supermask; true; mask2 = (mask21)&supermask){
//Do your work with current mask2
if(mask2==0)break;
}
Though following trick isn’t necessary, if can come in handy if memory is a constraint and occurs quite frequently in Dynamic Programming problems.
In above case, for DP table ith transition ith, we need information of only (i1)th layer, so, we can reduce memory consumption of DP table by a factor of N.
Time Complexity Analysis:
For First Solution, For every node, we iterate over every combination of masks, taking 2^{11} * 2^{11} = 4^{11}. We repeat this procedure for OutDegree(i) times pref[i1, seemingly an O(N^2) structure, implying around O(n^2*4^{11}) iterations per node, but the summations of Out degree of all nodes is bounded by N, so overall time complexity is amortized to O(N^2*4^{11})
For optimized solution, For Every mask having j bits 1, we iterate only over the masks which have those j bits off. Specifically, for every mask with j bits on, we iterate only over 2^{(11j)} masks in worst case. We have exactly C(11, j) such masks.
So, total complexity of iterating over masks become \sum_{i=0}^{11} C(11, j) * 2^j which is just the binomial expansion of (1+2)^{11} = 3^{11}.
The Final Time complexity becomes O(N^2 * 3^{11}).
I know that the editorial is long and problem is compliacated, so feel free to ask any doubts, queries or point out any mistakes.
AUTHOR’S AND TESTER’S SOLUTIONS:
Editorialist’s solution (Commented Code)
Until the Codechef admins link solutions here, Refer the commented Solution here.
Feel free to Share your approach, If it differs. Suggestions are always welcomed.