Fast Fourier Transform, Graph Theory
You are given a tree with N nodes and exactly N-1 edges.
Find the number of pairs of nodes in the tree for which the length of the path (number of arcs between the nodes) between them is prime.
The problem actually asks for the probability that selecting two such nodes leads to a prime length path. But it is obvious that what is needed is the number of such pairs. The answer will just be
number-of-pairs / NC2
The explanation of the solution derives ideas largely from the Tester's Solution.
I have experience with setting / testing very few problems using this central idea in various contests (3 of them on CodeChef) and always, the test data seems to have been cracked by a different - no, I don't mean simpler, just different - approaches.
I say cracked, because these approaches might fail particular corner cases, which were not thought off during testing. I believe that generating the test data that would fail every cracky approach is very hard. Sometimes, an approach might not be possible to fail within the constraints.
This is probably the case in August Challenge. It would be interesting to know whether it was possible to fail the alternate approaches, or not!
ignore the prime numbers
The problem asks you to find the number of pairs at a distance of prime numbers. But, there are no elegant properties that you can use to be able to simplify the problem. Our objective will remain simpler.
In the end, we will only add up the number of nodes at prime distances
divide and conquer
Now, for each child c of v, we know how many decendents are at each distance. We can update the values to be stored at v by adding 1 to the values from child c. Of course, values at v will be composed using all the values from all the children of v.
This takes care of all the paths that start at a node and always move downwards, towards the decendents.
For all the other paths, we can do a convolution of the arrays of values stored at each child of v. Of course, the fastest way to convolve is using Fast Fourier Transforms for polynomial multiplication. This idea has been explored in the July Challenge problem FARASA as well.
The given tree is not necessarily a binary tree. If we try to find a split of nodes in such a tree, we may end up with a split that selects only O(N / D) vertices in one set; here D is the largest degree of any node.
This would mean that our divide and conquer may perform too many FFTs in the worst case. This is highly undesirable.
If a node has more than 2 chilren
In the final tree, the meaning of distance will be replaced by the sum of the costs of edges, instead of the number of edges.
Now, we can always find a split that selects at least O(N / 3) nodes in one set.
The complexity of the solution is O(N log2 N).
But, the constants for this solution are large. The Tester's and Setter's solutions based on the described approach are still slower than several solutions submitted during the contest.
Can be found here.
Can be found here.
This question is marked "community wiki".
asked 13 Aug '13, 01:56
I did it using DFS and flattening of tree, with complexity of N*P, where P is the number of primes till N.
Here is the explanation to my algorithm:
Devising the algo:
Looking at the data, we see that N is a maximum of 50,000. Also, Let us denote the number of primes up till N with P. P can be a maximum of 5133. My algorithm is such that it will visit each of the N nodes a maximum of P times. If the tree has branches, the maximum depth or the levels that the tree will have, will be less than N. Otherwise, if tree is without branches, it will have N levels. So, on an average, the algo has complexity less than N*P, as worst case isn't really attained.
First of all, I take an array PRM, which stores primes up till 50,000. Let us include 1 in the list as well. So, PRM1=1. PRM=2. PRM=3. PRM=5.....PRM=49999. Next, I take an array NOP, in which NOP[i] stores the index of that particular prime which is greater than or equal to i. So, NOP=4 as the prime just greater than 4 is 5, and 5 is indexed at 4 in the array PRM.
Now let's take the following tree:
My algorithm roots the tree at node 1. Next I do a recursive DFS till I reach the leaves of the tree, i.e., 9, 6, 7, 8, 4. So, now I take a bottom up approach. For the leaves, I just take a new array(vector) of length 1 and put the value 1 in it. Then I return this array. For the other nodes, say node 5, my DFS call to node 9 has returned this me an array of length 1. So I extend the array to length of 2, and add the value 1 to it. So now the array contains: 1 1. I added 1 because 5 is a single node at this level. Now I reach node 2. My recursive call to node 5 returned me an array 1 1, call to 6 and 7 returned arrays of unit lengths 1 containing 1. I take the longest array, i.e., the one I got from call to 5, and extend it by 1 unit and add a value 1 to it. So the array looks like 1 1 1. Now I take the other array returned from call to 6 and for each element in this array, I check how many nodes are at a prime distance from it in the array 1 1 1. You can visualize it as if I have placed the two arrays side by size to look like this: 1 1 1 - 1, where the '-' demarcates two arrays. So, the 2nd '1' in the first array is at a horizontal distance 2 from it. Thus, I increase my count of total nodes at a prime distance by 1. Next I join the two arrays and the final array looks like 1 2 1. The second element became 2 because node 5, 6 are at the same level. So I merged the two branches. Now I take the third branch and place it horizontally with the previous array. It now can be visualized as: 1 2 1 - 1. So two node are at a horizontal distance from the only element in 2nd array. So I add 2 to my count. I merge the two branches to get 1 3 1. I have no more branches so I return this array.
This is how my algorithm calculates number of pairs at prime distance. The purpose of storing NOP array is that when I am checking for horizontal distance say in such a case, where first array is 1 3 1 2 1 and second is 1 1 1 1 2. So, when I reach element 4 of second array, I only need to check for number of nodes at distance more than 4. Hence, I need to know the prime just larger than 4. This saves several operations.
Hope that helped.
This answer is marked "community wiki".
answered 13 Aug '13, 04:41
Hi, great question indeed! I also solved it using fft, but can you please describe the other method?
answered 13 Aug '13, 02:44
hey can anyone give a link to a good tutorial on fft..
answered 13 Aug '13, 13:11
In this constraints it can be solved in O(n^2). But there are two tricks, first - when we merge subrees, we should merge small subtree to big one to avoid MLE. second - we compute these values in reversed order, so we can update subtree by adding 0 to the end of vector. The worst case for my solution is a chain, and root is in the middle of it. But it's complexity can be decreased to O(nk), where k is the number of intersting numbers. http://www.codechef.com/viewsolution/2454539
answered 13 Aug '13, 17:03
I also used FFT to convolve subtrees. The important thing in my code is not to use an arbitrary node as root, but use a root which minimizes the largest subtree (such a node can be found in O(n)). There always exists a node such that the largest subtree has at most ceil(n/2) nodes. This way, one can guarantee O(n log2 n) time.
answered 15 Aug '13, 02:01
dfs approach seems much easier than FFT........
answered 20 Aug '13, 00:14
Can anyone give a link to FFT tutorial, in english,short and clear ?
answered 05 Sep '13, 13:49