please share your approach to CPCOMP problem of OCT long challenge

pk301 · October 20, 2018, 6:56pm

Coprime Components. Please share your idea on how to solve this task for 100 points.

ducpro · October 20, 2018, 8:06pm

Got stuck too. I would love to know the idea behind it. Solved it by just connecting the 200 numbers with the largest degree to other numbers. Man, the tests are weak as hell.

lboris · October 20, 2018, 9:32pm

Pre-calculate all prime divisor for each number. (sieve)
For each number X get its simplest form (i.e. p1 * p2 * p3 instead of p1^a1 * p2^a2 * p3^a3) because for the current problem these are the same.
Remember how many of the same numbers are (after translation) (There can be maximum ~ 121K of such numbers)
Initialise DSU (disjoint union set) structures.
Break early if the array contains number “1” - as all other numbers will be just joined together into single group.
Allocate as required bit-set array for each prime (sufficient length to cover 121K bits) there are maximum <20K such primes so total memory ~ 256M max. The structure contains up to 121K/64 ~ 1900 long long ints. For each bit-set set i-th bit if the i-th number contains this prime.
Sort the vector of the numbers with less primes first , then sort by primes themselves.
Now run cycle from 0 to Nx(the new N which is <= original N) and for each number - get primes that are in this number and on the fly do OR of the bit-sets for these primes.
Run it for range i+1,Nx, but obviously /64 as each long long int is 64 bit
Skip any if the current qword is all 1s, If not - run bit mask to check which number is free of primes from x-ith. Then just do union of x and y
Break early if total number of unions is 1.
At the end count number of unions and for each union which is size 1 and not united to anything add its size (number of these numbers) minus one to the results. I.e. if there are 3 identical primes in the original array and they are not joined - then we have 2 more free groups to add. Report the final result.

Tons of opportunities for optimisation, but actually not required as the max time ~ 0.15 seconds.

I really liked the problem. Thanks to the author.

shoom · October 20, 2018, 9:49pm

On top of 30 point solution I explicitly checked if the sequence contains at least one prime number greater than one half of the largest number in the sequence. If there is such number then there is only one connected component. It helped to get 100 points. Not sure if this is by design or just because test cases weren’t strong enough.

For 30-point solution I converted each number in the sequence into the number with square free factorization. It helped to reduce the sequence size a bit. To loop through neighbors in DFS faster, I used prefix tree based on prime factorization.

furious · October 20, 2018, 10:06pm

If there is atleast 1 prime number in the input array , i will pick one prime number and also i will find all the numbers whose gcd with our current prime number is 1 , at the same time i will erase those numbers from the multiset (initially i inserted all the input elements into it) ! Now they will be in a single component , for the rest of the elements , they wont form components among themselves (because , they all are having our picked prime number as their prime factor ) ! Either they will join the earlier component (because there can be some integer in the formed component whose gcd with our current element is 1) or they will form a separate component each of size 1 ! This can be done by inclusion-exclusion in O(64) time (I mean , checking whether there is an element in the initial component whose gcd with our current element is 1) ! If there are no primes in the input array , i will just find the components using BFS !
Feel free to ask if you have any doubts !

algmyr · October 20, 2018, 11:18pm

Regarding the 121k in @lboris answer, and to answer @aryanc. You can easily verify this for yourself by just writing a program like

# O(n) prime sieve, computes least prime factor
lim = 200000
lp = [0]*(lim+1)
pr = []

for i in range(2,lim+1):
    if lp[i] == 0:
        lp[i] = i
        pr.append(i)
    for p in pr:
        if p>lp[i] or i*p > lim:
            break
        lp[i*p] = p

def factor(n):
    while n>1:
        yield lp[n]
        n //= lp[n]

def prod(A):
    p = 1
    for a in A: p *= a
    return p

n = 2*10**5
A = range(1,n+1)

# reduce to radical, i.e. remove prime multiplicities
A = set(prod(set(factor(a))) for a in A)
print(len(A))

which will print 121581.

wmoise · October 20, 2018, 11:55pm

You can reduce the number of primes this way. First, you need to use an O(n log log n) prime sieve.

If 1 is in the set, then there is one component.
Record the number of instances, n, of each number x. If in the end, the number is in a component of size one, then there are n components. If the number is in component of size > 1, then all numbers in the component contribute to a single component (the number of instances don’t matter).
Reduce multiple powers of a prime to one.
If a prime is either unique to particular number or functionally dependent on another prime (a prime always occurs with another prime), you can erase that prime from all the numbers. That is, remove the prime from each number that is a multiple of that prime and add all the instances of that number to the new number without the prime factor. (Be careful if two primes always co-occur; get rid of one)

2a) After erasing the prime p from number x, the new number becomes x’=x/p without the prime. If x’ is the only number left, then the number of instances of x’ is the final answer.

2b) If x’ becomes 1, then all the other numbers are in in the same component as x’ and the answer is 1.

Find all numbers that contain just one prime and toss them into a hashset. All these primes together belong to the same component C.

3a) If the number of primes is in the hashset is greater than 7, the final answer is one component.

3b) If the product of all those primes is greater the largest number in whole set, then the final answer is one component, because each number from the whole set will be missing at least one prime from the hashset.

3c) Go through all the other numbers. If that number does not contain all the primes in the hashset, then add that number to the component C.

3d) Later in the search phase when we compare numbers together, we can ignore all the primes in hashset, because we tested them already. You can also ignore any number that contained at least one prime from the hashset, because all the other numbers not in component C will have contained every prime from the hashset as a factor and there not be coprime.

After this, the number of distinct numbers has fallen sharply. We also eliminated a large number of candidates to pair off from step 3d. This is important because, if the test cases weren’t weak, there would be O(n^2) number of pairs that are coprime. So, testing neighbors would TLE even if using a prefix tree solution to prune neighbors. The number of valid neighbors would still be O(n^2).

We can take advantage of another optimization. The number of components–ignoring multiple instances of the same numbers–will be very small.

Then, you can build a disjoint set (union-find data structure). Although, I didn’t brute force, I suspect it would not TLE. The number of values to compare will be a tiny fraction of the original set. You can compare each number to a small group (~20) of numbers containing the least frequent primes. You can also AC by comparing each number to 20 randoms numbers like I did after grouping numbers by their lowest prime factor and not comparing those within the same group.

fjzzq2002 · October 21, 2018, 10:11am

Btw, the test data seems pretty weak. Some accepted solution will receive TLE or WA for the following data:

vector<int> ans;
void gen(int a,int b,int c,int d)
{
    for(int w=0;w<=200000;w+=a*b)
    {
        if(w%c==0||w%d==0) continue;
        ans.push_back(w);
    }
}
int main()
{
    gen(2,3,5,7);
    gen(5,7,2,3);
    gen(3,7,2,5);
    gen(2,5,3,7);
    gen(2,7,3,5);
    gen(3,5,2,7);
    printf("%d\n",int(ans.size()));
    for(auto x:ans) printf("%d ",x);
    puts("");
}

reedef · October 21, 2018, 9:05pm

Here is a solution with running time O(n log^4 n). It got tle but I believe with some optimizations it can pass the testcases. I will purposefully ignore off by one error in this solution:

Run a sieve to calculate, for each number x, its radical (x but without repeated primes), mu(x), which is the möbius funcion, and then compute all the divisors of x.
Replace each number by its radical
Lets define the following procedure p: given a set x of numbers, p(x) will find, for each squarefree integer < 2e5, a number in x coprime to it, or -1 if such number doesnt exist. This procedure will run in O(n log^2 n) time:
for each squarefree number m, make an array of size 2e5/m that stores, at index i, the number of multiples of m in x less than or equal to im. This can be done with prefix sums and because the harmonic sum is ~log n, this takes n log n time
For each number y, the amount of coprimes to y less than some bound t is the sum over all of its divisors d of mu[d] × arr[d][t/d] (see answer to this stackoverflow question for more details) , where arr[d] is the array we built at step 4. We can use this to find with a binary search, the smallest integer in x coprime with y in O(log n × amount of divisors of y). Since the sum of the number of divisors is n log n, this can be done for every number x in n log^2 n time. Thus we finished describing procedure p
Now lets do the following: we will do a number of steps in which every conected component which isnt yet complete will merge with another component. Its easy to se there will be at most log n steps until all conected components are complete.
In each step, we will split the conected components in two groups many times, such that any two components will appear in different groups at least once. This can easily be done in log n splits by iterating 0 <= e < log n and putting all conected components whose label has the e-th bit set into one group and the rest in the other
For each of these splits, we can use procedure p to find all conctions from every number to a number in goup 1, and then do the same to find all connections from all the numbers to a number in group 2, and use the union find datastructure to join the components connected by these connections
If we do that, at the end the union find datasrructure will have all the components. Care has to be taken to deal with equal numbers in the input, but that has already been handled in other comments

pluristiq · October 22, 2018, 11:25pm

@lboris why do you sort by number of primes, then by the primes themselves?

hars123 · October 23, 2018, 1:08pm

@furious__ can you please explain the part where there are no primes in the input array ,how you have managed bfs there.

lboris · October 24, 2018, 11:30pm

To @pluristiq and @pk301 in regard to sorting:

Your solution in Python is slightly wrong, instead of creating the mask as array of integers (or longs) you create it in python as one big number, that means that every time you do operation on entire set (effectively falling back to O(N^2)/2 ) , you also in the solution create the mask upfront, which is wrong again, you don’t need this extra run through the array. The comment space is too small, take a look at the solution https://www.codechef.com/viewsolution/20592710 specially understand the line: "if ((~bs) != 0) { " - that what makes it to skip most part…

In your python solution make mask a list of integer (32 bit each) rather then single big-integer. In this case you will see that most of the integers in this array are subject to skipping (otherwise you would very quickly finish with single group), the sorting helps to order it in such way.

xrisk · October 20, 2018, 7:32pm

Just wait… SnackDown 1A has a problem similar to it.

pk301 · October 20, 2018, 8:10pm

can you please elaborate

furious · October 20, 2018, 10:43pm

ohh ! I had hard time with this question bro , felt that the tc are really weird !

shoom · October 20, 2018, 11:02pm

Yes, I also separately checked for the existence of 1 in a sequence and whether all numbers are same (and not equal to 1). Otherwise when there are no primes > max/2, DFS based solution didn’t TLE. I tested using a sequence of all numbers 1 to 200,000 excluding all primes greater than 100,000.

algmyr · October 20, 2018, 11:19pm

@aryanc403 See my answer please share your approach to CPCOMP problem of OCT long challenge - #6 by algmyr - general - CodeChef Discuss

lboris · October 21, 2018, 1:07am

@algmyr - thanks for the nice clarification with example, just saw the comment, but you already answered it. @aryanc403 , I hope it is clear now. And yes, probably the most complex case could be constructed (without 1 and any primes > 100K or even 60K ), I’m also agree that the test cases were a bit weak, specially initially, and then later adjusted a bit.

algmyr · October 21, 2018, 1:38am

The sets you’re describing (triplets) will not be large at all. You can only continue until the product exceeds 2 \times 10^5. This means the smallest factor can’t be larger than around 60.

algmyr · October 21, 2018, 1:43am

One observation that severely limits things is that you can’t have too many unique primes among your numbers. If you have > 6 unique primes in your list you’re already forcing numbers > 2\times10^5 for things not to be connected.