CHAORNOT - Editorial

pragrame · June 19, 2013, 6:15am

Problem Link:

Practice

Contest

Difficulty:

Challenge

Pre-requisites:

local-search

Problem:

Given a set of numbers, find a subset of maximum size that does not contain an arithmetic progression.

Explanation:

Let us consider maximal AP-free subsets. By maximal, we mean that adding any more element into the set would cause it to form an AP. We would now like to perform a local search over such sets.

In order to define a local move, we should have a good neighborhood relation. For this, we define a “ready” number as one which is not in the set, but when added to it would cause only one AP to form. With this in mind, a natural local move would involve adding a ready number, removing one of the two other numbers that formed the AP, and then trying to make the set maximal.

The next thing to look out for, is to try to incorporate a metric on our local search - some way of knowing we’re doing better. Here, from a set of size M, we will always move to sets of size M, and then try to increase their size. So merely having the size of solution as the metric is vague. Setter uses Number of ready numbers as the metric, so that we have more options to move to a better solution.

Finally, use simulated annealing with the above metric.

In order to find a good initial solution, setter used the following approach.
Choose a base B, and write all the numbers in base B - lets say they have k digits.
Now, let each digit be a dimension in space, and then write map the numbers onto “Spheres” of that space - here, consider only numbers whose all digits are <= B/2. Then, taking all numbers of any sphere will be a valid solution, since the sum of any two such numbers, divided by two, will be a point outside this sphere.

This solution gave Setter a score of ~5.2 - while the contest’s best was ~5.8.

Setter’s Solution:

Will be updated soon.

Tester’s Solution:

Can be found here

skbly7 · June 19, 2013, 4:15pm

see my submission once : CodeChef: Practical coding for everyone
I was getting correct answer and it got AC too but i dont know why i got 0.69 points only.
I am new to this scoring system, please help.
if possible can i get a test case for which my program is getting lesser number of output.

cyberax · June 20, 2013, 1:03am

@kutengine: please come to save us all !

mugurelionut · June 20, 2013, 4:27am

I can explain the general idea behind my solution and @evgentu’s solution, after reading his code (the solutions with the 3rd and 2nd best score):

Find a good solution using a greedy approach
Improve the solution using randomization as long as time permits

In my solution:

I started with the whole set of numbers and repeatedly removed the number which is part of the largest number of arithmetic progressions until we get a set of numbers without arithmetic progressions. Then, I tried to add back to the set the excluded numbers in ascending order (if they did not produce any arithmetic progressions). After optimizing and tuning this approach I could get a score of about 5.26 (but it seemed very difficult to advance further this way). Note that when the number of elements N is large I couldn’t use this solution directly, as its time complexity is O(N^2) - instead, I “approximated” it somehow via repeated random sampling and greedy filtering. Moreover, I also initially filtered out many elements from the beginning which I noticed were never selected by this greedy solution.
While time permits: remove a certain number of randomly chosen elements from the best solution. Try to add the other elements to the solution in ascending order as long as they do not generate arithmetic progressions. Try to add back the removed elements (in case some of them can be added back). If the new solution has more elements, then keep it as the best solution ; otherwise just ignore it. I could get all the way to 5.6577 with this randomized improvement approach.

In @evgentu’s solution (based on my understanding after reading his code):

He tries to add the numbers to a solution considering multiple orderings (he splits them according to some blocks, etc.). It should be clear that the optimal solution can be found this way as long as you add the numbers to the solution in the correct order. A number is added to the solution only if it does not generate arithmetic progressions with the previously added elements.
He starts from the best solution found this way and improves it via randomization. He removes a certain number of elements from the solution. Then he tries to add the elements which were “freed” by the removed elements to the solution, plus the removed elements (as far as I can tell). Thus, his solution is at all times a maximal solution (just like mine). The best solution found so far is the answer. A difference between his approach and mine is that he continues from the newly obtained solution even if it is not better than the best solution found so far (while I discard it in this case and I always try to improve the best solution found so far, rather than the current solution).

In case I made any mistakes explaining @evgentu’s solution, I invite him to correct me.

As far as I can tell, @kutengine’s solution is a combination of:

recursively computing (and then combining) solutions for sets of numbers with the same value modulo 3.
when the number of elements left is not very large he seems to try a combination of greedy with randomization (for instance, his large_first_del function seems to repeatedly remove the number which is part of the largest number of arithmetic progressions, choosing randomly in case there are multiple numbers which are part of the same largest number of arithmetic progressions).

I would also like to ask @kutengine for a more detailed explanation of his solution.

kuruma · June 20, 2013, 4:58am

Hello @mugurelionut, I just wanted to take some time to say that I truly love reading your posts and you explain your hard insights very clearly so thank you for your great explanations (you have a fan here )

I would also like to know if the concept of DFT could be used here, as it seems to be used on this problem, COUNTARI.

Obviously, on this problem the idea was to count the number of AP and not the opposite, but I am wondering if with some sort of “Inclusion-Exclusion” principle this technique could be used…

I haven’t solved the problem although that was the only thing it came to my mind, even if I didnt know how to implement it properly…

Best regards,

Bruno

sumanth232 · June 19, 2013, 11:16am

pls provide any other links which explain this approach more explicitly

sumanth3232 · June 19, 2013, 11:19am

very nice explanation

kcahdog · June 20, 2013, 12:10am

@skbly in this scoring system the best solution gets 1 pt and the rest the fraction of the performance of their solution compared to the best.if best solution gives 10k o/p and u generate 7k, then u will get around 0.7 points. As input range is upto 100000 your code must be generating lesser no of o/p’s in these ranges. So the only thing you can do is compare the o/p of your program and some good coders program at higher values

robertking · June 20, 2013, 3:10am

I found the explanation clear, concise and informative. There is a link provided for simulated annealing. For a more basic approach which gets a lower score, you could check this link: http://discuss.codechef.com/questions/13864/discussion-of-charornot?page=1#13984

samjay · June 20, 2013, 5:06am

@mugurelionut How long does your first step take? Maybe I should take a look at it. I didn’t use random this time, since I had some bad experience with it last month =), I wonder if my solution for the 2nd step would improve it.

mugurelionut · June 20, 2013, 10:56am

@kuruma: It seems that DFT can indeed be used here in order to count at least some part of the arithmetic progressions. From what I understood from COUNTARI’s editorial, the DFT solution can be used for computing in how many 3-term arithmetic progressions we have x as the middle element (for each number x). However, the greedy algorithms I mentioned need to also count the cases when x is the smallest and the largest element in the progression. Anyway, perhaps for large values of N this initial counting could have been useful somehow (if you could find a good algorithm to use it).

kuruma · June 20, 2013, 2:33pm

I see Thank you very much for your explanation!

fmm · June 21, 2013, 9:16am

Yep, I’ve used the DFT to calculate the frequency of the middle number, then after sort these numbers by this frequency I did insertions greedily and plus some more ideas (divide the number in modular groups, test some orders)… I got 5.0 as best score.