LPARTY - Editorial

adurysk · April 13, 2015, 4:06pm

PROBLEM LINK:

[Practice][111]
[Contest][222]

Author: Pavel Sheftelevich
Tester: Sergey Kulik
Editorialist: Adury Surya Kiran

DIFFICULTY:

HARD

PREREQUISITES:

Maths related to functions, Backtracking, Bitmasks

PROBLEM:

You are given M sets with N variables(N <= 5) where each variable can be either true or false and you want to find some key subsets with minimal total length that cover all given sets and exactly them (no extra sets). All the given sets contain all the variables.

EXPLANATION:

Let’s define some terms first:

Argument:

An argument is a set of boolean inputs to a function.

Example: If there is a function with 3 variables namely (X, Y, Z) then (X = 1, Y = 0, Z = 1), (X = 0, Y = 1, Z = 1) are two examples of arguments.

Conjunction:

A conjunction is LOGICAL AND of several variables Xi or not Xi.

Example: If there is a function with 3 variables namely (X, Y, Z) then (X AND Y AND Z), (X AND Y), (X AND (not Y) AND (not Z)) are three examples of conjunctions.

Function:

A Function is a mapping from a set of inputs to a set of outputs. In another way it can be stated as a LOGICAL OR of a set of conjunctions. The set of outputs is {0, 1} and set of inputs contains the N people in our problem.

Example: ((X AND Y) OR (Y AND Z)), this function takes value 1 for the following arguments

(X = 1, Y = 1, Z = 0)
(X = 1, Y = 1, Z = 1)
(X = 0, Y = 1, Z = 1)

Now lets reformulate our problem in another words: You have a boolean function F on N variables and you are given M arguments where function takes the value 1 and on all other arguments it takes 0. We can see that each set(or key set) is some conjunction on N variables.

Let’s denote Xi with upper case letter and not Xi with lower case letter (like in the problem statement). You are to find a function equivalent to initial function, which is OR of several conjunctions with minimal total number of variables contained.

The most straightforward approach:

Generate all possible conjunctions. Each conjunction covers some arguments of the initial function. So the problem boil downs to minimal set cover problem with weighted covering subsets. The number of conjunctions on N variables is 3^N. So for N = 5, there are 243 conjunctions, and we have to check all 2^243 possible sets of conjunctions, which is impossible to pass in acceptable time even with best optimizations.

Here one of the key observations goes: many conjunctions are redundant - they are included in others, can be obtained as combination of some other conjunctions and some even doesn’t fit our function(they take the value 1 for arguments which F doesn’t have).

An elementary conjunction is called an implicant if F takes the value 1 on all the arguments which the conjunction takes the value 1.

So we want find group of implicants of more acceptable size than 243 and work only with them.
Let’s find them using the following algorithm:

Put all sets(arguments) where F takes 1 and while possible do the following

If there are implicants A, B and if the set of arguments covered by A is subset of set of arguments covered B then get rid of A.
If there are implicants A, B with form: A = xC and B = XC then A, B can be replaced by one implicant C.

After some iterations we will get group of implicants(prime implicants). The universal upper-bound of the number of prime implicants is 3^N/N. For N = 5 upper bound of prime implicants is 3^N/N = 48 - much better, but it turns out that maximal number of prime implicants for boolean function on 5 variables is 32(author checked all possible functions on 5 variables: 2^32).

So now we have cover set problem with 32 available sets. It’s well-known NP problem, but in our case we can make one more observation: the covering sets we have are a bit specific, more precisely, if we have many prime implicants then most of them will cover few arguments and reversely, if each implicant covers many arguments then there are few implicants.

That gives us an idea that recursive brute-force solution(with O(2^N)) can perform well if we use some optimizations.

Let us say that we have K prime implicants finally. Now lets have 3 arrays cost[1…K], covers[1…K], or_to_end[1…K].
cost[i] stores the length of i’th implicant, covers[i] is a bit mask which stores the set of arguments which evaluate i’th implicant to 1, or_to_end[i] stores bitwise OR of all covers[j] such that i <= j <= K.

Now take a look at the following recursive code for further implementations:

void backtrack (int pos, long long current_mask, int current_cost) {
    if (current_cost >= ans) // there is no meaning in going further because we already crossed our current answer
        return;
	if (current_mask | or_to_end[pos]) != required) // As we cannot cover all the arguments even if we take all the implicants from this position
		return;
    if (current_mask == required) {
        ret = current_cost; // We found a better solution
        return ;
    }
	if ((current_mask | covers[pos]) != current_mask) // if current implicant covers atleast one argument which was already not covered
		backtrack(pos + 1, current_mask | covers[pos], current_cost + cost[pos]); // then take this implicant and go to next one
    backtrack(pos + 1, current_mask, current_cost); // Go to next implicant without considering the current one
}

In the above psuedo code there are 3 key optimizations, all of them are simple if-break clause. With them recursive solution works very fast(several millions recursive calls in the worst case instead of 2**32) and without any of them solution becomes slower 3-20 times slower. Important moment of the solution is effective implementation, solution should use bit-masks(two ints for false variables and true variables in the authors’s solution) and optimizations in recursion. Without them it’s unlikely to pass.

One more major optimization we can do is to sort all the K prime implicants in the increasing order of their costs before calling the backtrack function, then we converge on the answer a lot faster in most of the cases. This is because of a small heuristic, that the smaller prime implicants cover more arguments, and also have less cost.

AUTHOR’S AND TESTER’S SOLUTIONS:

Author’s solution
Tester’s solution
[111]: LPARTY Problem - CodeChef
[222]: CodeChef: Practical coding for everyone

lebron · April 14, 2015, 5:19pm

And a bit different approach

Dynamic programming [3^n][2^(2^n)] is obvious - minimum cost to cover given mask of possible strings by using only first X possible sets. This solution is already good enough to solve first subtask.

How to handle second one? We have 3^5*2^32. A bit too much First let’s make 2^16 out of 2^32. Divide all possible sets in two groups - those which don’t have letter A/a and those which have. We have to cover all possible suffixes (without letter A/a), and for every suffix we have 2 possibilities - either it is covered by set without letter A/a, or it have to be covered in other way (then there should be two used sets for it - one with A and one with a).

Now we can solve 3 similar tasks - do a 3^4*2^16 dp for covering suffixes with sets containing A, with sets containing a, and with sets without A/a. Then try all possible variants of splitting strings between (A/a+) and (A/a-) groups and pick best one.

This solution should already be good enough to pass a single testcase within 1.2 seconds, but for 120 tests you have to optimize it a lot

Throw away dominated sets - there are no reason to take Ab or AB when you can take A. Throw away sets of size N - you can always take them in greedy way later, there is no need to put them in DP. Check only reached states of DP, there is no need to run a cycle over all 2^16 states every time. Sort sets by size and update DP with short sets first to postpone growing of set of reachable states.

With those optimizations my solution runs precisely 1.20 seconds on one of tests in subtask 2 But I guess it can be further optimized.

rogersands123 · April 14, 2015, 6:24pm

“During all these parties something had gone wrong. We can notice that when there is subset Ab(A was in good mood, but B was in bad mood) party is spoiled and the mood of friend C doesn’t matter”

What this lines means How it is violating any condition?

dreamplay · April 14, 2015, 8:51pm

Is this problem an application of Quine Mcluskey algorithm ?

mkrjn99 · April 14, 2015, 8:58pm

As far as the explanation goes, I don’t think this is any different from reducing a canonical SOP expression to reduced SOP form(correct me if I am wrong).

If my understanding is correct, tell me the answer for this case:


1
3 4
abc
abC
aBC
Abc

My program gives output 6 (reduced form is: ab+aC+bc). I downloaded a correct judged solution and it gives output 4. What’s the explanation for this case?

shubham011 · April 15, 2015, 3:44am

please give me some test cases where my solution give wrong answer… it should it be easy to find because I got AC only in 1 test case and I can’t find out where my solution give wrong answer

Admin please
http://www.codechef.com/viewsolution/6777820

mappinator · April 15, 2015, 5:15am

I just random 1024 times and choose subsets which can cover the most parties first.

However, it works.

bangga · May 2, 2015, 12:07pm

Terima kasih dan salam kenal.
codechef Link

code Link

shubhamelle94 · May 2, 2015, 4:17pm

link please

gonglinyuan · June 9, 2015, 8:58am

so nice…

dreamplay · April 14, 2015, 9:11pm

aC + bc is the right answer because ab is not needed here !!
But I m still not sure why this problem is not SAME AS k map SOP thing . OR is it ?

mkrjn99 · April 15, 2015, 12:16am

Why is ab not needed?

dreamplay · April 15, 2015, 1:50am

because ab represents abc ,abC …both already covered by aC-abC/aBC and bc-abc/Abc… hopefully u understood

mkrjn99 · April 15, 2015, 12:10pm

I got my mistake. Thank you!
Btw, my assumption is correct(canonical -> reduced SOP). The mistake is in my K-Map implementation.

dreamplay · April 15, 2015, 4:46pm

I have the same situation that i get wa on all test cases for N > 3 whereas i can`t find a single testcase where i got wrong :
I think we all(those who solved using Quine Mcluskey or k maps ) have some common mistake.
My doubt is that “THe minimum total size” and "Minimum number of total terms " .Is there any situation when both cases get different results ? bcz what we did was minimum terms and then count the terms…, Here is the link to my solution CodeChef: Practical coding for everyone

gvaibhav21 · April 16, 2015, 5:16am

for n=3, there are not much possibilities, and hence much less room for error, but for n=4,5, k-maps are much more complicated and most solutions crumbled in larger cases…