CHEFARMY - Editorial

vijju123 · March 1, 2020, 8:52pm

PROBLEM LINK:

Div1
Div2
Practice

Setter- Alei Reyes
Tester- Encho Mishinev
Editorialist- Abhishek Pandey

DIFFICULTY:

HARD

PRE-REQUISITES:

The optimal solution uses Faigle Kern algorithm from the linked paper (Read upto page 7).

But due to the unique format of this problem, the pre-requisites by and large are simply:

A general Introduction to Graph Theory , BFS and DFS , Concepts on Linear Programming basic theory on Partially Ordered Sets

PROBLEM:

You are given a tree representing N soldiers. A soldier u is afraid of another soldier v if node u lies in subtree of v. There are also M generals and you also given if a general hates a soldier or not.

Now, you need to give each soldier i his salary S_i by following method-

Choose number of days D by which you will give everyone his salary. 0 \leq D \leq N^2
For each day, chose an integer K and call K soldiers. You then decide another number R and hand each soldier R coins.
If any general hates any of the K soldiers called, they become mad. If say, G generals are mad, their angriness increases by G*R

Find best method to pay salary such that sum of total angriness is minimized.

QUICK-EXPLANATION:

Key to AC- While an optimal solution is possible using the research paper linked, the general trend of making equations (the exact equation depended on Participant) and using Linear solvers to solve them dominated the top scores.

The optimal solution was to obviously apply to Faigle Kern algorithm given in this research paper. But the problem has a very special format - and that allows us to discuss a lot more than just the optimal solution.

Two of the high scoring techniques were:

Interestingly, solving the problem without any regard to angriness of generals. This approach just decided to try to maximize the soldiers called on a single day and got \approx 75-85 points.
Some of the top scoring solutions used Linear Programming. They formulated the problem of which group to call in what order as a set of linear equations and solved them.

EXPLANATION:

The optimal solution is actually very verbosely given in the research paper, along with all the required proofs. Hence, we will not discuss a lot about that because that will make the editorial unnecessarily verbose and boring. Instead, the plan is to briefly discuss the following:

The interpretation of constraints and test generation plan.
Some notable observations made by top-scorers.
Discuss the optimal solution briefly.
Analyze Contestant’s solutions

There will be no separate section for solutions - I will mention solution links along with the described approaches.

Interpretation of Constraints and Test Generation plan

This section is the first key to making observations which can make you one of the top scorers . And this applies to all approximate/challenge problems!

First note that we are given a generous 7 seconds for 50 test cases. Not bad actually! N is pretty small with a value of 128 and M is merely 16. The most interesting constraint however, is that of S. S_i \leq 100 !!! Thats very lenient because it allows a lot of approaches to open up - primarily due to the fact that now \sum S_i \leq D \leq N^2 is possible.

The next interesting thing is in the output section. It mentions that " For each valid i, let the total amount of money received by soldier i be W_i. Your answer will be considered correct if |Wi−Si| \leq 10^{−6}"

This is the BIG hint that some sort of equation solving method is intended to yield high score - as (somehow) it allows us to give R coins where R may not be an integer. This also highlights the fact that the equations will have to deal with floating points - which makes it a wise decision to use some foolproof library instead of writing your own code to avoid loss of precision etc.

The test generation plan is very simple.

S_i are randomly selected - hence the expected value of S_i is (1+100)/2=50.5. Hence, the expected value of \sum S_i=50.5*N=6464. The standard deviation of a uniform distribution, as proved here is \large \frac{b-a}{\sqrt 12}. Using formula’s to find standard deviation for linear transformation (because we want standard deviation for \sum S_i or N times sum of random variable S_i), we realize that standard deviation is \approx 684. Hence, by using Chebyshev’s Theorem, we can conclude that 95 \% of our data will have \sum S_i in range 6464 \pm 3*684. This means that even if we give a soldier 1 coin daily, we will be able to give everyone salary. This observation opens up possibilities of new approaches for you.
The tree is biased-random. A LOT of things are given away by this. Firstly, the expected depth of such a tree is always O(LogN) , which can be proved in a similar way as proved here. You can do a lot of heuristics - based on degree of node, exploiting the “structure” of such random trees and what not! Why do I call it biased random? Well, there are some biases to exploit in this generation plan - like, see for how many nodes Node 1 is a candidate for being the parent. From my experience, many people would analyze the trees generated by this plan to exploit any such “common-structure” in them.
The next two constraints tell that we have both cases - where generals hate very less soldiers and where they hate majority of soldiers. You will need to handle both cases carefully to get a top score.

Some notable observations made by Top Scorers

The most important observation for this problem was the partial sub-modularity of the angriness function. Let us always give R=1 coins each time we call the generals. Let F(X) denote the angriness function if the set X of soldiers is called. F(X) being sub-modular means that:

F(A)+F(B) \geq F(A \cup B)+ F(A \cap B)

Lets understand it with an example. Say we have 2 soldiers, 1 and 2. Forget about the part that one of them would be afraid of other for now. Say we have 4 generals, 1,2,3,4. Generals 1,2,3 hate soldier 1 and Generals 2,3,4 hate soldier 2. Assume they both need to be given only 1 coin.

Now clearly, F(1) = 3, F(2)=3. Also, F(1 \cup 2)=4 and F(1 \cap 2 = \phi)=0.

A formal proof is given at the bonus corner. This proof is also necessary to convince yourself that the optimal solution using Faigle Kern algorithm is correct.

The next observation is a bit indirect - we realize that the relationship between soldiers is that of a Partially Ordered Set (POSET). To see why, just replace “fears” with Comparable. A soldier u is comparable to soldier v iff they have an edge between them which represents some sort of ordering or ranking. Else they are not comparable.

Now everyday, we need the list of soldiers such that each of them is incomparable to everyone else. Such a list of soldiers (or nodes in POSET) is called antichains. Pretty fancy name for a simple concept - but knowing the formal name of what you are looking for helps.

The next observation is, there is no extra overhead for calling a soldier multiple times! Hence, if you call a soldier and give him all his money at once, or call him multiple times and give him money, the angriness of generals increases by same amount, which would be G*S_i.

So does this mean brute force is optimal? Certainly not! You can always chose a set of soldiers such that most of them piss off a common General and hence optimize the overall angriness. And there are exponential possibilities on how we can find the “optimum set” of soldiers to call each day. This was the defining line on why same algorithm or idea got different points for various submissions.

The other observations related to Constraints and Test-Generation plan have already been discussed, and we are now ready to discuss what interesting approaches others used!

Brief Note on Optimal Solution

Before we delve into what other contestant’s did, we should definitely check out at least the basic idea of what the optimal solution was - even if its given out verbosely in the paper.

Faigle Kern’s algorithm is a greedy algorithm. The paper first defines closure of a set and uses it to define extreme points of a set. It then proceeds as follows:

At each iteration, consider all extreme-points of the set. Find the minimum weight/value (in our case, coins to give) with respect to that point (in our case, soldier). Let this value be w_e corresponding to Node e.
Add w_e to values (i.e. give w_e coins to all soldiers) which are extreme-points of the set.
Remove the chosen Node e.

This is a very high-view of the algorithm, because you will note I did not go into what is our “set” or how do we find the extreme-points of this set. That part is completely mathematical and has to be seen from the paper itself.

Analyzing Contestant's Solution

All scores mentioned in the section are from Div1 unless mentioned otherwise.

The trivial solution of calling one soldier daily and giving him his required wage got 11.5 points in Div1. A reference solution is here.
Some contestants optimized above by calling more than one soldier if they all pissed of the exact same set of generals. This gave them a slightly higher score than brute force. Another optimization was calling 2 soldiers whenever possible (reference solution here ).
A majority of the solutions in Div 1 used simple BFS to score 54.7 points. What they did was to observe that soldiers at same level in the graph cannot be afraid of each other - as they cannot lie in subtree of each other! Hence, they would constantly call the entire level of soldiers and distribute money until each soldier has got his needed amount. Then they’d proceed to the next level. Reference Solution - here
Slight optimizations to above were possible which led to slightly more scores. One such optimization was that while processing a level, if a soldier is done, see if you can include his parent in set or not & so on.
Then there is a distinct block of solutions getting 79.4 points. The chief difference between these solutions and previous discussed solution is that - instead of just checking soldiers at a level, this solution keeps a list of all soldiers. It then sorts the list in increasing order of S_i. It chooses the front element of the list, removes all its superiors from the list, and then makes leftover soldiers report along with first element (soldier) in the list. This solution is more optimized because it increases number of soldiers called - which means more chances of soldiers which piss common general getting called, hence more optimization. You should also refer to @david_s 's answer here as he also explained what he coded. Don’t forget to hit a like there!
The difference between this and next set of solutions was on exploitation of calling soldiers which make a common general angry. One trick that these solutions did was to replace node u with a chain of s[u] nodes and started deleting nodes from leaves. A solution which simply deletes leaves (without the chain trick) got 88.2 points as seen [here] (CodeChef: Practical coding for everyone) .
The chain trick was more of a convenience - because with R=1 for every set of soldiers called, your sole focus is to minimize number of angry generals. One of the methods where this trick was used was to combine the BFS trick (calling soldiers at each level) with this chain trick to increase soldiers with common-generals. Reference Answer: here .
Another way to chose “optimum” or “almost-optimum” set of soldiers such that number of angry generals is minimized was to model this selection of soldiers as a system of linear equations. How and what insights or constraints you use to model them primarily determined the score. For example, you do not compulsorily need to model equation for the exact problem as given - you can use suitable relaxations or self-defined additional constraints if it makes your life easier (except that its a tradeoff between quality of equations and their feasibility).
These linear equations can be solved with Simplex or Gauss-Jordian methods.

I will end the editorial with some cool top-scoring solutions which I feel are friendly enough to understand:

@he_____he 's solution here
@wrong_answer_1 's solution here
@zhouyuyang 's solution here
@utaha 's solution here
@nagrar0k 's solution here
@cjy2003 's solution here

I’d also like to invite everyone to share there approaches as well

CHEF VIJJU’S CORNER

Proof on sub-modularity

Assign each soldier a number, such that the i'th bit of this number is set if that soldier makes i'th general angry. Say we have two soldiers with numbers assigned as X and Y. It is easy to see that total number of angry generals if both of them are called is nothing but number of 1's in binary representation of X | Y , i.e. (X bitwise-OR Y). Its easy to see this holds true for even when there are more than 2 soldiers.

Say we have 2 sets of soldiers, A and B. We already set R=1, hence making angriness function for a day equal to number of equal generals.

Lets say F(A) = N_1,F(B)=N_2 and there are K generals common which get pissed off if soldiers from either set A or B are called.

LHS=F(A)+F(B) = N_1+N_2

Now the RHS = F(A \cup B) + F(A \cap B)
= (N_1+N_2-K) + F(A \cap B)

Now, we know that F(A \cap B) can have an at most value of K, because in maximum-case, all soldiers which piss off common generals are all present in F(A \cap B). But in case any of the soldier is absent, then F(A \cap B) will be less than K. This forces RHS to be less than N_1+N_2.

Hence, proved.

A trick to conveniently represent what generals hate a soldier

Wondering what that mask thing is about in the top solutions? They chose a very well method to represent which generals are angry with a soldier! We will call this “mask” of soldier - a number which conveniently represents which generals hate that soldier.

Since number of generals are 16, we can take any data-type which has more than 16 bits, like int. Initialize the mask for all soldiers as 0. Now, write it out in binary form such that:

The i'th bit of the mask is set iff the i'th general hates that soldier!

Why is it good? Because now you can use bitwise operators (AND, OR) etc. to do speedy checks. For example, the number of generals which get angry if to soldiers A and B are called is given by number of 1's in binary representation of A | B. Similarly A \& B gives us number of common-generals pissed off by them.

Hence, this representation provides you with mechanisms to make your life, implementation easier and also save some of that time for more computation!

Setter's Notes

The relationship X is subordinate of Y is a POSET
Note that every day we chose an antichain in that POSET
The angriness function is submodular, that means that
f(A)+f(B) \geq f(A \cup B) + f(A \cap B), where f(X) is the number of angry generals if soldiers from set X are called
The optimization problem is :
- minimise { \sum f(X)*R(X), for every antichain X}, R(X) is how much each soldier is paid
- subject to sum {R(X), for each antichain X containing x} = S_x, i.e soldier x should be paid Sx
For special kind of POSETS this optimization problem can be solved using the Faigle-Kern algorithm which involves
- finding a special topological sort of the tree
- solving a system of linear equations.

The problem is very technical, that is why I decided to use the challenge format.

However with the given test case generation (and definitely not the dark powers of Sauron bestowed upon me eons, eons ago), I expect many solutions that use approximate algorithms with near-optimum scores.