Answers to: TWOROADS - Editorialhttps://discuss.codechef.com/questions/23742/tworoads-editorial<h1>Problem Link:</h1>
<p><a href="http://www.codechef.com/problems/TWOROADS">Practice</a><br>
<a href="http://www.codechef.com/SEPT13/problems/TWOROADS">Contest</a></p>
<h1>Difficulty:</h1>
<p>Hard</p>
<h1>Pre-requisites:</h1>
<p>High School Geometry</p>
<h1>Problem:</h1>
<p>Given a set <strong>S</strong> of <strong>N</strong> points in plane, you have to draw two lines such that sum of squares of distance of each point from nearest line is smallest.</p>
<h1>Explanation:</h1>
<p>The first thing one realizes about this problem is that the business of taking minimum of the distances is causing all the trouble. If we had to draw a single line to optimize sum of squares of its distance from all points, it would be a piece of cake. High school calculus can be used to find the optimum equation of the line. Here is a brief sketch:</p>
<p>Let the equation of optimum line be <strong>y +mx +c = 0</strong>. The square of distance of a point <strong>(x0,y0)</strong> from this line is <strong>(y0 + mx0 + c)<sup>2</sup>/(1 + m<sup>2</sup>)</strong>. Therefore, sum of squares of distance of all points is</p>
<blockquote>
<p>(Y2 + m<sup>2</sup>X2 + c<sup>2</sup> * N + 2 * m * XY + 2 * Y * c + 2 * m * X * c) / (1 + m<sup>2</sup>)<br>
where Y2 = sum<sub>i</sub> y<sub>i</sub><sup>2</sup>, X2 = sum<sub>i</sub> x<sub>i</sub><sup>2</sup>, XY = sum<sub>i</sub> y<sub>i</sub>x<sub>i</sub>, X = sum<sub>i</sub> x<sub>i</sub>, Y = sum<sub>i</sub> y<sub>i</sub>, N = sum<sub>i</sub> 1</p>
</blockquote>
<p>We need to find optimum <strong>m</strong> and <strong>c</strong>. To do this, first differentiate with respect to <strong>c</strong> to get <strong>c</strong> as a (linear)function of <strong>m</strong>, plug it in the above, and differentiate with respect to <strong>m</strong> to get optimum value. Refer to solutions at the end for more precise details. Assuming that the one line case can be solved in constant time, given <strong>X, Y, XY, N, X2, Y2</strong>, we can now proceed.</p>
<p>Now imagine the optimal pair of lines. Let the lines be named <strong>A</strong> and <strong>B</strong>. Every point in our set <strong>S</strong> is closer to exactly one of <strong>A</strong> and <strong>B</strong>. Let <strong>S<sub>A</sub></strong> be the set of points in our <strong>S</strong> closer to <strong>A</strong> than <strong>B</strong>. Similarly define <strong>S<sub>B</sub></strong>. Note that <strong>S<sub>B</sub> = S - S<sub>A</sub></strong>.</p>
<p>Now suppose, by some magic, we managed to find the set <strong>S<sub>A</sub></strong>. Then we would again be done. This is because line <strong>A</strong> is the optimal line for the set <strong>S<sub>A</sub></strong>, and line <strong>B</strong> is the optimal line for remaining points.</p>
<p>Cool ! So, we now have a <strong>O(N * 2<sup>N</sup>)</strong> solution, which basically iterates over all subsets <strong>S<sub>A</sub></strong> of <strong>S</strong>, and reports the solution.</p>
<p>Obviously, the next step is to note that sets <strong>S<sub>A</sub></strong> and <strong>S<sub>B</sub></strong> cannot be arbitrary. If we are given a pair of lines <strong>A</strong> and <strong>B</strong>, then <strong>S<sub>A</sub></strong> consists of exactly those points in <strong>S</strong> which do not lie in the colored region. Similarly, <strong>S<sub>B</sub></strong> consists of those points which lie in the colored region. Justification is very straightforward and left to the reader. The colored region can be identified by a pair of perpendicular lines. For any two perpendicular lines <strong>L, M</strong>, let <strong>R<sub>L,M</sub></strong> denote the the first and third quadrant of the co-ordinate system defined by line <strong>L, M</strong>. Formally, <strong>R<sub>L,M</sub> = {points P | tan ∠ PXL ≥ 0}</strong>, <strong>X</strong> being intersection point of <strong>L</strong> and <strong>M</strong>.</p>
<p><img src="http://www.codechef.com/download/2in53t.png" width="65%"></p>
<p>Due to discussion above, we know that the set <strong>S<sub>A</sub></strong> cannot be arbitrary. The set <strong>S<sub>A</sub></strong> has to be such that there exist two perpendicular lines <strong>L</strong> and <strong>M</strong>, so that <strong>S<sub>A</sub> = S ∩ R<sub>L,M</sub></strong>.</p>
<p>In fact, we give a list <strong>Z</strong> of pairs of lines(i.e. <strong>Z={(L<sub>1</sub>, M<sub>1</sub>), (L<sub>2</sub>,M<sub>2</sub>), ... (L<sub>M</sub>,M<sub>M</sub>)}</strong>) such that, for any arbitrary pair of lines <strong>(L, M)</strong> we can obtain another pair of lines <strong>(L', M')</strong> which satisfies the following<br>
* <strong>S ∩ R<sub>L,M</sub> = S ∩ R<sub>L',M'</sub></strong><br>
* <strong>(L', M') ∈ Z</strong></p>
<blockquote>
<p>In other words, the possible lines <strong>L</strong> and <strong>M</strong> can be infinite in number, but we can find a finite set of pairs of lines which still induce all possible partitions of the given set <strong>S</strong>. In fact we can show that the required number of pairs of lines is polynomial in <strong>N</strong>.</p>
</blockquote>
<p>Here is how to obtain <strong>L', M'</strong> from any <strong>L, M</strong>.<br>
<img src="http://www.codechef.com/download/m6hcp.png" width="65%"></p>
<ul>
<li>
<p>Translate <strong>L</strong>, parallel to itself towards right, until <strong>L</strong> <em>hits</em> some point <strong>P<sub>1</sub> ∈ S</strong>. Stop translating when it is infinitesimally small distance away from <strong>P<sub>1</sub></strong>.<br>
<img src="http://i43.tinypic.com/2zdmv4h.png" width="65%"></p>
</li>
<li>
<p>Now translate <strong>M</strong>, parallel to itself towards right, until it <em>hits</em> some point <strong>P<sub>2</sub> ∈ S</strong>. Again stop translating when distance between line and point becomes infinitesimally small.<br>
<img src="http://www.codechef.com/download/2n7i2qx.png" width="65%"></p>
</li>
<li>
<p>Rotate both <strong>L</strong> and <strong>M</strong> clockwise so that i) <strong>M</strong> remains perpendicular to <strong>L</strong>, ii) One end point of <strong>L</strong> is fixed at <strong>P<sub>1</sub></strong>, iii) One end point of <strong>M</strong> is fixed at <strong>P<sub>2</sub></strong>. The intersection point of <strong>L</strong> and <strong>M</strong> moves in a circle with <strong>P<sub>1</sub>P<sub>2</sub></strong> as diameter. Stop rotating when one of <strong>L</strong> or <strong>M</strong> <em>hits</em> some point <strong>P<sub>3</sub> ∈ S</strong>. Again, stop rotating when it is infinitesimally small distance away from <strong>P<sub>3</sub></strong>.<br>
<img src="http://www.codechef.com/download/206fym9.png" width="65%"></p>
</li>
</ul>
<p>Now it is clear that the new lines <strong>L', M'</strong> obtained by above process still induce same partition of <strong>S</strong> as <strong>L, M</strong> did. Moreover, <strong>L', M'</strong> are completely defined by<br>
* Points <strong>P<sub>1</sub>, P<sub>2</sub>, P<sub>3</sub></strong>.<br>
* One bit indicating whether <strong>P<sub>3</sub></strong> was <em>hit</em> from above or below. It doesn't matter whether <strong>L'</strong> actually hit <strong>P<sub>3</sub></strong> or <strong>M'</strong> because they are interchangeable.</p>
<p>The set <strong>Z</strong> of all possible final pairs <strong>(L', M')</strong> obtained after translation/rotation has size <strong>2n<sup>3</sup></strong>, and can be easily enumerated. We can solve the problem in <strong>O(n^3)</strong> overall time as well, because the partitions imposed by these lines can be interconverted by adding/removing single points, and hence <strong>(X, Y, XY, X2, Y2)</strong> tuple can be re-calculated in constant time. See the solutions below for exact details.</p>
<h1>Setter's Solution:</h1>
<p>Can be found <a href="http://www.codechef.com/download/Solutions/2013/September/Setter/TWOROADS.cpp">here</a></p>
<h1>Tester's Solution:</h1>
<p>Can be found <a href="http://www.codechef.com/download/Solutions/2013/September/Tester/Tester2/TWOROADS.cpp">here</a></p>
<h1>Editorialist's Solution:</h1>
<p>Can be found <a href="http://www.codechef.com/download/Solutions/2013/September/Editorialist/TWOROADS.cpp">here</a></p>enFri, 04 Oct 2013 11:37:14 +0530Answer by tuananh93https://discuss.codechef.com/questions/23742/tworoads-editorial/25733<p>Hi zenon! Thank you for pointing this out,
The problem here is the precision of float number. You put to much digits after the decimal point so our solution will not accurate anymore. Let say the double type can represent exactly x digits after decimal point (x depends on the range of the integer part also). Then if you use multiplication ie to calculate the projection then the input should have only x/2 digits after decimal point, Similarly, if you have product of 3 numbers somewhere in your program, then the i put should have x/3 digits after the point.
In our test cases there is about 3-4 digits after the decimal point only. Actually we should inform that in the problem statement, sorry about that.</p>tuananh93Fri, 04 Oct 2013 11:37:14 +0530https://discuss.codechef.com/questions/23742/tworoads-editorial/25733Answer by zennonhttps://discuss.codechef.com/questions/23742/tworoads-editorial/25371<p>Can anyone explain me why my solution finds better answer than the author's one?</p>
<p>Here is <a href="http://pastebin.com/yg62cdsW">test</a> and I guess the optimal partition looks like <a href="http://pastebin.com/Uaqz4nH6">this</a>. Strange thing is when I try to solve 1-line problem for these partition sets, my and author's solutions give the same total result, but it seems like author's solution doesn't find this partition when it solves the test mentioned above.</p>
<p>My answer is 16555.8836279 and author's is 16559.9967756. Here is <a href="http://pastebin.com/jD8HhxaK">my code</a> by the way. I hope someone will explain me this.</p>zennonMon, 30 Sep 2013 01:08:36 +0530https://discuss.codechef.com/questions/23742/tworoads-editorial/25371Answer by zennonhttps://discuss.codechef.com/questions/23742/tworoads-editorial/25056<p>Why does the answer always contain two intersecting lines? Isn't it possible to achieve better result with two parallel lines in some cases?</p>zennonWed, 25 Sep 2013 17:13:31 +0530https://discuss.codechef.com/questions/23742/tworoads-editorial/25056Answer by baukamanhttps://discuss.codechef.com/questions/23742/tworoads-editorial/24159<p>I think approach to this problem is N^3logN. </p>
<p>Two outer loops gives us n^2. And inside it we have sorting(respect to projections to line) which is NlogN. Am I correct ?</p>baukamanWed, 18 Sep 2013 13:21:19 +0530https://discuss.codechef.com/questions/23742/tworoads-editorial/24159Answer by betlistahttps://discuss.codechef.com/questions/23742/tworoads-editorial/23997<p>I'd like to ask for tips, how to solve such hard problem, respectively how to test them. First option is to create some test cases with known results. But even for those I asked here earlier, my expected result was wrong. Only idea I found is to find mean line for every triple of points, calculate distances so I will have some upper bound. I can do that for 50 points. Typically I implement some brute force solution to get answers for random test cases, but here I have no idea how to achieve this...</p>betlistaTue, 17 Sep 2013 13:25:06 +0530https://discuss.codechef.com/questions/23742/tworoads-editorial/23997Answer by baukamanhttps://discuss.codechef.com/questions/23742/tworoads-editorial/23930<p>If someone can fail this solution:
Consider all lines that is pairwise combination of given points. For each line we will consider it's top and bottom parts. For each part we solve our known calculus thing. I got WA. Someone plz tell what test fails this approach ?</p>baukamanTue, 17 Sep 2013 09:27:35 +0530https://discuss.codechef.com/questions/23742/tworoads-editorial/23930Answer by betlistahttps://discuss.codechef.com/questions/23742/tworoads-editorial/23924<p>Hi guys, can someone help me to find the roads for those three inputs?</p>
<pre><code>6
-1 10
1 10
-3 12
3 12
-1 -10
1 -10
</code></pre>
<p>and</p>
<pre><code>5
-1000 1000
0 999
1000 1000
-1 -1000
1 -1000
</code></pre>
<p>and</p>
<pre><code>9
-1 -1000
1 -1000
-500 1000
500 1000
0 997
1000 999
1000 998
-1000 999
-1000 998
</code></pre>
<p>Testers solution fails for all the inputs. Setter's and editorialist's solution returns 0.414866845 for first input, 0.133333333 for second and 0.666666667 for the third, but it seems to me less than I'd expect. If possible, give me the roads in <code>y = a*x + b</code> form, thanks.</p>betlistaTue, 17 Sep 2013 09:11:09 +0530https://discuss.codechef.com/questions/23742/tworoads-editorial/23924Answer by forthrighthttps://discuss.codechef.com/questions/23742/tworoads-editorial/23839<p>When I first read the problem, I realized I had to find regression line of N points. In this case instead of one, I had to find two. I didn't know how to find one line, let alone two. So I googled and found an interesting paper on K-line mean : <a href="http://people.csail.mit.edu/dannyf/mscthesis.pdf">http://people.csail.mit.edu/dannyf/mscthesis.pdf</a></p>
<p>In the paper, they describe an O(n^3) algorithms for 2-line mean. They use vectors and linear algebra to solve the problem (I think!). If only I was good at linear algebra then maybe I could have understood this article (all those matrix decomposition and vector products boggled my mind). I tried studying linear algebra books for few days but they didn't cover these advanced topics (SVD for example). Plus reading 500+ pages of a books didn't seem efficient.</p>
<p>So anybody here solved the problem using this paper or linear algebra? Any resource that will help me with linear algebra (and this problem). </p>forthrightMon, 16 Sep 2013 17:26:18 +0530https://discuss.codechef.com/questions/23742/tworoads-editorial/23839