Answers to: FOURSQ - Editorialhttps://discuss.codechef.com/questions/90269/foursq-editorial<p><a href="https://www.codechef.com/problems/FOURSQ">Practice</a> <br>
<a href="https://www.codechef.com/JAN17/problems/FOURSQ">Contest</a> </p>
<p><strong>Author:</strong> <a href="https://www.codechef.com/users/sidhant007">Sidhant Bansal</a> <br>
<strong>Tester:</strong> <a href="https://www.codechef.com/users/iscsi">Istvan Nagy</a><br>
<strong>Editorialist:</strong> <a href="https://www.codechef.com/users/mgch">Misha Chorniy</a></p>
<h1>Difficulty:</h1>
<p>Medium</p>
<h1>Pre-Requisites:</h1>
<p>segment tree or related data structure</p>
<h1>Problem Statement</h1>
<p>You are given array $A$ consisting $N$ elements. All the elements in the input are integers. You must be able to make two queries, first one - change the value of the element, second one - is to
present the product of elements with indices from l to r by modulo P in the sum of the squares of four non-negative integers.
</p><h1>Explanation</h1>
<p>We need to make some precalculation, for each number between 0 and 1000000 know to calculate his decomposition in the sum of four squares. The key idea in this problem, if we can decompose number $A$ in the sum of four squares and $B$ in the sum of four squares, then $A*B$ <a href="https://www.alpertron.com.ar/4SQUARES.HTM">can be decomposed</a> in the sum of four squares. </p>
<p>$A*B$ = $(x_{1}^2+x_{2}^2+x_{3}^2+x_{4}^2)*(y_{1}^2+y_{2}^2+y_{3}^2+y_{4}^2)$ = </p>
<p>$(x_{1}*y_{1}+x_{2}*y_{2}+x_{3}*y_{3}+x_{4}*y_{4})^2 + $</p>
<p>$(x_{1}*y_{2}-x_{2}*y_{1}+x_{3}*y_{4}-x_{4}*y_{3})^2 + $</p>
<p>$(x_{1}*y_{3}-x_{2}*y_{4}-x_{3}*y_{1}+x_{4}*y_{2})^2 + $</p>
<p>$(x_{1}*y_{4}-x_{4}*y_{1}+x_{2}*y_{3}-x_{2}*y_{3})^2 $</p>
The second observation is <a href="https://en.wikipedia.org/wiki/Lagrange's_four-square_theorem">Lagrange's four-square theorem</a>, it's well-known fact in math. Every number can be represented in the sum of four squares, from here there are no $-1$ answers in the problem.
<p></p>
<h1>Subtask 1</h1>
<p>$P$ is less or equal than $1000$. Let's make precalculation in a next way, iterate with four cycles, where each number less than $sqrt(1000)<32$ </p>
<p><code>
for i = 0..32
for j = i..32
for k = j..32
for l = k..32
s[i<em>i+j</em>j+k<em>k+l</em>l]=(i,j,k,l) //t is the array of tuples from 4 numbers
</code>
</p><p>When we know this thing, numbers can be multiplied and values of the sum of 4 square squares can be recalculated. Of course, all operations must be processed by modulo $P$. After that queries can be simulated, the first one in $O(1)$ time, the second one in $O(N)$ time. Totally, it takes $O(Q*N+32^4)$ time.</em></p><em>
<h1>Subtask 2</h1>
<p> Precomputation with four for's as in Subtask 1 get TL for $P=10^6$. How to speed up this one, let's make next thing, find all numbers which can be represented as the sum of 3 square numbers.</p>
<code>
for i=0..1000
if i</code></em><code>i>1000000
break
for j=i..1000
if i<em>i+j</em>j>1000000
break
for k=j..1000
if i<em>i+j</em>j+k<em>k>1000000
break
t[i</em>i+j<em>j+k</em>k]=(i,j,k) //t is the array of tuples from 3 numbers
s[i<em>i+j</em>j+k<em>k]=(i,j,k,0)
</em></code><em>
<p>We will find almost $85$% of representions after this preprocessing. How to find the rest:
<code>
for i=0..1000000
if s[i] != null
continue
for j=1..1000
if i - j * j<0
break
if s[i-j * j][3]==0 // If number i - j * j can be represented in a sum of 3 squares
s[i]=(s[i-j * j][0],s[i-j * j][1],s[i-j * j][2],j)
break
</code>
</p>
</em><p><em>Let's estimate total time of work of these two fragments of code, first fragment has three for's each of those to 1000, but $0<=i<=j<=k<=1000$, number of possible triplets (i, j, k) the same thing as $1000 \choose 3$. The second fragment works in time which equal number of numbers which can't be represented as a sum of 3 squares multiply by 1000 less than $2*10^5*1000$ = $2*10^8$, summary both fragments will process in time less than $5*10^8$. </em></p><em>
</em><p><em>Now we have all decompositions for numbers in the range between $0$ and $10^6$, but we don't know how to speed up processing of the 2nd query. A function of representation in a sum of 4 square is associative and multiplicative, we can use <a href="https://www.quora.com/What-are-some-good-tutorials-on-segment-trees">segment tree</a> for solving this problem. In each node $v$ which assigned with interval $l..r$ storing decomposition of the product $A_{l}*A_{l+1}*...*A_{r}$ in the sum of 4 squares by modulo $P$. </p>
<p> Now both queries can be done with complexity $O(log N)$ </p>
<h1>Subtask 3</h1>
<p>The difference of a subtask 3 from a subtask 2 that $P$ can be up to $10^{12}$. It does problems because 64-bit integers type can store numbers up to $1.8 * 10^{19}$, how to cope with a multiplication of two numbers which can't be stored in the 64-bit integer type, there are several ways:</p>
<h3>First way: </h3>
<code>
mult(a, b, p)
if b == 0
return 0
if b % 2 == 0
a2=mult(a,b/2,p) // Finding a * (b / 2), a * b = a * (b / 2) + a * (b / 2)
return (a2+a2) mod p // (a+...+a)+(a+...+a)
return (a+mult(a,b-1, p)) mod p //a * b = a * (b - 1) + a
</code>
<p> </p>
<p>Complexity of this algorithm is O(log N), unfortunately, this algorithm can't pass TL</p>
<h3>Second way:</h3>
<p>In C++ exists types like <strong>int128, </strong>int128_t, Python and Java has libraries for working with Long Arithmetics, but all of it is very slow to get AC.
</p><h3>Third way:</h3>
Divide multipliers into two parts $X=A*2^{20}+B$, $0<=A, B<2^{20}$
<code>
mult(a, b, p)
a0=a % (2^20)
a1=a / (2^20)
b0=b % (2^20)
b1=b / (2^20)
//(a1</code></em><code>(2^20)+a0)<em>(b1</em>(2^20)+b0)=a1<em>b1</em>(2^40)+(a1<em>b0+b1</em>a0)<em>(2^20)+(a0</em>b0)
return ((a1<em>b1)%p</em>(2^20)%p<em>(2^20)+(a1</em>b0+a0<em>b1)%p</em>2^20+a0*b0)%p
</code>
If change % and / into bit operations, correct solution must get AC.
<h3>Fourth way:</h3>
To calcuate product in big real types(C++ version):
<code>
//a * b mod p = a * b - [a * b / p] * p
long long mul(long long a, long long b, long long p) {
long long k = (long long)((long double)a * b / p + 0.5);
return a * b - k * p;
}
</code>
<p>This optimization for multiplying numbers up to $10^{12}$ modulo $10^{12}$ is fastest.</p>
<p>The overall time complexity of this approach is $O(N log N)$ but with huge constant.</p>
<h1>Solution:</h1>
<p>Setter's solution can be found <a href="http://www.codechef.com/download/Solutions/JAN17/Setter/FOURSQ.cpp">here</a> <br>
Tester's solution can be found <a href="http://www.codechef.com/download/Solutions/JAN17/Tester/FOURSQ.cpp">here</a></p>
<p><strong>Please feel free to post comments if anything is not clear to you.</strong></p><p></p>enThu, 19 Jan 2017 16:50:19 +0530Answer by bansal1232https://discuss.codechef.com/questions/90269/foursq-editorial/90344<p>yes! <a href="/users/80870/manjunath1996">@manjunath1996</a> It will overflow! So we have to add another factor that will make the result in correct form. See my code below! It is correct over all the range upto 10^12</p>
<pre><code>long long mul(long long a, long long b, long long p) {
long long k = (long long)((long double)a * b / p + 0.5);
long long ans=a * b - k * p;
while(ans<0)ans+=p;
return ans;
}
</code></pre>bansal1232Thu, 19 Jan 2017 16:50:19 +0530https://discuss.codechef.com/questions/90269/foursq-editorial/90344Answer by manjunath1996https://discuss.codechef.com/questions/90269/foursq-editorial/90342<p>I have doubt in fourth way of multipllication.if a and b are long long types and suppose a=10^11 ,b=10^11 and p=10^12 then how can we directly multiply a<em>b as given in statement </em>return a * b - k * p; It should over flow there.Can any one clear my doubt??</p>manjunath1996Thu, 19 Jan 2017 16:23:53 +0530https://discuss.codechef.com/questions/90269/foursq-editorial/90342Answer by anuraag_s2https://discuss.codechef.com/questions/90269/foursq-editorial/90314<p>I used the same algorithms as discussed here, but,i don't have any idea why am i getting WA in 2 test cases 1 apiece in both the subtasks :(.
Please can anyone help me?</p>
<p>Link-
<a href="https://www.codechef.com/viewsolution/12600300">https://www.codechef.com/viewsolution/12600300</a></p>anuraag_s2Thu, 19 Jan 2017 01:02:03 +0530https://discuss.codechef.com/questions/90269/foursq-editorial/90314Answer by ridslk22https://discuss.codechef.com/questions/90269/foursq-editorial/90312<p><a href="/users/75650/suraj021"><a href="/users/75650/suraj021">@suraj021</a></a>, here's what i did....</p>
<p>1) each number except of the form 4^m(8k+7) can be expressed as sum of 3 square number. Therefore I try to find a square number close to the multiplication that, when subtracted from it, will give me a number that satisifies this condition. Like if 107 is the given number, closest square number is 100. But 107-100=7 is a number violating the condition. So I take the next closest square 81, 107-81=26, which is not of the mentioned format. So I will take my first square number as sqrt(81) = 9. I haven't proved it but this should not take more than 2 or 3 iterations.</p>
<p>2) After getting such a number, I try to form a three square representation. (in the example, 26) Its a very naive recursive method. I try to create the number by taking closest square numbers iteratively. I kept a dp map to avoid regenerating the same numbers.... I found out that most numbers can be formed on the very first attempt (by greedily taking the closest square numbers, like 26 for example: take closest square 25 and you get next two numbers 1 and 0 easily). However there are some numbers which may take upto quite a lot of iterations. But I had a feeling there wouldn't be too many. And my hunch was proved when I got AC. </p>ridslk22Thu, 19 Jan 2017 00:27:32 +0530https://discuss.codechef.com/questions/90269/foursq-editorial/90312Answer by noob333https://discuss.codechef.com/questions/90269/foursq-editorial/90310<p>Here is a fifth way which seemed most intuitive to me. Represent each number as <strong>x*10^3 + y</strong>. </p>
<p>Why? Because x will be less than 10^9 and you can still compute the product without overflow.</p>noob333Thu, 19 Jan 2017 00:00:09 +0530https://discuss.codechef.com/questions/90269/foursq-editorial/90310Answer by suraj021https://discuss.codechef.com/questions/90269/foursq-editorial/90306<p><a href="/users/127093/ridslk22">@ridslk22</a> How did you manage to do it online?</p>suraj021Wed, 18 Jan 2017 22:31:20 +0530https://discuss.codechef.com/questions/90269/foursq-editorial/90306Answer by ridslk22https://discuss.codechef.com/questions/90269/foursq-editorial/90299<p>Was I the only one who did all the computation for four squares of each number online and still managed to get 100 points?</p>ridslk22Wed, 18 Jan 2017 19:57:59 +0530https://discuss.codechef.com/questions/90269/foursq-editorial/90299Answer by aloochaat1998https://discuss.codechef.com/questions/90269/foursq-editorial/90298<p>Setter's solution gets 10 points.Not expected!!! ;)</p>aloochaat1998Wed, 18 Jan 2017 19:11:29 +0530https://discuss.codechef.com/questions/90269/foursq-editorial/90298Answer by surajghoshhttps://discuss.codechef.com/questions/90269/foursq-editorial/90297<p>What is zi-j*<em>j == 0 in the second fragment of the code ??
does that , mean j</em>j is a multiple of i ?</p>surajghoshWed, 18 Jan 2017 18:44:29 +0530https://discuss.codechef.com/questions/90269/foursq-editorial/90297Answer by robin_0_chefhttps://discuss.codechef.com/questions/90269/foursq-editorial/90283<p>Sorry I can't get one thing. In the 4th way of finding modulo we have returned ( a<em>b - k</em>p ) </p>
<p>But what would happen if a was like 10^11 , b = 10^11 and p = 10^12. Then won't a*b cause overflow for long long int ? :/ </p>
<p>The 3rd way is really elegant :) </p>robin_0_chefWed, 18 Jan 2017 16:23:58 +0530https://discuss.codechef.com/questions/90269/foursq-editorial/90283