Questions Tagged With aug15https://discuss.codechef.com/tags/aug15/?type=rssquestions tagged <span class="tag">aug15</span>enTue, 29 Mar 2016 06:55:52 +0530Feedback of August Long Challenge 2015https://discuss.codechef.com/questions/73753/feedback-of-august-long-challenge-2015<p>Hi All,</p>
<p>I hope you would have enjoyed participating in August Challenge 2015. Please feel free to share your feedback as the answers or comments of the question. We would be really thankful for your responses :)</p>dpraveenMon, 17 Aug 2015 19:04:43 +0530https://discuss.codechef.com/questions/73753/feedback-of-august-long-challenge-2015aug15feedbackADMAG - Editorialhttps://discuss.codechef.com/questions/73723/admag-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/AUG15/problems/ADMAG">Contest</a><br>
<a href="http://www.codechef.com/problems/ADMAG">Practice</a> </p>
<p><strong>Author:</strong> <a href="http://www.codechef.com/users/abhra73">Abhra Dasgupta</a><br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/laycurse">Hiroto Sekido</a><br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/kevinsogo">Kevin Atienza</a></p>
<h1>DIFFICULTY:</h1>
<p>Simple</p>
<h1>PREREQUISITES:</h1>
<p>Fibonacci numbers, Zeckendorf representation</p>
<h1>PROBLEM:</h1>
<p>You have a sequence of $C$ cards, each containing a (nonempty) list of integers from $1$ to $N$. Furthermore, these cards identify every integer from $1$ to $N$ in the following way: the integer $i$ can be obtained by adding the first integer of every card containing $i$. Also, an integer cannot appear in two consecutive cards.</p>
<p>For a given $N$, what is the minimum possible $C$ for which such a card sequence exists?</p>
<h1>QUICK EXPLANATION:</h1>
<p>$C$ is simply the largest integer such that $F_{C+1} \le N$, where $F_i$ is the $i$th Fibonacci number ($F_0 = 0$ and $F_1 = 1$). One such possible set of $C$ cards is as follows:</p>
<ul>
<li>The $j$th card ($1 \le j \le C$) contains the integer $i$ ($1 \le i \le N$) if and only if $F_{j+1}$ is in the Zeckendorf representation of $i$.</li>
<li>Each card's list of integers is sorted (so that the smallest integer in the $j$th card is $F_{j+1}$).</li>
</ul>
<p>It's easy to see why this set of cards is valid. Proving that this is the optimal solution is also easy.</p>
<h1>EXPLANATION:</h1>
<h1>A simplified problem</h1>
<p>The problem is made a bit harder by the following constraints:</p>
<ul>
<li>Each integer must be the sum of the first integer in all cards containing it.</li>
<li>An integer cannot appear in two consecutive cards.</li>
</ul>
<p>Thus, we start by solving a simplified version of the problem first. In other words:</p>
<ul>
<li>Each integer need not be the sum of the first integer in all cards containing it (but it should be identified by which cards it appears in).</li>
<li>There is no restriction on the cards which an integer can appear in.</li>
</ul>
<p>With these in mind, let's see how we can set up the minimum number of cards to identify the integers from $1$ to $N$.</p>
<p>Of course, in this problem, the order of the integers in each card doesn't matter, so we can represent each card by a bit string of length $N$, where the $i$th bit is on if and only if the card contains $i$. Furthermore, a sequence of $C$ cards can be represented by a matrix of $C$ rows and $N$ columns. Each row represents a card and the integers it contains, and each column represents an integer and the cards it appears in.</p>
<p>We say that two integers are <strong>indistinguishable</strong> if they appear in exactly the same set of cards. Clearly, if there is any pair of indistinguishable integers, then there isn't enough information to identify either integer in the pair. Therefore, there must be no indistinguishable pairs of integers from $1$ to $N$. It's easy to see that having no such pairs is also sufficient to identify the integers from $1$ to $N$.</p>
<p>In our matrix representation, an indistinguishable pair of integers corresponds to a pair of equal columns. If there are $C$ cards, then there are $2^C$ possible distinct columns, so by choosing distinct columns we can identify up to $2^C$ different integers. Thus, we have shown that you don't need more than $\lceil \log_2 N \rceil$ cards to identify $N$ integers.</p>
<p>On the other hand, it's not possible to do it with strictly less than $\lceil \log_2 N \rceil$ cards, because there aren't enough distinct columns to identify $N$ integers. Therefore, we have shown that the answer is exactly $\lceil \log_2 N \rceil$ :)</p>
<p>To create a set of $C = \lceil \log_2 N \rceil$ such cards, simply create a $C\times N$ matrix with the property that no two columns are equal. For example, if $N = 14$, then $C = 4$. An example of such a matrix is the following:</p>
<pre><code>01010101010101
00110011001100
00001111000011
00000000111111
</code></pre>
<p>This matrix corresponds to the following sequence of cards:</p>
<pre><code>Card 1: 2 4 6 8 10 12 14
Card 2: 3 4 7 8 11 12
Card 3: 5 6 7 8 13 14
Card 4: 9 10 11 12 13 14
</code></pre>
<p>It's easy to see that this set of cards can identify all integers from $1$ to $14$ :) (Note that the integer $1$ appears in no cards, but it's the only such integer, so it can still be identified) Note that there are other sets of cards that also work.</p>
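<p>The construction above can be sketched in a few lines of Python (a sketch of ours, not the setter's code; the helper name <code>make_cards</code> is hypothetical). Card $j$ holds the integer $i$ exactly when bit $j-1$ of $i-1$ is set, which makes all columns distinct:</p>

```python
def make_cards(n):
    # Smallest C with 2^C >= n; (n - 1).bit_length() avoids floating point.
    c = max(1, (n - 1).bit_length())
    # Card j (1-based) holds integer i iff bit (j-1) of (i-1) is set,
    # so the columns (the binary codes of i-1) are pairwise distinct.
    return [[i for i in range(1, n + 1) if (i - 1) >> (j - 1) & 1]
            for j in range(1, c + 1)]

for card in make_cards(14):
    print(card)
```

<p>For $N = 14$ this produces exactly the four cards listed above.</p>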
<h1>A slightly harder problem</h1>
<p>Now, let's add the following restriction:</p>
<ul>
<li>An integer cannot appear in two consecutive cards.</li>
</ul>
<p>Note that we're getting closer to the original problem. Also, note that the answer we get by solving this problem is automatically a lower bound for the answer to the original problem.</p>
<p>To answer this problem, we first convert the new restriction in terms of our matrix representation. An integer appears in two consecutive cards if and only if its bit string (i.e. the bit sequence in the column corresponding to it) contains two consecutive on bits. Thus, we cannot use all $2^C$ such bit strings for the columns; we can only use those not containing two consecutive $1$s.</p>
<p>How many such bit strings are there? Let $f(N)$ be the number of such strings of length $N$. Let's assume first that $N \ge 2$. Notice that such a string starts in a $0$ or a $1$. If it starts in a $0$, then the remaining bit string of length $N-1$ can be chosen among the $f(N-1)$ valid strings. If it starts in a $1$, then the next one should be a $0$, but the remaining bit string of length $N-2$ can be chosen among the $f(N-2)$ valid strings. Therefore, we have the following recurrence:
$$f(N) = f(N-1) + f(N-2)$$
Notice the similarity with the Fibonacci recurrence! In fact, for base cases, we have $f(0) = 1$ and $f(1) = 2$, which are also two consecutive Fibonacci numbers, so the sequence $(f(0), f(1), f(2), \ldots)$ is just the Fibonacci sequence offset by two places! To be more specific, we have:
$$f(N) = F_{N+2}$$
Therefore, a sequence of $C$ cards can identify up to $F_{C+2}$ integers, and thus the answer for a given $N$ is the smallest $C$ such that $F_{C+2} \ge N$.</p>
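<p>The identity $f(N) = F_{N+2}$ is easy to confirm by brute force (a quick sanity check of ours, not part of the solution):</p>

```python
def count_no_adjacent_ones(c):
    # Count length-c bit strings with no two consecutive 1 bits.
    # x & (x << 1) is nonzero exactly when x has two adjacent 1 bits.
    return sum(1 for x in range(1 << c) if not x & (x << 1))

F = [0, 1]
while len(F) < 20:
    F.append(F[-2] + F[-1])

for c in range(12):
    assert count_no_adjacent_ones(c) == F[c + 2]
print("verified")
```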
<p>For example, if $N = 15$, then $C = 6$, and one such $C\times N$ matrix is the following:</p>
<pre><code>010010100100101
001000010010000
000110000001100
000001110000000
000000001111100
000000000000011
</code></pre>
<p>This matrix corresponds to the following sequence of cards:</p>
<pre><code>Card 1: 2 5 7 10 13 15
Card 2: 3 8 11
Card 3: 4 5 12 13
Card 4: 6 7 8
Card 5: 9 10 11 12 13
Card 6: 14 15
</code></pre>
<h1>The original problem</h1>
<p>Finally, we can add the remaining restriction to get to the original problem:</p>
<ul>
<li>Each integer must be the sum of the first integer in all cards containing it.</li>
</ul>
<p>Remember that the answer is at least the answer to the previous problem, i.e. at least the smallest $C$ such that $F_{C+2} \ge N$. But it could be more than that. For example, each integer must appear at least once in some card (or else it cannot be equal to the sum of the cards it appears in, which would be $0$). Therefore, a column with an all-zero bit string is not allowed! This gives the following improved lower bound on the answer: the smallest $C$ such that $F_{C+2} > N$.</p>
<p>The key insight is to use <a href="https://en.wikipedia.org/wiki/Zeckendorf%27s_theorem"><strong>Zeckendorf's theorem</strong></a>: Each integer can be uniquely represented as the sum of Fibonacci numbers from $(F _ i) _ {i\ge 2}$ in such a way that the sum does not include any two consecutive Fibonacci numbers. To use this, we use a distinct Fibonacci number as the first number in each card, and place each integer $i$ in card $j$ if and only if $F _ {j+1}$ is in the Zeckendorf representation of $i$. For example, the integer $17$ should appear in cards $1$, $3$ and $6$ because $F _ {1+1} + F _ {3+1} + F _ {6+1} = 17$. Note that each Fibonacci number appears in exactly one card, and we place this number at the beginning of the list, so that each integer is exactly the sum of the first integer in the cards it appears in, satisfying the conditions of the problem!</p>
<p>It's easy to see that the solution described in the previous paragraph is optimal; i.e. the number of cards needed is exactly the lower bound that we have, i.e. the smallest $C$ such that $F_{C+2} > N$. Thus, we have shown that this is the answer!</p>
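<p>A sketch of this construction in Python (the helper name <code>zeckendorf_cards</code> is ours, not from the setter's solution). The greedy loop, which repeatedly takes the largest Fibonacci number that fits, is one standard way to obtain the Zeckendorf representation; here <code>fib[j]</code> is $F_{j+2}$, so card $j+1$ starts with $F_{j+2}$ as required:</p>

```python
def zeckendorf_cards(n):
    fib = [1, 2]                      # F_2, F_3, ... (the distinct Fibonacci numbers)
    while fib[-1] <= n:
        fib.append(fib[-2] + fib[-1])
    cards = [[] for _ in fib]
    for i in range(1, n + 1):
        rem = i
        for j in range(len(fib) - 1, -1, -1):
            if fib[j] <= rem:         # greedy choice gives the Zeckendorf rep of i
                cards[j].append(i)
                rem -= fib[j]
    # Drop the unused last card; card j's smallest element is fib[j] itself.
    return [c for c in cards if c]

print(zeckendorf_cards(17))
```

<p>One can check directly that every integer equals the sum of the first elements of the cards containing it, that no integer is in two consecutive cards, and that $17$ lands in cards $1$, $3$ and $6$ as in the example above.</p>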
<p>Here's an implementation of this algorithm in Python:</p>
<pre><code>F = [0, 1]
while F[-1] <= 10**20:
    F.append(F[-2] + F[-1])
for cas in xrange(input()):
    n = input()
    c = 1
    while F[c+2] <= n:
        c += 1
    print c
</code></pre>
<p>Note that this simply computes enough Fibonacci numbers, and computes the answer according to its definition. It's also possible to answer the problem without precomputing Fibonacci numbers, as shown in the following implementation:</p>
<pre><code>for cas in xrange(input()):
    n, c, Fc1, Fc2 = input(), 1, 1, 2
    while Fc2 <= n:
        Fc1, Fc2, c = Fc2, Fc1+Fc2, c+1
    print c
</code></pre>
<p>In this implementation, <code>Fc1</code> and <code>Fc2</code> correspond to $F_{c+1}$ and $F_{c+2}$, respectively.</p>
<h1>Time Complexity:</h1>
<p>$O(\log N)$ </p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p><a href="http://www.codechef.com/download/Solutions/AUG15/Setter/ADMAG.cpp">setter</a><br>
<a href="http://www.codechef.com/download/Solutions/AUG15/Tester/ADMAG.cpp">tester</a> </p>kevinsogoSun, 16 Aug 2015 13:06:42 +0530https://discuss.codechef.com/questions/73723/admag-editorialsimpleaug15zeckendorfeditorialfibonacciSCLUSTER - Editorialhttps://discuss.codechef.com/questions/73728/scluster-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/AUG15/problems/SCLUSTER">Contest</a> <br>
<a href="http://www.codechef.com/problems/SCLUSTER">Practice</a> </p>
<p><strong>Author:</strong> <a href="https://www.codechef.com/users/nssprogrammer">Snigdha Chandan</a><br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/laycurse">Hiroto Sekido</a><br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/kevinsogo">Kevin Atienza</a></p>
<h1>DIFFICULTY:</h1>
<p>Challenge</p>
<h1>PREREQUISITES:</h1>
<p>Heuristics, facility location problem, simulated annealing, Steiner tree</p>
<h1>PROBLEM:</h1>
<p>Roughly speaking, given an $N\times N$ grid with $K$ individuals in it, each having a number called its "<strong>interaction index</strong>", you are to move some (maybe none, or all) individuals so that all $K$ form a single connected component (where edge- and corner-adjacent individuals are considered adjacent). The goal is to minimize the value $1000A + 10B$, where $A$ is roughly the weighted sum of distances the individuals are moved, and $B$ is the "average" difference between interaction indices of all adjacent clients. (The exact formulas for $A$ and $B$ are given in the problem statement.)</p>
<h1>EXPLANATION:</h1>
<p>This problem was inspired by the "<strong>mobile facility location problem</strong>", which is a variant of the classical <a href="https://en.wikipedia.org/wiki/Facility_location_problem">facility location problem</a>. In the mobile facility location problem, each facility and client is assigned to a start location in a metric graph and the goal is to find a destination node for each client and facility such that every client is sent to a node which is the destination of some facility. The objective is to minimize the total distance clients and facilities travel.</p>
<p>In this problem ("<strong>social cluster</strong>"), we can consider all the individuals on the grid as clients, and our objective is to form a connected cluster while minimizing the total travelling distance of all the clients. Here, instead of summing up the distances travelled by individual clients, we sum up the weighted distances (the weight being the interaction index). We also minimize an additional parameter: the average difference between the interaction indices of adjacent clients.</p>
<p>It is an NP-hard problem, so it's really difficult to come up with the exact optimal solution. Therefore we will try to approximate the optimal solution; we discuss some heuristics below. Take note that we are minimizing the following score: $\text{Score} = 1000A + 10B$, where $A$ and $B$ are two parameters defined in the problem statement. Roughly speaking, $A$ represents the cost incurred by moving individuals: the longer the distance, the higher the cost, but it's less costly to move individuals with higher interaction indices. On the other hand, $B$ represents the "average" difference between interaction indices of all adjacent clients.</p>
<p>First of all, we can forget about $B$ and focus on parameter $A$, i.e. we will try to construct a connected cluster by moving some of the clients. (After all, $A$ is exactly one hundred times more important than $B$.)</p>
<p>We can use "<strong>simulated annealing</strong>" approach as a heuristic in order to construct the connected component. We can also consider it as a special case of classical <a href="https://en.wikipedia.org/wiki/K-medians_clustering">K-median problem</a>.</p>
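<p>For reference, a generic simulated-annealing skeleton looks like the following (a sketch of ours, not the setter's solution; the <code>score</code> and <code>neighbor</code> functions are hypothetical placeholders for, e.g., the contest score and a small random move of one client):</p>

```python
import math
import random

def anneal(state, score, neighbor, t0=1.0, cooling=0.999, steps=20000):
    # Generic simulated-annealing loop: always accept improvements,
    # accept worsenings with probability exp(-delta / temperature).
    random.seed(0)                    # deterministic, for illustration only
    cur, cur_s = state, score(state)
    best, best_s = cur, cur_s
    t = t0
    for _ in range(steps):
        cand = neighbor(cur)
        cand_s = score(cand)
        if cand_s < cur_s or random.random() < math.exp(-(cand_s - cur_s) / t):
            cur, cur_s = cand, cand_s
            if cur_s < best_s:
                best, best_s = cur, cur_s
        t *= cooling                  # cool down gradually
    return best

# Toy usage: minimize (x - 7)^2 by random +-1 moves.
print(anneal(0, lambda x: (x - 7) ** 2, lambda x: x + random.choice([-1, 1])))
```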
<p>We can also create a <strong>metric Steiner tree</strong> with the clients at maximal positions as terminal nodes and the clients at non-maximal positions as Steiner nodes. A very simple $2$-approximation algorithm already exists for computing the metric Steiner tree.</p>
<p>Once we have successfully constructed the cluster, we can focus on rearranging the positions of the clients, while still maintaining the connectivity of the cluster, in order to minimize $B$ and hence the overall score.</p>
<p>We have to handle this situation very carefully as some arrangement may minimize $B$ but may result in increasing $A$.</p>
<p>There are a lot of approaches for this problem, and some are more effective than others. Feel free to discuss your approaches below :)</p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p><a href="http://www.codechef.com/download/Solutions/AUG15/Setter/SCLUSTER.cpp">setter</a><br>
<a href="http://www.codechef.com/download/Solutions/AUG15/Tester/SCLUSTER.cpp">tester</a> </p>kevinsogoSun, 16 Aug 2015 13:43:47 +0530https://discuss.codechef.com/questions/73728/scluster-editorialheuristicaug15stimulatedfacility-locsteinertreeannealingeditorialchallengeSTETSKLX - Editorialhttps://discuss.codechef.com/questions/73727/stetsklx-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/AUG15/problems/STETSKLX">Contest</a><br>
<a href="http://www.codechef.com/problems/STETSKLX">Practice</a> </p>
<p><strong>Author:</strong> <a href="https://www.codechef.com/users/antoniuk1">Antoniuk Vasyl</a><br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/laycurse">Hiroto Sekido</a><br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/kevinsogo">Kevin Atienza</a></p>
<h1>DIFFICULTY:</h1>
<p>Medium-Hard</p>
<h1>PREREQUISITES:</h1>
<p>Binary search, unrooted trees, sliding range minimum query, divide and conquer</p>
<h1>PROBLEM:</h1>
<p>Given a weighted unrooted tree of $N$ elements, and two integers $L$ and $R$, find the minimum <em>Chef length</em> among all simple paths whose number of edges is between $L$ and $R$, inclusive. If there is no such path, output $-1$.</p>
<p>The <strong>Chef length</strong> of a path is the "median" of the list of weights of all edges in the path. If the path has an even length, the "median" is defined as the larger of the two middle elements.</p>
<h1>QUICK EXPLANATION:</h1>
<p>Binary search on the answer $V$. Thus, for a given $V$, we need to figure out whether there is a path of length in $[L,R]$ whose Chef length is $\le V$. In the following, $V$ is fixed.</p>
<p>Let's call an edge <strong>good</strong> if its weight is $\le V$. Call a path <strong>good</strong> if its Chef length is $\le V$. Thus a path is good if and only if the number of good edges is more than half the path's length. The following algorithm computes whether there is a good path or not.</p>
<p>Root the tree at an arbitrary node, and for each node $i$ and each length $l \in [1,R]$, define $G(i,l)$ as the maximum number of good edges of any downward path starting from $\text{parent}(i)$ and with length $l$. Define it as $-\infty$ if such a path (or $\text{parent}(i)$) doesn't exist. Also, define $H(i,l) = 2G(i,l) - l$. There are $2NR$ such values, and all of the $G(i,l)$ and $H(i,l)$ can be computed in $O(NR)$ time (using dynamic programming).</p>
<p>If there exists a pair $(i,l)$ with $L \le l \le R$ and $H(i,l) > 0$, then return true. Otherwise, try to find two nodes $i$ and $j$ and two lengths $a$ and $b$ such that $\text{parent}(i) = \text{parent}(j)$, $H(i,a) + H(j,b) > 0$ and $1 \le a, b \le R$ and $L \le a + b \le R$. If such values exist, return true. Otherwise, return false.</p>
<p>The last part could take $O(N^2 R^2)$ or $O(NR^2)$ time when implemented naïvely, but it can be computed in $O(NR)$ time using a few tricks, for example the usage of <a href="http://wcipeg.com/wiki/Sliding_range_minimum_query">sliding range maximum query</a> (in at least one algorithm).</p>
<h1>EXPLANATION:</h1>
<h1>Binary search</h1>
<p>When finding the minimum of something, one standard approach is to <strong>binary search the answer</strong>. In more detail, suppose $\phi$ is a <a href="https://en.wikipedia.org/wiki/Predicate_(mathematical_logic)">predicate</a>, and suppose we want to find the minimum $V$ satisfying $\phi(V)$, and suppose that for any given $V$ we can calculate (the truth-value of) $\phi(V)$ in time, say, $O(T)$. Then the following approach tries to calculate the minimum $V$ such that $\phi(V)$ is true:</p>
<pre><code>def minimize(phi):
    Lo = some (small) value V
    Hi = some (large) value V
    if phi(Lo) is true or phi(Hi) is false:
        return [failure]
    while Hi - Lo > 1:
        Md = (Lo + Hi) / 2
        if phi(Md):
            Hi = Md
        else:
            Lo = Md
    return Hi
</code></pre>
<p>It's easy to see that the running time is $O(T \log (Hi - Lo))$, thus this approach only adds a "log factor" in the running time (which is small assuming $Hi - Lo$ is manageable).</p>
<p>However, this algorithm only works if the following conditions are satisfied:</p>
<ul>
<li>$\phi(V)$ is monotonic, i.e. if $\phi(V)$ is true, then $\phi(V+1)$ is also true.</li>
<li>We can find values $Lo$ and $Hi$ where $\phi(Lo)$ is false and $\phi(Hi)$ is true. There are ways to systematically find such values (if they exist), but thankfully in our current problem we can easily find such values $Lo$ and $Hi$, as will be described later.</li>
</ul>
<p>This is a nice approach, because we're reducing the problem of minimizing $V$ that satisfies $\phi(V)$, into a binary question: "given $V$, is $\phi(V)$ true?", while only adding a small running time factor.</p>
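<p>A runnable version of the pseudocode above (our own transcription, with a toy predicate purely for illustration):</p>

```python
def minimize(phi, lo, hi):
    # Smallest V with phi(V) true, assuming phi is monotonic,
    # phi(lo) is false and phi(hi) is true.
    if phi(lo) or not phi(hi):
        raise ValueError("bad bracketing values")
    while hi - lo > 1:
        md = (lo + hi) // 2
        if phi(md):
            hi = md
        else:
            lo = md
    return hi

print(minimize(lambda v: v * v >= 50, 0, 100))  # smallest v with v^2 >= 50, i.e. 8
```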
<p>Thus, we'll try to answer the following question: Given $V$, is there a path of length in $[L,R]$ whose Chef length is $\le V$? It's easy to see that this predicate is monotonic, and that the minimum value satisfying this is the answer to the problem. For this problem, we can use $Lo = 0$ and $Hi = \max c$. Obviously the predicate is false on $Lo$ ($0$ can't be a median because all edges are positive), but if the predicate is false on $Hi$, then this can only mean there is no path of length in $[L,R]$, so we output $-1$.</p>
<h1>Is there a path?</h1>
<p>Let's fix a $V$ and try to check whether there is a path of length in $[L,R]$ whose Chef length is $\le V$.</p>
<p><strong>Chef length</strong></p>
<p>Let's first go more in-depth on what the Chef length looks like.</p>
<p>Let's call an edge <strong>good</strong> if its weight is $\le V$. Call a path <strong>good</strong> if its Chef length is $\le V$ (thus we are checking whether there exists a good path in the tree with length in $[L,R]$). It's easy to see that <strong>a path is good if and only if the number of good edges is more than half the path's length</strong>. (We invite the reader to verify; note that there are two cases, depending on the parity of the path length.)</p>
<p>Next, let's suppose we have a path of length $l$ and has $g$ good edges ($0 \le g \le l$). Then this path is good if and only if $\frac{g}{l} > \frac{1}{2}$. This is equivalent to:
$$\begin{align*}
\frac{g}{l} &> \frac{1}{2} \\\
2g &> l \\\
2g - l &> 0
\end{align*}$$</p>
<p>This suggests that the expression $2g - l$ is important for any given path. Let's call this the path's <strong>goodness index</strong>. The goodness index has the following useful property: <strong>the goodness index of the concatenation of two edge-disjoint simple paths is equal to the sum of the goodness indices of the two paths</strong> (we invite the reader to verify!).</p>
<p><strong>Unrooted tree</strong></p>
<p>When solving a problem related to unrooted trees, one usual strategy is to <strong>root the tree</strong>. This helps one to "orient" or "organize" the tree in some way. There are multiple choices for the root, each with its own advantages and disadvantages. In the current problem, we'll try to just root the tree arbitrarily and see whether we can find a fast solution (because it's cumbersome to find special nodes in the tree). So in the following, let's assume that the tree is rooted at $1$.</p>
<p>Once the tree is rooted, we now see what the "simple paths" look like. Specifically, there are two kinds of simple paths: </p>
<ul>
<li>A <strong>pure downward path</strong>, which is a path between two nodes, where one is an ancestor of the other.</li>
<li>An <strong>up-down path</strong>, which is a path between two nodes, where neither is an ancestor of the other.</li>
</ul>
<p>Furthermore, one can observe that an up-down path is simply the concatenation of two pure downward paths. Therefore, let's define the following function: $g(i,l)$, which is the maximum number of good edges of any downward path of length $l$ starting at node $i$ (if there is no such path, define it as $-\infty$). Furthermore, let's define $h(i,l) = 2g(i,l) - l$, which is the maximum goodness index of any downward path of length $l$ starting at node $i$. We will define this for all $1 \le i \le N$ and $1 \le l \le R$, so that there are $NR$ possible indices. We'll describe how to calculate the $g(i,l)$ and $h(i,l)$ later, but for now, assume that we already have those values.</p>
<p>For convenience, let's define an additional function, $G(i,l)$, defined as $g(\text{parent}(i),l)$. This function is convenient if we want to inspect paths that go down a particular child of the tree rooted at $\text{parent}(i)$. Define $H(i,l)$ to be $2G(i,l) - l$.
Notice that:</p>
<ul>
<li>There exists a good pure downward path of length in $[L,R]$ if and only if there is a node $i$ and a length $l \in [L,R]$ such that $h(i,l) > 0$. Once we have all the $h(i,l)$ values, this can then be computed in $O(NR)$ time by doing a single pass through the whole $h$ table.</li>
<li>There exists a good up-down path of length in $[L,R]$ if and only if there are two nodes $i$ and $j$ and two lengths $a$ and $b$ such that $\text{parent}(i) = \text{parent}(j)$, $1 \le a, b \le R$, $L \le a + b \le R$ and $H(i,a) + H(j,b) > 0$. (This is a bit more complicated than the previous case, but this essentially amounts to dissecting an up-down path as two pure downward paths, and calculating the goodness index of the up-down path as the sum of the goodness indices of two paths. We invite the reader to verify this carefully) Computing this quickly is not as straightforward either, and will also be described later.</li>
</ul>
<p><strong>Computing $g(i,l)$ (and $h(i,l)$ and $G(i,l)$ and $H(i,l)$)</strong></p>
<p>Let's focus on computing the $g(i,l)$s. It is straightforward to create a recurrence for $g(i,l)$. Let's define $\text{good}(i,j)$ as $1$ if the edge connecting $i$ and $j$ is good, and $0$ otherwise. Then:
$$g(i,l) = \max_{\text{$j$ is a child of $i$}} \text{good}(i,j) + g(j,l-1)$$
For the base case, we say that $g(i,0) = 0$ (this represents the empty path starting at $i$). Implementing this in a straightforward way results in an $O(NR)$-time algorithm :)</p>
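<p>This recurrence can be sketched as follows (our own sketch, not the setter's code; <code>adj</code> is an assumed adjacency list and <code>good[(u, v)]</code> is $1$ if the edge $uv$ is good, $0$ otherwise). An iterative DFS fixes an order in which children are processed before their parents:</p>

```python
NEG = float("-inf")

def compute_g(n, r, adj, good):
    # g[u][l] = max number of good edges on a downward path of length l from u.
    g = [[NEG] * (r + 1) for _ in range(n + 1)]
    parent = [0] * (n + 1)
    order, stack, seen = [], [1], [False] * (n + 1)
    seen[1] = True
    while stack:                       # iterative DFS from root 1
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                stack.append(v)
    for u in reversed(order):          # children before parents
        g[u][0] = 0                    # the empty path
        for v in adj[u]:
            if parent[v] == u:         # v is a child of u
                for l in range(1, r + 1):
                    if g[v][l - 1] != NEG:
                        g[u][l] = max(g[u][l], good[(u, v)] + g[v][l - 1])
    return g

# Tiny demo: path 1-2-3 where edge (1,2) is good and (2,3) is not.
adj = {1: [2], 2: [1, 3], 3: [2]}
good = {(1, 2): 1, (2, 1): 1, (2, 3): 0, (3, 2): 0}
print(compute_g(3, 2, adj, good)[1])   # g(1, l) for l = 0, 1, 2
```

<p>The double loop over children and lengths touches each (edge, length) pair once, giving the $O(NR)$ bound mentioned above.</p>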
<p>Once all $g(i,l)$ are computed, $h(i,l)$, $G(i,l)$ and $H(i,l)$ can then be computed straightforwardly, also in $O(NR)$ time.</p>
<p><strong>Existence of a good up-down path of length in $[L,R]$</strong></p>
<p>This is the remaining missing part of our algorithm. A naïve algorithm that searches for the quadruple $(i,j,a,b)$ directly by definition runs in $O(N^2R^2)$ time (because there are $O(N^2R^2)$ such quadruples), which is clearly too slow.</p>
<p>To improve this, let's rephrase the problem a bit. Instead of checking whether there exists a quadruple $(i,j,a,b)$ such that $H(i,a) + H(j,b) > 0$, we just check whether the maximum value of $H(i,a) + H(j,b)$ among all quadruples $(i,j,a,b)$ is $> 0$. It's clear to see that these are equivalent, but the latter allows for some optimizations which will spare us from checking all quadruples.</p>
<p>Next, let's use the fact that $i$ and $j$ have the same parent. Let $p$ be a particular node (which will be the common parent of our $i$ and $j$), and let $c_1, \ldots, c_k$ be the children of $p$ in increasing order. Then we want to compute the maximum value of $H(c_I,a) + H(c_J,b)$ among all quadruples $(I,J,a,b)$ with $1 \le I < J \le k$.</p>
<p>Suppose we fix $J$, $a$ and $b$. We now want to find the maximum $H(c_I,a) + H(c_J,b)$ for all $1 \le I < J$. Since $H(c_J,b)$ is fixed, we simply want to maximize $H(c_I,a)$ for $1 \le I < J$. The key observation is that this can be computed by maintaining the running maximum of $H(c_I,l)$ for any $1 \le l \le R$. The following pseudocode illustrates it:</p>
<pre><code>m[1...R]  # all initialized with -INF
for J = 1...k:
    # at this point, m[l] contains the maximum H(c_I,l) for all 1 <= I < J
    for all valid pairs (a,b):
        if m[a] + H(c_J,b) > 0:
            return true
    # update the m[l]
    for l in 1...R:
        m[l] = max(m[l], H(c_J,l))
</code></pre>
<p>It can be seen that this pseudocode detects whether there exists a quadruple $(I,J,a,b)$ such that $H(c_I,a) + H(c_J,b) > 0$. The running time is now $O(NR^2)$, which is a lot better than before. Unfortunately this is still too slow.</p>
<p>Let's try to shave off an $O(R)$ factor :D Focus on the following loop (which runs in $O(R^2)$):</p>
<pre><code>for all valid pairs (a,b):
    if m[a] + H(c_J,b) > 0:
        return true
</code></pre>
<p>Let's expand the for loop:</p>
<pre><code>for b in 1...R:
    for a in 1...R:
        if L <= a + b <= R:
            if m[a] + H(c_J,b) > 0:
                return true
</code></pre>
<p>For a fixed $b$, the $a$s we will consider are those satisfying $1 \le a \le R$ and $L \le a + b \le R$. This is equivalent to $\max(1, L-b) \le a \le R - b$. Therefore, the above double loop is equivalent to:</p>
<pre><code>for b in 1...R:
    M = max{m[a] : max(1, L-b) <= a <= R-b}
    if M + H(c_J,b) > 0:
        return true
</code></pre>
<p>We've almost reached a better running time. All that remains is to compute $M$ quickly. Notice that $M$ is the maximum of a subarray of $m$. But this is a standard <a href="https://en.wikipedia.org/wiki/Range_minimum_query">range maximum query (RMQ)</a> problem! Thus, each $M$ can be computed in $O(\log R)$ time using a segment tree (with $O(R)$ preprocessing), for a total of $O(R \log R)$ running time.</p>
<p>But we can do even better! We can use the fact that in the ranges in the queries, i.e. $[\max(1,L-b),R-b]$, the left endpoints and right endpoints are decreasing (not necessarily strictly). This makes the problem solvable by the use of a <strong>sliding RMQ algorithm</strong>, which requires no preprocessing and $O(1)$ query time. Furthermore, the implementation is really simple and <a href="http://wcipeg.com/wiki/Sliding_range_minimum_query">requires a single deque</a>. So with the use of such a structure, the loop becomes $O(R)$, which was our target.</p>
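<p>The deque-based sliding maximum looks like this (shown for fixed-size windows as a minimal illustration of ours; in the actual solution the window is $[\max(1, L-b), R-b]$, whose endpoints slide leftward as $b$ grows, but the same deque discipline applies):</p>

```python
from collections import deque

def sliding_max(a, w):
    # dq holds indices whose values decrease from front to back;
    # the front is always the maximum of the current window.
    dq, out = deque(), []
    for i, x in enumerate(a):
        while dq and a[dq[-1]] <= x:   # drop candidates dominated by x
            dq.pop()
        dq.append(i)
        if dq[0] <= i - w:             # front slid out of the window
            dq.popleft()
        if i >= w - 1:
            out.append(a[dq[0]])
    return out

print(sliding_max([2, 7, 1, 5, 3, 9], 3))  # -> [7, 7, 5, 9]
```

<p>Each index is pushed and popped at most once, so the whole pass is $O(R)$ with $O(1)$ amortized work per query.</p>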
<p>With that final optimization, answering the question "Given $V$, is there a path of length in $[L,R]$ whose Chef length is $\le V$?" can be done in $O(NR)$ time. With the binary search, the overall algorithm runs in $O(RN \log \max c)$ :)</p>
<h1>Binary search improvement</h1>
<p>Finally, we introduce a small optimization on the binary search. Notice that <strong>the answer must be one of the $N-1$ possible edge weights</strong>. Therefore, we can instead just <strong>binary search the answer from the sorted array of edge weights</strong>, so that the running time becomes $O(RN \log N)$ instead of $O(RN \log \max c)$.</p>
<h1>Time Complexity:</h1>
<p>$O(R N \log \max c)$ or $O(R N \log N)$ </p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p><a href="http://www.codechef.com/download/Solutions/AUG15/Setter/STETSKLX.cpp">setter</a><br>
<a href="http://www.codechef.com/download/Solutions/AUG15/Tester/STETSKLX.cpp">tester</a> </p>kevinsogoSun, 16 Aug 2015 13:07:30 +0530https://discuss.codechef.com/questions/73727/stetsklx-editorialbinary-searcheditorialminimum-queryaug15medium-harddivide-and-conqunrooted-treessliding-rangeCOOKMACH - Editorialhttps://discuss.codechef.com/questions/73724/cookmach-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/AUG15/problems/COOKMACH">Contest</a><br>
<a href="http://www.codechef.com/problems/COOKMACH">Practice</a> </p>
<p><strong>Author:</strong> <a href="https://www.codechef.com/users/cenadar">Maksym Bevza</a><br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/laycurse">Hiroto Sekido</a><br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/kevinsogo">Kevin Atienza</a></p>
<h1>DIFFICULTY:</h1>
<p>Cakewalk</p>
<h1>PREREQUISITES:</h1>
<p>Ad hoc, bitwise operators</p>
<h1>PROBLEM:</h1>
<p>You need to convert an integer $A$ to integer $B$, where both are positive and $B$ is a power of two. The allowed moves are the following (if the current number is $v$):</p>
<ul>
<li>If $v$ is even, replace it with $v/2$; if $v$ is odd, replace it with $(v-1)/2$.</li>
<li>Replace $v$ with $2v$.</li>
</ul>
<p>What is the minimum number of moves to do this? (it can be shown that this is always possible!)</p>
<h1>QUICK EXPLANATION:</h1>
<p>The fastest way is the following:</p>
<ul>
<li>Use the first step repeatedly until the number becomes a power of two.</li>
<li>If the current number is larger than $B$, use the first step even further until the number becomes $B$. Otherwise, use the second step repeatedly until the number becomes $B$.</li>
</ul>
<p>The number of steps this takes is the answer.</p>
<h1>EXPLANATION:</h1>
<p>We will make use of the fact that the target integer, $B$, is a power of two. First, let's notice what the operations really do.</p>
<p>The first operation roughly "halves" $v$, and the second operation doubles $v$. Thus, it makes sense to consider the binary representation of $v$.</p>
<p>In fact, the operations are simply the <a href="https://en.wikipedia.org/wiki/Bitwise_operation#Logical_shift">Bitwise logical shift</a> operations! In other words, the first operation is simply a single right shift, and the second operation is a single left shift.</p>
<p>A <strong>shift</strong> is simply an operation that "shifts" the binary representation of a number. For example, shifting the binary integer "1001101" left once results in "10011010". Note that we use a "0" for new digit places. Also, when shifting right, the rightmost digit is discarded: for example, the right shift of "1001101" and "1001100" are both "100110". It's fairly straightforward to see why the operations described in the problem statement are simply shifts.</p>
<p>With this in mind, how do we quickly go from $A$ to $B$? Since $B$ is a power of two, it has exactly one $1$ in its binary representation. Notice that using either operation doesn't increase the number of $1$s in the binary representation of a number. Therefore, the first action we must do is to convert the initial number into one that contains a single $1$ in its binary representation (i.e. a power of two). To do this, we shift right, until it becomes a power of two.</p>
<p>Now that our number is already a power of two, we can simply shift left or right repeatedly until it becomes equal to $B$!</p>
<p>The answer is simply the total number of shifts in both steps combined, and it's quite easy and intuitive to see why this is the optimal solution.</p>
<p>The following is an implementation in C++:</p>
<pre><code>#include <stdio.h>
#include <stdlib.h>
int main() {
    int z;
    scanf("%d", &z);
    while (z--) {
        int a, b;
        scanf("%d%d", &a, &b);
        int steps = 0;
        while ((a & -a) != a) a >>= 1, steps++;
        while (a < b) a <<= 1, steps++;
        while (a > b) a >>= 1, steps++;
        printf("%d\n", steps);
    }
}
</code></pre>
<h1>A somewhat faster solution</h1>
<p>We can use a few more <a href="https://en.wikipedia.org/wiki/Bitwise_operation">bitwise</a> tricks to compute the answer without performing the steps themselves. Notice that there are two parts in calculating the answer:</p>
<ul>
<li>Use the first step repeatedly until the number becomes a power of two.</li>
<li>If the current number is larger than $B$, use the first step even further until the number becomes $B$. Otherwise, use the second step repeatedly until the number becomes $B$.</li>
</ul>
<p>How can we calculate the number of steps required in part one? Note that the answer is simply one more than the ($0$-indexed, from the right) position of the second most significant bit of $A$, or zero if there is no such bit (i.e. $A$ is already a power of two). We can compute this value quickly if we can perform the following operation quickly:</p>
<ul>
<li>Extract the position of the second largest bit of a number</li>
</ul>
<p>It can be dissected into a series of the following operations:</p>
<ul>
<li>Extract the largest bit of a number.</li>
<li>Remove a particular bit of a number.</li>
<li>Get the place value of a particular bit, i.e. given $v = 2^i$, compute $i$.</li>
</ul>
<p>Removing a particular bit is easy; the number <code>A</code> with the bit <code>b</code> removed is simply <code>A ^ b</code>, where <code>^</code> is the bitwise XOR operator. (This expression assumes that the bit <code>b</code> is actually set in <code>A</code>. If you can't guarantee it, use the expression <code>A & ~b</code> instead, where <code>~</code> is the bitwise NOT operator.) The first operation ("extract the largest bit") can be done with the following pseudocode which works for <a href="https://en.wikipedia.org/wiki/32-bit">32-bit</a> integers (we invite the reader to see why this works):</p>
<pre><code>int maxbit(int v) {
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v ^ (v >> 1);
}
</code></pre>
<p>Now, how do we implement the last one? Given $v = 2^i$, we want to compute $i$. Note that $i$ is simply <strong>the number of $1$ bits in the number $v-1$</strong>! But how do we count the $1$ bits in a given number? It can be done with the following pseudocode (once again, we invite the reader to see why this works):</p>
<pre><code>int bitcount(int v) {
    v = (v & 0x55555555) + ((v >> 1) & 0x55555555);
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
    v = (v & 0x0f0f0f0f) + ((v >> 4) & 0x0f0f0f0f);
    v = (v & 0x00ff00ff) + ((v >> 8) & 0x00ff00ff);
    v = (v & 0x0000ffff) + ((v >> 16) & 0x0000ffff);
    return v;
}
</code></pre>
<p>With these, we can now extract the position of the second largest bit of a number:</p>
<pre><code>int extractsecond(int v) {
    v ^= maxbit(v);
    if (v == 0) {
        // v was originally a power of two
        return 0;
    } else {
        return bitcount(maxbit(v) - 1) + 1;
    }
}
</code></pre>
<p>Now, for the second part: assuming the current number $A$ is a power of two, how many steps do we need to convert it to $B$? This is very simple: it's just the difference between the positions of the bits of $A$ and $B$, i.e. <code>abs(bitcount(A-1) - bitcount(B-1))</code>! Therefore, we now have the following (somewhat faster) solution:</p>
<pre><code>#include <stdio.h>
#include <stdlib.h>
#define ll long long
int maxbit(int v) {
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v ^ (v >> 1);
}
int bitcount(int v) {
    v = (v & 0x55555555) + ((v >> 1) & 0x55555555);
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
    v = (v & 0x0f0f0f0f) + ((v >> 4) & 0x0f0f0f0f);
    v = (v & 0x00ff00ff) + ((v >> 8) & 0x00ff00ff);
    v = (v & 0x0000ffff) + ((v >> 16) & 0x0000ffff);
    return v;
}
int extractsecond(int v) {
    v ^= maxbit(v);
    if (v == 0) {
        // v was originally a power of two
        return 0;
    } else {
        return bitcount(maxbit(v) - 1) + 1;
    }
}
int main() {
    int z;
    scanf("%d", &z);
    while (z--) {
        int a, b;
        scanf("%d%d", &a, &b);
        int steps = extractsecond(a);
        a >>= steps;
        steps += abs(bitcount(a - 1) - bitcount(b - 1));
        printf("%d\n", steps);
    }
}
</code></pre>
<h1>Time Complexity:</h1>
<p>$O(\log A + \log B)$ or $O(\log \log A + \log \log B)$</p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p><a href="http://www.codechef.com/download/Solutions/AUG15/Setter/COOKMACH.cpp">setter</a><br>
<a href="http://www.codechef.com/download/Solutions/AUG15/Tester/COOKMACH.cpp">tester</a> </p>kevinsogoSun, 16 Aug 2015 13:06:55 +0530https://discuss.codechef.com/questions/73724/cookmach-editorialad-hocaug15bitwise-operatneditorialcakewalkDCGAME - Editorialhttps://discuss.codechef.com/questions/73726/dcgame-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/AUG15/problems/DCGAME">Contest</a> <br>
<a href="http://www.codechef.com/problems/DCGAME">Practice</a> </p>
<p><strong>Author:</strong> <a href="http://www.codechef.com/users/ma5termind">Sunny Agarwal</a><br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/laycurse">Hiroto Sekido</a><br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/kevinsogo">Kevin Atienza</a></p>
<h1>DIFFICULTY:</h1>
<p>Medium</p>
<h1>PREREQUISITES:</h1>
<p>stock span problem, stack, binary search, value compression, combinatorics</p>
<h1>PROBLEM:</h1>
<p>Given an array $[A_1, \ldots, A_N]$. Let $B$ be the list of maximums of all subarrays of $A$ ($B$ has length $\frac{N(N+1)}{2}$). $M$ games will be played, each with a single constraint of the form "$C$ $K$", where $C$ is either $<$, $=$, $>$, and $K$ is an integer. In a single game, players alternate turns, and in a move a player marks off an integer in $B$ satisfying the constraint "$C$ $K$", and the player with no more valid moves is the loser.</p>
<p>You have to determine the winner of each game.</p>
<h1>QUICK EXPLANATION:</h1>
<p>Each game depends only on the <a href="https://en.wikipedia.org/wiki/Parity_(mathematics)"><em>parity</em></a> of the number of integers satisfying the constraint. Thus, we only need to find the number of values in $B$ satisfying the constraint.</p>
<p>Let $L_i$ be the largest index less than $i$ such that $A_{L_i} \ge A_i$ (set $L_i = 0$ if no such index exists).</p>
<p>Let $R_i$ be the smallest index greater than $i$ such that $A_{R_i} > A_i$ (set $R_i = N+1$ if no such index exists).</p>
<p>All $L_i$s and $R_i$s can be computed in $O(N)$ <a href="http://www.geeksforgeeks.org/the-stock-span-problem/">using a stack</a>.</p>
<p>There can be at most $N$ distinct values in $A$, say $[V_1, \ldots, V_M]$ with $M \le N$ (in sorted order). Let $f(j)$ be the number of subarrays in which $V_j$ is the maximum. Then $f(j) = \sum_{i : A_i = V_j} (R_i - i)(i - L_i)$.</p>
<p>There are three kinds of constraints: "$< K$", "$= K$" and "$> K$". We need to know the sum of $f(j)$ over all $j$ for which $V_j$ satisfies the constraint. We introduce a fourth kind of constraint, "$\le K$". The other three kinds of constraints can be reduced to this.</p>
<p>To answer a "$\le K$" constraint, first find the largest index $j$ such that $V_j \le K$ (with binary search). Then the result we want is $f(1) + \cdots + f(j)$. This number can quickly be obtained if we precompute all prefix sums beforehand.</p>
<h1>EXPLANATION:</h1>
<p>The solution has multiple parts. We start with the most obvious questions first, so that we know what to precompute, etc.</p>
<h1>A "game"</h1>
<p>Let's first discuss how a game occurs. Suppose we have the list $B$, which contains the $\frac{N(N+1)}{2}$ maximums among all subarrays (of course, we can't construct this array in the harder subtasks because of its size). The first thing to notice is that the order of $B$'s elements doesn't matter. Thus, we can assume that $B$ is sorted.</p>
<p>Let's first assume that we have a constraint. The key thing to notice is that the game is really simple: the players simply alternate turns marking off numbers in $B$ satisfying the constraint. In fact, the winner is solely dependent on the number of values in $B$ satisfying the constraint, i.e., it doesn't matter whatever strategy both players are using! To be specific:</p>
<ul>
<li>If there are an even number of such numbers, then the second player always moves last.</li>
<li>If there are an odd number of such numbers, then the first player always moves last.</li>
</ul>
<p>Therefore, we only need to count those elements of $B$ satisfying the constraint! Because $B$ is sorted and because of the nature of the constraints, this number can be computed using one or two binary searches!</p>
<p>To illustrate, let $I(K)$ be the number of elements of $B$ that are $\le K$ (or simply the index of the largest $i$ such that $B_i \le K$, which can be computed with a binary search). Then:</p>
<ul>
<li>The number of elements of $B$ that are $< K$ is $I(K-1)$.</li>
<li>The number of elements of $B$ that are $= K$ is $I(K) - I(K-1)$.</li>
<li>The number of elements of $B$ that are $> K$ is $\frac{N(N+1)}{2} - I(K)$.</li>
</ul>
<p>Thus, if we already have the array $B$, then we can easily compute the result of any game in $O\left(\log \left(\frac{N(N+1)}{2}\right)\right) = O(\log N)$ time.</p>
<h1>Run-length encoding</h1>
<p>Unfortunately, the previous approach uses up $O(N^2)$ memory because of the array $B$. But we can reduce this significantly by noticing that <strong>there are only at most $N$ distinct values in $B$</strong>. (Why?) So let $[V_1, \ldots, V_M]$ be the distinct values in $B$, and let $f(i)$ be the number of occurrences of $V_i$ in $B$.</p>
<p>We can then compute $I(K)$ by doing a binary search in $V$ instead for the largest $i$ such that $V_i \le K$, and then $I(K)$ is simply $f(1) + f(2) + \cdots + f(i)$. If we precompute all prefix sums of the $f(i)$, then $I(K)$ can still be computed in $O(\log N)$ time, but the memory requirements decrease to just $O(N)$.</p>
<h1>Computing $V$ and $f(i)$</h1>
<p>To complete the algorithm, we need to compute the array $[V_1, \ldots, V_M]$ and the values $f(1), \ldots, f(M)$. Note that the sequence $V$ can be computed in $O(N \log N)$ time from the array $A$ (with a simple <a href="https://www.google.com/search?q=sort+uniquify&oq=sort+uniquify&aqs=chrome..69i57.2414j0j7&sourceid=chrome&es_sm=91&ie=UTF-8#safe=off&q=sort+uniq">sort + uniquify</a> operation). Thus, all that remains is to compute the $f(i)$s.</p>
<p>First, let's suppose that all elements of $A$ are distinct. Consider the value $A_i$. How many subarrays are there whose maximum is $A_i$? Well, let $L_i$ and $R_i$ be the indices of the nearest larger elements to the left and right of $A_i$, respectively. Then the subarray cannot contain either $A_{L_i}$ or $A_{R_i}$. Thus, the subarray should be strictly contained in $[A_{L_i+1}, \ldots, A_{R_i-1}]$, but due to the way $L_i$ and $R_i$ are defined, we see that the latter condition is sufficient. Thus, there are $(R_i - i)(i - L_i)$ possible subarrays (there are $(R_i - i)$ and $(i - L_i)$ ways to choose the right and left endpoints, respectively). Let us denote this quantity by $m(i)$.</p>
<p>As an example, consider the following image ($A_i$ is the height of the bar over the number $i$):</p>
<pre><code>              #
#             #
#       #     #
#       # #   #
#     # # #   #
#   # # # #   #
# # # # # #   #
# # # # # # # #
1 2 3 4 5 6 7 8
</code></pre>
<p>In this case, suppose $i = 5$. Then $L_i = 1$ and $R_i = 8$. Thus, there are $(R_i - i) = 3$ and $(i - L_i) = 4$ choices for the right and left endpoints, respectively, for a total of $3\times 4 = 12$ subarrays, as shown below:</p>
<pre><code>1 2 3 4 5 6 7 8
(2 3 4 5)
(3 4 5)
(4 5)
(5)
(2 3 4 5 6)
(3 4 5 6)
(4 5 6)
(5 6)
(2 3 4 5 6 7)
(3 4 5 6 7)
(4 5 6 7)
(5 6 7)
</code></pre>
<p>As a consequence, we have $f(j) = m(i)$ for the (unique) $i$ such that $A_i = V_j$.</p>
<p>In case either $L_i$ or $R_i$ doesn't exist, we can use the values $0$ or $N+1$, respectively.</p>
<p>This formula, and the expression $m(i) = (R_i - i)(i - L_i)$ is nice; unfortunately it doesn't extend straightforwardly if the $A_i$s are not all distinct. To see why, consider the following example:</p>
<pre><code>#         #     #
#   #   # #   # #
#   # # # #   # #
# # # # # # # # #
1 2 3 4 5 6 7 8 9
</code></pre>
<p>The natural extension of $f(j) = m(i)$ would be $f(j) = \sum_{A_i = V_j} m(i)$. However, in the above case, it doesn't quite work. Consider $j = 3$. There are three $i$s with $A_i = 3$, namely $i = 3$, $i = 5$ and $i = 8$. But we have $m(3) = 6$, $m(5) = 4$ and $m(8) = 2$, so according to this formula this gives $f(3) = 6 + 4 + 2 = 12$. However, the correct value of $f(3)$ is $10\,$! (verify) Clearly we need to fix this formula.</p>
<p>We can do that by redefining $m(i)$ so that it counts the subarrays whose <strong>leftmost maximum</strong> is $A_i$. This uses the fact that each subarray has a <strong>unique</strong> leftmost maximum, so no subarray is double-counted. To implement this, redefine $L_i$ by weakening it so that $A_{L_i}$ can equal $A_i$ (i.e. $A_{L_i} \ge A_i$). Then $m(i)$ stays the same, i.e. $(R_i - i)(i - L_i)$. In the example above, we get the new values $m(3) = 6$, $m(5) = 2$ and $m(8) = 2$, so $f(3)$ is correctly computed as $6 + 2 + 2 = 10$.</p>
<p>But how do we compute the $L_i$s and $R_i$s? Actually, these values can be computed easily in $O(N)$ time by means of a stack: it's actually a standard problem called the <a href="http://www.geeksforgeeks.org/the-stock-span-problem/">stock span problem</a>. (Recognizing this as the stock span problem requires experience, but if you didn't know about this, notice that the problem of computing the $L_i$s and $R_i$s can be done in $O(N \log N)$ time with a segment tree)</p>
<p>Now, all that remains is to compute the $f(i)$s given the $m(i)$s. But this can be done easily in linear time also:</p>
<ul>
<li>Create an array for $f$, all initialized to zero.</li>
<li>Create an inverse map of $V$, say $\phi$, where $\phi(V_i) = i$.</li>
<li>For every $1 \le i \le N$, add the value $m(i)$ to $f(\phi(A_i))$.</li>
</ul>
<p>That's it! We now have all the parts needed for the whole solution. The preprocessing takes $O(N \log N)$ time, and each query can be answered in $O(\log N)$ time! The following is an example implementation in C++: </p>
<pre><code>#include <stdio.h>
#include <stdlib.h>
#include <algorithm>
using namespace std;
#define ll long long
#define LIM 1001111
#define INF (1<<30)
int A[LIM];
int L[LIM];
int R[LIM];
int s[LIM]; // stack
typedef pair<int,int> ct;
#define value first
#define count second
ct cts[LIM];
char typ[11];
char ans[LIM]; // will contain the answer string
int n, m;
int find(int k) {
    // binary search
    int L = 0, R = n + 1;
    while (R - L > 1) {
        int M = L + R >> 1;
        (cts[M].value <= k ? L : R) = M;
    }
    return cts[L].count;
}
int main() {
    scanf("%d%d", &n, &m);
    A[0] = A[n+1] = INF;
    for (int i = 1; i <= n; i++) scanf("%d", A + i);
    // compute L from left to right
    s[0] = 0;
    for (int q = 0, i = 1; i <= n; i++) {
        while (A[s[q]] < A[i]) q--;
        L[i] = s[q];
        s[++q] = i;
    }
    // compute R from right to left
    s[0] = n+1;
    for (int q = 0, i = n; i; i--) {
        while (A[s[q]] <= A[i]) q--;
        R[i] = s[q];
        s[++q] = i;
    }
    // compute the frequencies of maximums of subarrays in sorted order.
    cts[0].value = -INF;
    cts[0].count = 0;
    for (int i = 1; i <= n; i++) {
        cts[i].value = A[i];
        cts[i].count = (R[i] - i) * (i - L[i]);
    }
    sort(cts, cts + n + 1);
    // compute cumulative sums. Since we only need the parity, we can use '^' instead of '+'
    for (int i = 1; i <= n; i++) {
        cts[i].count ^= cts[i-1].count;
    }
    // answer queries
    for (int i = 0; i < m; i++) {
        int k;
        scanf("%s%d%s", typ, &k, ans + i);
        if (!((*typ == '>' ? n*(n+1LL)/2 - find(k) : *typ == '<' ? find(k-1) : find(k) - find(k-1)) & 1)) ans[i] ^= 7;
    }
    ans[m] = 0;
    printf("%s\n", ans);
}
</code></pre>
<p>As always, if you have any questions, feel free to ask :)</p>
<h1>Time Complexity:</h1>
<p>$O((N + M) \log N)$</p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p><a href="http://www.codechef.com/download/Solutions/AUG15/Setter/DCGAME.cpp">setter</a><br>
<a href="http://www.codechef.com/download/Solutions/AUG15/Tester/DCGAME.cpp">tester</a> </p>kevinsogoSun, 16 Aug 2015 13:07:37 +0530https://discuss.codechef.com/questions/73726/dcgame-editorialbinary-searchmediumcombinatoricsaug15valuestock-spaneditorialstackCHINSM - Editorialhttps://discuss.codechef.com/questions/73725/chinsm-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/AUG15/problems/CHINSM">Contest</a><br>
<a href="http://www.codechef.com/problems/CHINSM">Practice</a> </p>
<p><strong>Author:</strong> <a href="http://www.codechef.com/users/zedthirtyeight">Vlad Mosko</a><br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/laycurse">Hiroto Sekido</a><br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/kevinsogo">Kevin Atienza</a></p>
<h1>DIFFICULTY:</h1>
<p>Medium</p>
<h1>PREREQUISITES:</h1>
<p>Divisor lists, intervals, combinatorics, ad hoc</p>
<h1>PROBLEM:</h1>
<p>Given an array of integers $[A_1, \ldots, A_N]$ and a number $K$, find the number of subarrays containing no bad pairs. A pair $(i,j)$ is a <strong>bad pair</strong> if $i < j$ and $(A_i \bmod A_j) = K$.</p>
<h1>QUICK EXPLANATION:</h1>
<ul>
<li>For each position $j$, compute the largest $i_j < j$ such that $(i_j,j)$ is a bad pair. Collect all the bad pairs obtained this way, sorted by $j$. (There are at most $N$ such pairs, but possibly fewer, because for some $j$s no such $i_j$ exists.)</li>
<li>Remove a pair $(i_j,j)$ if $j > 1$ and $i_{j-1} \ge i_j$.</li>
<li>Add the pair $(N, N+1)$ at the end of the list.</li>
<li>Assume that the list of pairs acquired is $[(i_1,j_1), (i_2,j_2), \ldots, (i_m,j_m)]$. Then the answer is:
$$\frac{N(N+1)}{2} - \sum_{n=1}^{m-1} i_n (j_{n+1} - j_n)$$</li>
</ul>
<p>To compute all pairs $(i_j,j)$, we need to store, for each value $v > K$, the maximum $i < j$ such that $A_i > K$ and $(A_i \bmod v) = K$ (there are at most $\max A_i$ such values). We also store the maximum $l < j$ such that $A_l = K$. As we increase $j$, we need to update some of these values: when moving to a new value of $j$, only $O(\sqrt{\max A_j})$ of them change.</p>
<h1>EXPLANATION:</h1>
<p>Let's first assume that we have all the bad pairs, and we want to compute the number of subarrays containing no bad pairs. This is just equal to the number of subarrays (which is $N(N+1)/2$), minus the number of subarrays containing at least one bad pair. We will compute the latter instead, and we'll call a subarray <strong>bad</strong> if it contains at least one bad pair.</p>
<p>First, notice that all ${N \choose 2}$ pairs can be bad pairs (for example, if $A_i = 1$ for all $i$ and $K = 0$). Therefore we can't even compute all the bad pairs, because that is too slow. However, notice that not all bad pairs are important. For example, if the pair $p_1$ completely contains the pair $p_2$, then any subarray containing $p_1$ also contains $p_2$. Thus, the pair $p_1$ is redundant, i.e. if we remove the pair $p_1$ from our list, then the set of bad subarrays stay the same.</p>
<p>Thus we can remove all bad pairs completely containing other bad pairs. Note that after doing so, no two bad pairs share the same left (or right) endpoint any more (otherwise one will contain the other). Since there are at most $N$ endpoints, this reduces the number of bad pairs we have to <em>at most $N-1$</em>, which is more manageable :)</p>
<p>This also shows that for a given endpoint $j$, there is only at most one important bad pair $(i,j)$ ending in $j$, and it's the one with the maximum $i$. Therefore, the first step is to compute the maximum $i$ for each $j$ such that $(i,j)$ is bad (ignoring the $j$s for which there are no such $i$).</p>
<h1>Bad pairs ending in every position</h1>
<p>Assuming $A_j > K$, $(A_i \bmod A_j) = K$ is equivalent to:
$$\begin{align*}
(A_i \bmod A_j) &= K \\\
A_i &\equiv K \pmod{A_j} \\\
A_j &| (A_i - K)
\end{align*}$$
This means that $A_j$ is a divisor of $(A_i - K)$.</p>
<p>For every endpoint $j$, we need to compute the maximum $i$ such that $(i,j)$ is bad. Surely, there are no bad pairs such that $A_i < K$ or $A_j \le K$, because $(A_i \bmod A_j) < K$ in those cases.</p>
<p>Suppose that, for every value $v$, $K < v \le \max A_i$, we store the value $i_v$, which we define as the maximum $i_v < j$ such that $(A_{i_v} \bmod v) = K$. Note that $i_v$ is also dependent on $j$, so as we increase $j$, we will need to update some of the $i_v$s, but for now, assume we can do those updates. If so, the $i$ we are looking for is simply $i_{A_j}$ :)</p>
<p>Now for the updates. As we leave a particular value of $j$ and go to the next one ($j+1$), we need to update some of the $i_v$s, specifically for those $v$s having $(A_j \bmod v) = K$. As shown above, this is equivalent to $v$ being a divisor of $A_j - K$, so we only update for those $v$s that are divisors of the number $A_j - K$. This is good, because it has only at most $2\sqrt{A_j - K}$ such divisors :)</p>
<p>Unfortunately, the last statement is incorrect for the case $A_j = K$. If $A_j = K$, then $A_j - K = 0$. But every $v$ is a divisor of zero! In this case, updating all the lists may be too slow, so we need to do something else. This is simple; we can simply maintain a separate value $l$, which is the maximum $l < j$ such that $A_l = K$. $i_v$ is now modified to exclude those $i$s such that $A_i = K$, and for an endpoint $j$, the $i$ we are looking for is now $\max(i_{A_j}, l)$ :) Thus, we maintain the fast running time!</p>
<p>The total running time of this step is $O(N \sqrt{\max A_i})$.</p>
<p>After getting the bad pairs $(i_j,j)$ for every such $j$, we can simply remove those pairs for which $i_{j-1} \ge i_j$, because those pairs are not important. Now we have a list of bad pairs $[(i_1,j_1), (i_2,j_2), \ldots, (i_m,j_m)]$ satisfying the following:
$$i_1 < i_2 < \cdots < i_m$$
$$j_1 < j_2 < \cdots < j_m$$</p>
<h1>Subarrays containing bad pairs</h1>
<p>We now want to compute the number of subarrays containing at least one bad pair.</p>
<p>Each such subarray contains a unique "rightmost" bad pair, so we can compute, for each $1 \le k \le m$, the number of subarrays whose rightmost bad pair is $(i_k,j_k)$. Let's denote this by $f(k)$. Then the number of subarrays containing at least one bad pair is:
$$\sum_{k=1}^m f(k)$$</p>
<p>Let's first compute $f(m)$ (the last one). For a subarray $[i\ldots j]$ to contain $(i_m,j_m)$, it must be true that $1 \le i \le i_m$ and $j_m \le j \le N$. There are $i_m(N+1-j_m)$ such choices for such subarrays, therefore $f(m) = i_m(N+1-j_m)$.</p>
<p>Now, let's compute $f(k)$ for $1 \le k < m$. The subarray $[i\ldots j]$ should contain $(i_k,j_k)$, but we need to ensure that this is the rightmost bad pair, i.e. the subarray doesn't contain $(i_{k+1},j_{k+1})$. This means that $1 \le i \le i_k$ and $j_k \le j$. However, $j < j_{k+1}$ should be true, otherwise it will contain the pair $(i_{k+1},j_{k+1})$. Therefore, there are $i_k (j_{k+1} - j_k)$ choices for such subarrays, and thus we have $f(k) = i_k(j_{k+1} - j_k)$ :)</p>
<p>This step runs in $O(N)$ time :)</p>
<p>Here's an implementation in C++:</p>
<pre><code>#include <stdio.h>
#include <algorithm>
using namespace std;
#define ll long long
#define LIM 100111
int I[LIM];
int main() {
    for (int j = 0; j < LIM; j++) I[j] = -1;
    int n, k;
    scanf("%d%d", &n, &k);
    ll total = n*(n+1LL) >> 1;
    int pi = 0, pj = 1;
    // the cast to ll avoids int overflow in the product
    #define add_pair(ci,cj) do {\
        total += (ll)pi * (pj - (cj));\
        pi = ci;\
        pj = (cj);\
    } while (0)
    int l = 0;
    for (int j = 1; j <= n; j++) {
        int a;
        scanf("%d", &a);
        if (a == k) {
            l = j;
        } else if (a > k) {
            int i = max(I[a], l);
            if (pi < i) {
                add_pair(i, j);
            }
            a -= k;
            for (int v = 1; v * v <= a; v++) {
                if (a % v == 0) I[v] = I[a / v] = j;
            }
        }
    }
    add_pair(n, n+1);
    printf("%lld\n", total);
}
</code></pre>
<p>Some implementation details:</p>
<ul>
<li>We added a sentinel leftmost bad pair $(0,1)$ to make the <code>add_pair</code> macro simpler.</li>
<li>There is also a sentinel rightmost bad pair $(N,N+1)$ so that $f(m)$ (the last pair) is not special :)</li>
<li>There is no array $A$ allocated; instead the individual values of $A$ are inputted and processed on the fly.</li>
</ul>
<h1>Time Complexity:</h1>
<p>$O(N\sqrt{\max A_i})$ </p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p><a href="http://www.codechef.com/download/Solutions/AUG15/Setter/CHINSM.cpp">setter</a><br>
<a href="http://www.codechef.com/download/Solutions/AUG15/Tester/CHINSM.cpp">tester</a> </p>kevinsogoSun, 16 Aug 2015 13:07:14 +0530https://discuss.codechef.com/questions/73725/chinsm-editorialmediumad-hocaug15combinatoricsintervalseditorialdivisorswhat is the glitch in "way out" editorial ?https://discuss.codechef.com/questions/80483/what-is-the-glitch-in-way-out-editorial<p>I tried to solve <a href="https://www.codechef.com/problems/WOUT">this</a> problem but couldn't and it's <a href="https://discuss.codechef.com/questions/73722/wout-editorial">editorial</a> has just crossed my head.</p>
<p>Please explain this editorial to me.</p>
<p>I understood everything before this part but didn't get anything after it.</p>
<p>"The key to this is to notice that r<sub>i</sub> can be computed by a simple adjustment from r<sub>i−1</sub>! In other words, we can just calculate the difference r<sub>i</sub>−r<sub>i-1</sub>, and if we have already computed r<sub>i-1</sub>, then we can calculate r<sub>i</sub> by adding this difference. In more detail, let's try to compute r<sub>i</sub>−r<sub>i-1</sub>:"</p>arpit728Tue, 29 Mar 2016 06:55:52 +0530https://discuss.codechef.com/questions/80483/what-is-the-glitch-in-way-out-editorialprefix-sumsubarrayaug15dynamic-programmingWOUT - Editorialhttps://discuss.codechef.com/questions/73722/wout-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/AUG15/problems/WOUT">Contest</a><br>
<a href="http://www.codechef.com/AUG15/problems/WOUT">Practice</a> </p>
<p><strong>Author:</strong> <a href="http://www.codechef.com/users//witalij_hq">Vitalij Kozhukhivskij</a><br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/laycurse">Hiroto Sekido</a><br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/kevinsogo">Kevin Atienza</a></p>
<h1>DIFFICULTY:</h1>
<p>Easy</p>
<h1>PREREQUISITES:</h1>
<p>Dynamic programming, subarray sums, prefix sums, segment trees</p>
<h1>PROBLEM:</h1>
<p>There is a grid of $N\times N$ cells. The $N$ cells of each column are filled with soil, except for a contiguous range: from the $l_i$th cell to the $h_i$th cell (counting from the bottom, $0$-indexed). Cells can be cleared of soil.</p>
<p>We need to create a subgrid of length $N$ and height $H$ containing no soil. What is the minimum number of cells needed to be cleared of soil?</p>
<h1>QUICK EXPLANATION:</h1>
<p>There are only $N-H+1$ possible $N\times H$ subgrids. The answer is the minimum number of soil cells among all such subgrids. We can preprocess our grid so we can compute the sum of each subgrid in constant time.</p>
<p>To preprocess:</p>
<ul>
<li>We need to know the number of soil cells in each row (denoted by $r_i$ for $0 \le i < N$). $r_i$ is equal to the number of $j$s such that $0 \le j < N$ and $i < l_j$ or $i > h_j$. This is also equal to the following expression:
$$ N - \#\{j : 0 \le j < N, i \le h_j\} + \#\{j : 0 \le j < N, i < l_j\} $$
Thus the $r_i$s can be computed quickly by considering the $l_j$s and $h_j$s in sorted order.</li>
<li>We also need to know the number of soil cells in each prefix of rows, i.e. let $s_i$ be the sum $r_0 + r_1 + \cdots + r_{i-1}$.</li>
</ul>
<p>Then the number of soil cells in each $N\times H$ subgrid can be computed as $s_i - s_{i-H}$ for $H \le i \le N$.</p>
<h1>EXPLANATION:</h1>
<p>Clearly, we need to find the $N\times H$ subgrid with the minimum number of soil cells in it. There are only $N-H+1$ such subgrids: the following illustrates the case $N = 7$ and $H = 3$ (there are $N-H+1 = 5$ subgrids):</p>
<pre><code>.......          ####### ....... ....... ....... .......
.......          ####### ####### ....... ....... .......
.......          ####### ####### ####### ....... .......
....... -------> ....... ####### ####### ####### .......
.......          ....... ....... ####### ####### #######
.......          ....... ....... ....... ####### #######
.......          ....... ....... ....... ....... #######
</code></pre>
<p>We can simply construct the $N\times N$ grid, and compute the number of soil cells in these subgrids. Since there are $N-H+1$ subgrids and each one has $NH$ cells to check, this algorithm runs in $O((N-H+1)NH)$ time. The worst case is when $H$ is around half of $N$ (which makes the running time $O(N^3)$), so unfortunately this algorithm is only good for the first subtask. For the second subtask, you can't even store the whole grid due to the memory requirements!</p>
<p>To answer the second subtask, we need a way to sum up these subgrids without constructing the whole grid. The first thing we notice is that the only information we need from its row is the number of soil cells in it, i.e. we don't need to know their positions in the row. Let's say the $i$th row ($0 \le i < N$) contains $r_i$ soil cells. Then the number of soil cells in each of the $N-H+1$ subgrids are the following:</p>
<ul>
<li>$r_0 + r_1 + r_2 + \cdots + r_{H-1}$ </li>
<li>$r_1 + r_2 + r_3 + \cdots + r_H$ </li>
<li>$r_2 + r_3 + r_4 + \cdots + r_{H+1}$ </li>
<li>$r_3 + r_4 + r_5 + \cdots + r_{H+2}$ </li>
<li>$\ldots$</li>
<li>$r_{N-H} + r_{N-H+1} + r_{N-H+2} + \cdots + r_{N-1}$ </li>
</ul>
<p>The smallest of these is the answer! Thus, it would be very helpful if we can compute the sequence $r_0, r_1, \ldots, r_{N-1}$ quickly, without constructing the whole grid!</p>
<p>To do so, we use the following observation: $r_i$ is equal to the number of $j$s such that $0 \le j < N$ and $i < l_j$ or $i > h_j$. Now, counting all such $j$s this way is still not fast enough, so we do some manipulations first. For a statement $\phi(j)$, let $C_{\phi(j)}$ be the number of $j$s such that $0 \le j < N$ and $\phi(j)$ is true. To familiarize yourself with this notation, we give a few basic facts (we invite you to verify each one):</p>
<ul>
<li>$C_{\text{true}} = N$</li>
<li>$C_{\text{false}} = 0$</li>
<li>$C_{\phi(j)} + C_{\text{not }\phi(j)} = C_{\text{true}} = N$</li>
<li>$C_{\phi(j)} = C_{\text{true}} - C_{\text{not }\phi(j)} = N - C_{\text{not }\phi(j)}$</li>
<li>$C_{\phi_1(j) \text{ or } \phi_2(j)} = C_{\phi_1(j)} + C_{\phi_2(j)} - C_{\phi_1(j) \text{ and } \phi_2(j)}$</li>
<li>$C_{c < f(j)} + C_{c = f(j)} + C_{c > f(j)} = N$ (trichotomy)</li>
<li>$C_{c < f(j)} + C_{c = f(j)} = C_{c \le f(j)}$</li>
<li>If $f_1(j) \le f_2(j)$ for all $j$, then $C_{f_1(j) \le c \le f_2(j)} = C_{c \le f_2(j)} - C_{c < f_1(j)}$</li>
</ul>
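<p>These identities are easy to check numerically. A quick Python 3 sanity check on arbitrary made-up data (the helper <code>C</code> and the values are ours, purely for illustration):</p>

```python
import random

random.seed(1)
N = 50
f = [random.randint(0, 20) for _ in range(N)]  # arbitrary values f(j)
c = 10                                         # arbitrary constant

def C(pred):
    # number of j with 0 <= j < N for which the statement holds
    return sum(1 for j in range(N) if pred(j))

assert C(lambda j: True) == N and C(lambda j: False) == 0
# trichotomy
assert C(lambda j: c < f[j]) + C(lambda j: c == f[j]) + C(lambda j: c > f[j]) == N
assert C(lambda j: c < f[j]) + C(lambda j: c == f[j]) == C(lambda j: c <= f[j])
assert C(lambda j: c <= f[j]) == N - C(lambda j: c > f[j])
```
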
<p>Now, back to $r_i$. We have the following (using some of the facts above):</p>
<p>$$\begin{align*}
r_i
&= C_{i < l_j \text{ or } i > h_j} \\\
&= N - C_{\text{not } (i < l_j \text{ or } i > h_j)} \\\
&= N - C_{i \ge l_j \text{ and } i \le h_j} \\\
&= N - C_{l_j \le i \le h_j} \\\
&= N - (C_{i \le h_j} - C_{i < l_j})
\end{align*}$$
The last one is true because $l_j \le h_j$. Now, we have:
$$r_i = N - C_{i \le h_j} + C_{i < l_j}$$
To compute the $r_i$s, we just need to compute the quantity $- C_{i \le h_j} + C_{i < l_j}$ for all $0 \le i < N$.</p>
<p>The key to this is to notice that $r_i$ can be computed by a simple adjustment from $r_{i-1}$! In other words, we can just calculate the difference $r_i - r_{i-1}$, and if we have already computed $r_{i-1}$, then we can calculate $r_i$ by adding this difference. In more detail, let's try to compute $r_i - r_{i-1}$:
$$\begin{align*}
r_i - r_{i-1}
&= (N - C_{i \le h_j} + C_{i < l_j}) - (N - C_{i-1 \le h_j} + C_{i-1 < l_j}) \\\
&= N - C_{i \le h_j} + C_{i \le l_j-1} - N + C_{i \le h_j+1} - C_{i \le l_j} \\\
&= C_{i \le h_j+1} - C_{i \le h_j} + C_{i \le l_j-1} - C_{i \le l_j} \\\
&= (C_{i \le h_j+1} - C_{i \le h_j}) + (C_{i \le l_j-1} - C_{i \le l_j}) \\\
&= C_{i = h_j+1} - C_{i = l_j}
\end{align*}$$
But we can compute the values $C_{i = h_j+1}$ and $C_{i = l_j}$ for $0 \le i < N$ quickly, via a linear pass of all pairs $(l_j,h_j)$ for $0 \le j < N$! The following pseudocode does it:</p>
<pre><code># arrays are initialized with zeroes
Ch[0...N] # Ch[i] will contain C[i = h_j + 1]
Cl[0...N] # Cl[i] will contain C[i = l_j]
for j = 0...N-1:
    Ch[h_j + 1] += 1
    Cl[l_j] += 1
</code></pre>
<p>Now that all the $C_{i = h_j+1}$ and $C_{i = l_j}$ are computed, we can now compute all the $r_i$s using the following recurrence:
$$r_{-1} = N$$
$$r_i = r_{i-1} + C_{i = h_j+1} - C_{i = l_j}$$
The following pseudocode does it:</p>
<pre><code># array is initialized with zeroes
r[0...N]
curr = N
for i in 0...N-1:
    curr += Ch[i] - Cl[i]
    r[i] = curr
</code></pre>
<p>Clearly, these pseudocodes run in $O(N)$ time!</p>
<p>Finally, to compute the answer, we need to know the following sums:</p>
<ul>
<li>$r_0 + r_1 + r_2 + \cdots + r_{H-1}$ </li>
<li>$r_1 + r_2 + r_3 + \cdots + r_H$ </li>
<li>$r_2 + r_3 + r_4 + \cdots + r_{H+1}$ </li>
<li>$r_3 + r_4 + r_5 + \cdots + r_{H+2}$ </li>
<li>$\ldots$</li>
<li>$r_{N-H} + r_{N-H+1} + r_{N-H+2} + \cdots + r_{N-1}$ </li>
</ul>
<p>and then compute the minimum among them. But this is easy! Notice that $r_i + r_{i+1} + \cdots + r_j$ is simply $(r_0 + \cdots + r_j) - (r_0 + \cdots + r_{i-1})$, so we can first try computing the <strong>prefix sums</strong>. Let $s_i$ be the sum $r_0 + r_1 + \cdots + r_{i-1}$. Then $r_i + r_{i+1} + \cdots + r_j$ is simply $s_{j+1} - s_i$. The $s_i$s can be computed in $O(N)$ too, because $s_i = s_{i-1} + r_{i-1}$, with the base case $s_0 = 0$. Afterwards, the sums we need are simply:</p>
<ul>
<li>$s_H - s_0$</li>
<li>$s_{H+1} - s_1$</li>
<li>$s_{H+2} - s_2$</li>
<li>$\ldots$</li>
<li>$s_N - s_{N-H}$</li>
</ul>
<p>Since every step of this algorithm runs in $O(N)$, the answer can thus be computed in $O(N)$ time in total!</p>
<p>The following is a sample implementation in Python:</p>
<pre><code>for cas in xrange(input()):
    n, h = map(int, raw_input().strip().split())
    row = [0]*(n+2)
    for i in xrange(n):
        a, b = map(int, raw_input().strip().split())
        row[a+1] -= 1
        row[b+2] += 1
    row[0] = n
    for i in xrange(n): row[i+1] += row[i]
    for i in xrange(n): row[i+1] += row[i]
    print min(row[i] - row[i-h] for i in xrange(h,n+1))
</code></pre>
<p>A few things to notice about this implementation:</p>
<ul>
<li>The pairs $(l_j, h_j)$ are never stored in an array: they are obtained from the input on the fly, processed, and thrown away immediately.</li>
<li>Instead of having two arrays for $Ch[i]$ and $Cl[i]$, we only use a single array containing $Ch[i] - Cl[i]$.</li>
<li>We reuse the same array <code>row</code> to contain the values $Ch[i] - Cl[i]$, $r_i$ and $s_i$. Furthermore, $r_i$ is stored in index $i+1$ of <code>row</code>.</li>
</ul>
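<p>For reference, here is a Python 3 port of the Python 2 sample above, written as a standalone function (a sketch; rows are 0-indexed and <code>intervals[j]</code> is the pair $(l_j, h_j)$ for column $j$):</p>

```python
def min_soil(n, h, intervals):
    # One array is reused three times: first it holds the deltas
    # Ch[i] - Cl[i], then r_i (stored at index i+1), then prefix sums.
    row = [0] * (n + 2)
    for a, b in intervals:
        row[a + 1] -= 1          # a column with l_j = a decreases r_a
        row[b + 2] += 1          # a column with h_j = b increases r_{b+1}
    row[0] = n                   # base case r_{-1} = N
    for i in range(n):           # first pass: row[i+1] becomes r_i
        row[i + 1] += row[i]
    for i in range(n):           # second pass: row[i+1] becomes n + s_{i+1}
        row[i + 1] += row[i]
    return min(row[i] - row[i - h] for i in range(h, n + 1))
```
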
<h1>Time Complexity:</h1>
<p>$O(N)$ </p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p><a href="http://www.codechef.com/download/Solutions/AUG15/Setter/WOUT.java">setter</a><br>
<a href="http://www.codechef.com/download/Solutions/AUG15/Tester/WOUT.cpp">tester</a> </p>kevinsogoSun, 16 Aug 2015 13:06:17 +0530https://discuss.codechef.com/questions/73722/wout-editorialeditorialdynamic-programmingaug15easyprefix-sumsubarrayCLOWAY - Editorialhttps://discuss.codechef.com/questions/73733/cloway-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/AUG15/problems/CLOWAY">Contest</a> <br>
<a href="http://www.codechef.com/problems/CLOWAY">Practice</a> </p>
<p><strong>Author:</strong> <a href="https://www.codechef.com/users/kaizer">Shchetinin Kirill</a><br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/laycurse">Hiroto Sekido</a><br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/kevinsogo">Kevin Atienza</a></p>
<h1>DIFFICULTY:</h1>
<p>Super Hard</p>
<h1>PREREQUISITES:</h1>
<p>Spectral graph theory, fast Fourier transform, linear algebra</p>
<h1>PROBLEM:</h1>
<p>The problem can be restated in terms of <a href="https://en.wikipedia.org/wiki/Strong_product_of_graphs"><strong>strong products of graphs</strong></a>.</p>
<p>You are given $T$ undirected graphs, where the $k$th graph has $N_k$ vertices (with no loops and multiple edges). You are also given $Q$ queries. In each query, you are given three numbers $L_i$, $R_i$, $K_i$, and your task is to find out the number of closed paths of length at most $K_i$ in the strong product of graphs numbered $L_i , L_i + 1 , \ldots , R_i$, modulo $1000000007$.</p>
<p>The <strong>strong product of two graphs</strong> $G_1 = \langle V_1,E_1\rangle$ and $G_2 = \langle V_2,E_2\rangle$, denoted $G_1\times G_2$, is defined as the graph with the vertex set $V_1\times V_2$ (pairs of nodes, one from $G_1$ and another from $G_2$), and in which there is an edge $((u_1, u_2), (v_1, v_2))$ iff one of the following is true:</p>
<ul>
<li>$(u_1, v_1) \in E_1 \text{ and } (u_2, v_2) \in E_2$</li>
<li>$u_1 = v_1 \text{ and } (u_2, v_2) \in E_2$</li>
<li>$(u_1, v_1) \in E_1 \text{ and } u_2 = v_2$</li>
</ul>
<p>The strong product is commutative and associative up to <a href="https://en.wikipedia.org/wiki/Graph_isomorphism">isomorphism</a>. (Note that "$F\times G$" is not the standard notation for strong products, but since we're not dealing with any other type of graph products, we'll use it here.)</p>
<h1>QUICK EXPLANATION:</h1>
<ul>
<li>Process the queries offline: sort the queries in nondecreasing order of $L_i$, and then $R_i$.</li>
<li>For each of the $T$ graphs, find its characteristic polynomial.</li>
<li>Precalculate the traces of $(G_i + I)^k$ for $1 \le i \le T$ and $0 \le k \le 10^4$ (where $(G_i + I)^k$ is the adjacency matrix of $G_i$ plus the identity matrix, raised to the power of $k$).</li>
<li>Let $\text{ans}(l,r,k)$ be the answer for the query $(l,r,k)$. For a fixed $l$ and $r$, compute $\text{ans}(l,r,k)$ for all $k$ using a single polynomial multiplication (which can be done with fast Fourier transforms). This is due to the following equation (which is a <strong>convolution</strong>):
$$\text{ans}(l,r,k) = k! \sum_{i=0}^k \frac{P(l,r,i)}{i!} \frac{(-1)^{k-i}}{(k-i)!}$$
where $P(l,r,k)$ is the product
$$\text{tr}[(G_l + I)^k]\text{tr}[(G_{l+1} + I)^k]\cdots \text{tr}[(G_r + I)^k]$$</li>
</ul>
<h1>EXPLANATION:</h1>
<p>An obvious starting point in solving the problem is understanding how to find the number of closed paths in a graph. Clearly this is a minimum requirement to solve the problem. :)</p>
<p>Since we will be focusing on adjacency matrices a lot, for any given graph $G$ we also denote its adjacency matrix by $G$. Which one we're referring to is usually clear from the context. Thus, $G_{i,j}$ means $1$ if $i$ and $j$ are connected, and $0$ otherwise.</p>
<h1>Counting closed paths</h1>
<p>Let's focus on some graph $G\langle V,E \rangle$. Suppose we want to compute the number of closed paths of length $k$. Let's generalize the problem a bit: Let's compute, for all pairs of nodes $(i,j)$, the number of paths from $i$ to $j$ of length $k$, which we'll denote by $P(k) _ {i,j}$. The number of closed paths is simply $\displaystyle \sum _ {i\in V} P(k) _ {i,i}$.</p>
<p>Let's start with the base case: $k = 0$. In this case, we trivially have $P(0) _ {i,i} = 1$ (namely the empty path), and $P(0) _ {i,j} = 0$ if $i \not= j$.</p>
<p>Okay, now let's assume $k > 0$. Since the last node is $j$, the node before that could have been any node $t$ that is a neighbor of $j$. Thus, we have the following recurrence:
$$P(k) _ {i,j} = \sum _ {\text{$t$ is a neighbor of $j$}} P(k-1) _ {i,t}$$
But we can rewrite the above recurrence in another way, by considering the adjacency matrix of $G$. We have:
$$P(k) _ {i,j} = \sum _ {t\in V} P(k-1) _ {i,t}G _ {t,j}$$
If we treat each $P(k)$ as a matrix (of dimension $|V|$), then in fact the above base case and recurrence is the same as the following:
$$P(0) = I_{|V|}$$
$$P(k) = P(k-1)\cdot G$$
This means that $P(k)$ is simply $G^k$ :-)</p>
<p>Now back to the original problem of this section. We are looking for:
$$\sum _ {i\in V} P(k) _ {i,i} = \sum _ {i\in V} (G^k) _ {i,i}$$
This is simply the sum of the diagonal entries of the matrix $P(k) = G^k$. The sum of the diagonal elements of a matrix $M$ is called the <strong>trace</strong> of $M$, denoted $\text{tr}[M]$. </p>
<p>The conclusion is that the number of closed paths of length $k$ in a graph with adjacency matrix $G$ is simply $\text{tr}[G^k]$ :)</p>
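<p>This fact is easy to verify on a small graph. The following Python 3 sketch (helper names are ours) compares $\text{tr}[G^k]$, computed by naive matrix powering, against a direct enumeration of closed walks:</p>

```python
from itertools import product

def trace_power(G, k):
    # tr(G^k) by naive matrix powering
    n = len(G)
    P = [[int(i == j) for j in range(n)] for i in range(n)]  # G^0 = I
    for _ in range(k):
        P = [[sum(P[i][t] * G[t][j] for t in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(P[i][i] for i in range(n))

def closed_walks_brute(G, k):
    # directly enumerate closed walks v_0 -> v_1 -> ... -> v_{k-1} -> v_0
    n = len(G)
    return sum(
        all(G[p[i]][p[(i + 1) % k]] for i in range(k))
        for p in product(range(n), repeat=k)
    )

K3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # the triangle
for k in range(1, 6):
    assert trace_power(K3, k) == closed_walks_brute(K3, k)
```

<p>For the triangle, $\text{tr}[G^2] = \text{tr}[G^3] = 6$, matching the six out-and-back walks and the six directed traversals of the triangle.</p>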
<h1>Spectrum, and trace of powers</h1>
<p>With this in mind, let's understand a bit of what $\text{tr}[A]$ means (aside from being the sum of the diagonal entries of $A$). We use a bit of linear algebra. Remember that multiplying by a matrix is equivalent to applying a linear transformation.</p>
<p>A nonzero (column) vector $v$ is called an <strong>eigenvector</strong> of $A$ if multiplying $A$ by $v$ doesn't change the direction of $v$, i.e. there exists a scalar $\lambda$ such that $Av = \lambda v$. In this case, the value $\lambda$ is called the <strong>eigenvalue</strong> associated with the eigenvector $v$.</p>
<p>Let's try to compute the eigenvalues of some $N\times N$ matrix $A$. Notice that an eigenvector $v$ satisfies (for some $\lambda$):
$$\begin{align*}
Av &= \lambda v \\\
\lambda v - Av &= 0 \\\
(\lambda I - A)v &= 0
\end{align*}$$
where $I$ is an identity matrix with the same dimensions as $A$. It is a basic result that the equation $Mv = 0$ has nonzero solutions $v$ if and only if the <a href="https://en.wikipedia.org/wiki/Determinant">determinant</a> of $M$ is zero. Therefore, a number $\lambda$ is an eigenvalue of $A$ if and only if it is a solution to the following equation:
$$\text{det}(\lambda I - A) = 0$$
But notice that the left hand side is actually a polynomial in $\lambda$ of degree $N\,$! (It's called the <strong>characteristic polynomial</strong> of $A$) Therefore, the eigenvalues of a matrix $A$ are simply the roots of the characteristic polynomial of $A$, and $A$ has exactly $N$ eigenvalues (counting multiplicity). The multiset of eigenvalues of $A$ is called the <strong>spectrum</strong> of $A$, denoted $\text{sp}(A)$.</p>
<p>Consider the characteristic polynomial $\text{det}(\lambda I - A)$. It's easy to see that the coefficient of $\lambda^N$ is one (i.e. the polynomial is monic). But what about the coefficient of $\lambda^{N-1}$? Of course, this is the negative of the sum of the roots of the polynomial (i.e. the eigenvalues), but what is it in terms of the entries of $A$? If you try to expand $\text{det}(\lambda I - A)$ (e.g. for small $N$s), you can see that it is equal to the negative of the sum of the diagonal entries of $A$ (i.e. the trace of $A$)! Therefore, we have the following fact:</p>
<p><strong>The trace of a matrix is equal to the sum of its eigenvalues.</strong></p>
<p>This is helpful, because the eigenvalues have a few nice properties that will help us solve the problem. For example, if $\lambda$ is an eigenvalue of $A$, then $\lambda^k$ is an eigenvalue of $A^k$ (why?). So the following facts follow:</p>
<ul>
<li>The spectrum of $A^k$ is $\{\lambda^k : \lambda \in \text{sp}(A)\}$.</li>
<li>$\text{tr}[A^k] = \displaystyle\sum_{\alpha \in \text{sp}(A)} \alpha^k$</li>
</ul>
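<p>Both facts can be checked against a graph whose spectrum is known exactly. The $4$-cycle has spectrum $\{2, 0, 0, -2\}$, and indeed $\text{tr}[A^k]$ matches the $k$th power sum of the eigenvalues (a quick Python 3 check):</p>

```python
# The 4-cycle has known spectrum {2, 0, 0, -2}; we check that
# tr[A^k] equals the k-th power sum of the eigenvalues.
C4 = [[0, 1, 0, 1],
      [1, 0, 1, 0],
      [0, 1, 0, 1],
      [1, 0, 1, 0]]
spectrum = [2, 0, 0, -2]
n = len(C4)
P = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
for k in range(7):
    assert sum(P[i][i] for i in range(n)) == sum(a ** k for a in spectrum)
    P = [[sum(P[i][t] * C4[t][j] for t in range(n)) for j in range(n)]
         for i in range(n)]  # advance to A^(k+1)
```
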
<h1>Closed paths in strong products</h1>
<p>Let us now turn to finding the number of closed paths of length $k$ in some graph $F\times G$ which is the strong product of two graphs $F$ and $G$.</p>
<p>It is clear that we want to compute the trace of $(F\times G)^k$, (where $F\times G$ also denotes the adjacency matrix). In this case, we can show the following fact:</p>
<hr>
<p><strong>The spectrum of $F\times G$ is equal to $\{(\alpha + 1)(\beta + 1) - 1 : \alpha \in \text{sp}(F), \beta \in \text{sp}(G)\}$</strong></p>
<hr>
<p><em>Proof</em>:</p>
<p>The rough idea is to construct an eigenvector of $F\times G$ for the eigenvalue $(\alpha + 1)(\beta + 1) - 1$, where $\alpha$ and $\beta$ are eigenvalues of $F$ and $G$. Notice that this value is equivalent to $\alpha\beta + \alpha + \beta$.</p>
<p>Let $v$ and $w$ be eigenvectors corresponding to the eigenvalues $\alpha$ and $\beta$ in $F$ and $G$, respectively. Then we have the following:
$$Fv = \alpha v$$
$$Gw = \beta w$$
The first is equivalent to (for every node $j$ of $F$):
$$\sum _ {\text{$i$ ~ $j$}} v _ i = \alpha v _ j$$
The second is equivalent to (for every node $j$ of $G$):
$$\sum _ {\text{$i$ ~ $j$}} w _ i = \beta w _ j$$
The eigenvector we are going to construct for the eigenvalue $\alpha\beta + \alpha + \beta$ is <a href="https://en.wikipedia.org/wiki/Tensor_product">$(v \otimes w)$</a>, or
$(v \otimes w) _ {i _ 1,i _ 2} = v _ {i _ 1}w _ {i _ 2}$
In other words, this vector consists of the products of all pairs of entries in $v$ and $w$.</p>
<p>Let us now check that this vector is indeed the eigenvector we are looking for. For a fixed node $(j_1,j_2)$ of $F\times G$:
$$\begin{align*}
\sum _ {\text{$(i _ 1,i _ 2)$ ~ $(j _ 1,j _ 2)$}} (v \otimes w) _ {i _ 1,i _ 2}
&= \sum _ {\text{$(i _ 1,i _ 2)$ ~ $(j _ 1,j _ 2)$}} v _ {i _ 1}w _ {i _ 2} \\\
&= \sum _ {\substack{\text{$i _ 1$ ~ $j _ 1$} \\\ \text{$i _ 2$ ~ $j _ 2$}}} v _ {i _ 1}w _ {i _ 2}
+ \sum _ {\text{$i _ 1$ ~ $j _ 1$}} v _ {i _ 1}w _ {j _ 2}
+ \sum _ {\text{$i _ 2$ ~ $j _ 2$}} v _ {j _ 1}w _ {i _ 2} \\\
&= \sum _ {\text{$i _ 1$ ~ $j _ 1$}} v _ {i _ 1} \sum _ {\text{$i _ 2$ ~ $j _ 2$}} w _ {i _ 2}
+ w _ {j _ 2} \sum _ {\text{$i _ 1$ ~ $j _ 1$}} v _ {i _ 1}
+ v _ {j _ 1} \sum _ {\text{$i _ 2$ ~ $j _ 2$}} w _ {i _ 2}
\end{align*}$$
The next step uses the fact that $v$ and $w$ are eigenvectors of their corresponding graphs:
$$\begin{align*}
\sum _ {\text{$(i _ 1,i _ 2)$ ~ $(j _ 1,j _ 2)$}} (v \otimes w) _ {i _ 1,i _ 2}
&= \alpha v _ {j _ 1} \beta w _ {j _ 2}
+ w _ {j _ 2} \alpha v _ {j _ 1}
+ v _ {j _ 1} \beta w _ {j _ 2} \\\
&= (\alpha \beta + \alpha + \beta) v _ {j _ 1} w _ {j _ 2} \\\
&= (\alpha \beta + \alpha + \beta) (v \otimes w) _ {j _ 1, j _ 2}
\end{align*}$$
This shows that $(v \otimes w)$ is an eigenvector corresponding to the eigenvalue $(\alpha \beta + \alpha + \beta)$.</p>
<hr>
<p>As a corollary, the trace of $(F\times G)^k$ is the following:</p>
<p>$$\text{tr}[(F\times G)^k] = \sum_{\alpha \in \text{sp}(F)} \sum_{\beta \in \text{sp}(G)} ((\alpha + 1)(\beta + 1) - 1)^k$$</p>
<p>When we expand this (using the binomial theorem), we get the following:
$$\begin{align*}
\text{tr}[(F\times G)^k]
&= \sum_{\alpha \in \text{sp}(F)} \sum_{\beta \in \text{sp}(G)} ((\alpha + 1)(\beta + 1) - 1)^k \\\
&= \sum_{\alpha \in \text{sp}(F)} \sum_{\beta \in \text{sp}(G)} \sum_{i=0}^k {k \choose i} (\alpha + 1)^i(\beta + 1)^i (-1)^{k-i} \\\
&= \sum_{i=0}^k {k \choose i} (-1)^{k-i} \sum_{\alpha \in \text{sp}(F)} (\alpha + 1)^i \sum_{\beta \in \text{sp}(G)} (\beta + 1)^i
\end{align*}$$
We will now use the fact that <strong>the spectrum of the matrix $A + cI$ is $\{\alpha + c : \alpha \in \text{sp}(A)\}$</strong> (why?):</p>
<p>$$\text{tr}[(F\times G)^k] = \sum_{i=0}^k {k \choose i} (-1)^{k-i} \text{tr}[(F + I)^i] \text{tr}[(G + I)^i]$$</p>
<p>This equation is very nice! (as we will see below)</p>
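<p>Before building on this identity, it's worth verifying it numerically. The Python 3 sketch below (all helper names are ours) builds the strong product of $P_2$ (a single edge) and $K_3$ explicitly and checks the identity for small $k$:</p>

```python
from math import comb

def mat_pow_trace(M, k):
    # tr(M^k) by naive matrix powering
    n = len(M)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = [[sum(P[i][t] * M[t][j] for t in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(P[i][i] for i in range(n))

def strong_product(F, G):
    # vertex (u1, u2) maps to index u1 * len(G) + u2; there is an edge
    # iff each coordinate is equal or adjacent, and the endpoints differ
    nf, ng = len(F), len(G)
    S = [[0] * (nf * ng) for _ in range(nf * ng)]
    for u1 in range(nf):
        for u2 in range(ng):
            for v1 in range(nf):
                for v2 in range(ng):
                    if (u1, u2) != (v1, v2) and \
                       (u1 == v1 or F[u1][v1]) and (u2 == v2 or G[u2][v2]):
                        S[u1 * ng + u2][v1 * ng + v2] = 1
    return S

def plus_I(M):
    return [[M[i][j] + int(i == j) for j in range(len(M))]
            for i in range(len(M))]

F = [[0, 1], [1, 0]]                      # a single edge
G = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]     # the triangle K3
S = strong_product(F, G)
for k in range(5):
    lhs = mat_pow_trace(S, k)
    rhs = sum(comb(k, i) * (-1) ** (k - i)
              * mat_pow_trace(plus_I(F), i) * mat_pow_trace(plus_I(G), i)
              for i in range(k + 1))
    assert lhs == rhs
```

<p>Under the hood this works because $S + I = (F+I) \otimes (G+I)$ as matrices, and the trace of a Kronecker product is the product of traces.</p>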
<h1>Towards a solution</h1>
<p>Now, let's try answering some query: given $(l,r,k)$, what is the number of closed paths of length $k$ in the graph $G_l\times G_{l+1}\times \ldots\times G_r$? Let's denote this value by $\text{ans}(l,r,k)$. In fact, we will be more ambitious and we'll try to solve the problem for all triples $(l,r,k)$ :) (a.k.a. the dynamic programming mindset) So, let $G(l,r)$ be the graph $G_l\times \ldots\times G_r$. So naturally, we have the following:
$$\text{ans}(l,r,k) = \text{tr}[G(l,r)^k]$$</p>
<p>But remember that $G(l,r) = G_l \times G_{l+1} \times \cdots \times G_r$, so we will need an extension of the equation above about $\text{tr}[(F\times G)^k]$. It's not hard to extend it into the following (verify):
$$\text{tr}[(G_1 \times \cdots \times G_t)^k] = \sum_{i=0}^k {k \choose i} (-1)^{k-i} \text{tr}[(G_1 + I)^i] \cdots \text{tr}[(G_t + I)^i]$$</p>
<p>So let's define $P(l,r,k)$ to be the product $\text{tr}[(G_l + I)^k]\text{tr}[(G_{l+1} + I)^k]\cdots \text{tr}[(G_r + I)^k]$. The values of $P(l,r,k)$ can be computed for all triples $(l,r,k)$ in $O(T^2K)$ time, assuming we have the values $\text{tr}[(G_i + I)^k]$ for all $1 \le i \le T$ and $k \le K$ (which we'll deal with later).</p>
<p>Then expanding $\text{tr}[G(l,r)^k]$, we get:
$$\begin{align*}
\text{ans}(l,r,k)
&= \text{tr}[G(l,r)^k] \\\
&= \text{tr}[(G_l \times \cdots \times G_r)^k] \\\
&= \sum_{i=0}^k {k \choose i} (-1)^{k-i} \text{tr}[(G_l + I)^i]\text{tr}[(G_{l+1} + I)^i]\cdots \text{tr}[(G_r + I)^i] \\\
&= \sum_{i=0}^k {k \choose i} (-1)^{k-i} P(l,r,i)
\end{align*}$$
This gives us a nice little formula to compute $\text{ans}(l,r,k)$ for all $k$ in terms of $P(l,r,k)$ for all $k$!</p>
<p>Unfortunately, implementing this as it is will most likely time out, because there are $O(T^2K)$ such pairs, and each sum takes $O(K)$ time to compute, so the overall complexity is $O(T^2K^2)$. Luckily, we can exploit the beautiful structure of the above equation to form a faster algorithm. The first thing to do is to expand ${k \choose i}$ as $\frac{k!}{i!(k-i)!}$.
$$\begin{align*}
\text{ans}(l,r,k) &= \sum_{i=0}^k {k \choose i} (-1)^{k-i} P(l,r,i) \\\
\text{ans}(l,r,k) &= \sum_{i=0}^k \frac{k!}{i!(k-i)!} (-1)^{k-i} P(l,r,i) \\\
\text{ans}(l,r,k) &= k! \sum_{i=0}^k \frac{1}{i!(k-i)!} (-1)^{k-i} P(l,r,i) \\\
\text{ans}(l,r,k) &= k! \sum_{i=0}^k \frac{P(l,r,i)}{i!} \frac{(-1)^{k-i}}{(k-i)!}
\end{align*}$$
Notice that this is now in the form of a <a href="https://en.wikipedia.org/wiki/Convolution">convolution</a> of two sequences. Therefore, it seems feasible to use fast polynomial multiplication to compute these numbers quickly.</p>
<p>Indeed, let's fix the values $l$ and $r$ (with the intention of calculating $\text{ans}(l,r,k)$ for all $k \le K$). Also, let:
$$f(k) = \frac{P(l,r,k)}{k!}$$
$$g(k) = \frac{(-1)^k}{k!}$$
Then the above equality is the same as:
$$\text{ans}(l,r,k) = k! \sum_{i=0}^k f(i) g(k-i)$$
Consider two polynomials:
$$F(x) = \sum_{i=0}^K f(i) x^i$$
$$G(x) = \sum_{i=0}^K g(i) x^i$$
Then we have the remarkable property that <strong>$\text{ans}(l,r,k)$ is $k!$ times the coefficient of $x^k$ in the polynomial product $F(x)G(x)$</strong>! This allows us to calculate all $\text{ans}(l,r,k)$ for all $k$ using a single polynomial multiplication!</p>
<p>Multiplying two polynomials of degree $K$ can be done in $O(K^{1.586})$ time using Karatsuba's algorithm, or in $O(K \log K)$ time using fast Fourier transforms. (These are really standard topics and there are lots of resources online about these. For the current problem though, you will need to implement this really efficiently because the time limit is rather tight.)</p>
<p>Since we need to do $O(T^2)$ multiplications, the overall running time of this step is $O(T^2 K \log K)$.</p>
<p>The remaining thing to do is to compute $\text{tr}[(G_i + I)^k]$ for all $1 \le i \le T$ and $0 \le k \le K$, and for that, we will need a few additional tricks.</p>
<h1>Computing all traces $\text{tr}[(G_i + I)^k]$</h1>
<p>We will just focus on a single graph, say $G$, because we can just do the same procedure once for each of the $T$ graphs. Our goal is to compute $\text{tr}[I], \text{tr}[H], \text{tr}[H^2], \text{tr}[H^3], \ldots, \text{tr}[H^K]$, where $H := G + I$. Note that these values can be computed in $O(N^3K)$ time (where $N = |G|$), by simply calculating all powers of $H$ and computing the traces individually. However, this is too slow for our purposes, so we need to find a different way.</p>
<p>The key idea is to use the <strong>Cayley-Hamilton theorem</strong>:</p>
<p><strong>Let $c_0 + c_1 x + c_2 x^2 + \cdots + x^N$ be the characteristic polynomial of a matrix $H$. Then $c_0 I + c_1 H + c_2 H^2 + \ldots + H^N = 0$.</strong></p>
<p>(note that the $0$ in the right hand side is actually a zero matrix, not the number zero)</p>
<p>Thus, for our given matrix $H$, we have the following (multiplying both sides by $H^{k-N}$, for any $k \ge N$):
$$\begin{align*}
c_0 I + c_1 H + c_2 H^2 + \ldots + H^N &= 0 \\\
c_0 H^{k-N} + c_1 H^{k-N+1} + c_2 H^{k-N+2} + \ldots + H^k &= 0 \\\
\text{tr}[c_0 H^{k-N} + c_1 H^{k-N+1} + c_2 H^{k-N+2} + \ldots + H^k] &= 0
\end{align*}$$
we can continue manipulating this by using the fact that the <strong>trace is a <a href="https://en.wikipedia.org/wiki/Linear_map">linear map</a></strong> (why?). Therefore,
$$c_0 \text{tr}[H^{k-N}] + c_1 \text{tr}[H^{k-N+1}] + c_2 \text{tr}[H^{k-N+2}] + \ldots + \text{tr}[H^k] = 0$$
Therefore, for any $N \le k \le K$, we can compute $\text{tr}[H^k]$ in $O(N)$ time assuming we have already computed the previous values! Thus, we can compute most of the values $\text{tr}[H^k]$ in $O(NK)$ time, which is significantly faster.</p>
<p>But to do so, we have to do the following two things:</p>
<ul>
<li>Compute $\text{tr}[I], \text{tr}[H], \text{tr}[H^2], \ldots, \text{tr}[H^N]$ (these are the base cases of the recurrence).</li>
<li>Compute the characteristic polynomial of $H$.</li>
</ul>
<p>The first one can just be done naïvely, i.e. compute the powers of $H$ and compute the traces individually, and it runs in $O(N^3\cdot N) = O(N^4)$ time which is acceptable, so the only remaining thing to do is to compute the characteristic polynomial of $H$.</p>
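<p>Assuming the characteristic polynomial is known, the whole scheme can be sketched in a few lines of Python 3 (over the integers here rather than modulo $10^9+7$; the function names are ours):</p>

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def traces(H, coeffs, K):
    # coeffs = [c_0, ..., c_{N-1}] of the monic characteristic polynomial.
    # Returns [tr I, tr H, ..., tr H^K].
    n = len(H)
    tr, P = [], [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(n):                      # base cases: tr[H^0..H^{N-1}]
        tr.append(sum(P[i][i] for i in range(n)))
        P = mat_mul(P, H)
    for k in range(n, K + 1):               # Cayley-Hamilton recurrence
        tr.append(-sum(coeffs[i] * tr[k - n + i] for i in range(n)))
    return tr

# H = [[1, 1], [1, 0]] has characteristic polynomial x^2 - x - 1,
# so tr[H^k] satisfies the Fibonacci-style recurrence and gives
# the Lucas numbers: 2, 1, 3, 4, 7, 11, ...
tr = traces([[1, 1], [1, 0]], [-1, -1], 10)
assert tr == [2, 1, 3, 4, 7, 11, 18, 29, 47, 76, 123]
```
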
<h1>Finding the characteristic polynomial</h1>
<p>We want to compute the characteristic polynomial of $H$. By definition, it is $\text{det}(xI - H)$, which is a monic polynomial of degree $N$ whose roots are the members of $\text{sp}(H)$. Suppose the eigenvalues of $H$ are $\lambda_1, \lambda_2, \ldots, \lambda_N$. Then its characteristic polynomial is $(x - \lambda_1) (x - \lambda_2) \cdots (x - \lambda_N)$.</p>
<p>For instance, let's assume $N = 3$, and let's see what the characteristic polynomial looks like:
$$(x - \lambda_1) (x - \lambda_2) (x - \lambda_3) = x^3 - (\lambda_1 + \lambda_2 + \lambda_3)x^2 + (\lambda_1\lambda_2 + \lambda_1\lambda_3 + \lambda_2\lambda_3)x - \lambda_1\lambda_2\lambda_3$$</p>
<p>The expressions $(x_1 + x_2 + x_3)$, $(x_1x_2 + x_1x_3 + x_2x_3)$ and $(x_1x_2x_3)$ are called the <strong>elementary symmetric polynomials</strong> on variables $x_1$, $x_2$ and $x_3$, and these polynomials have a lot of other useful relations with a polynomial's coefficients and roots.</p>
<p>The <a href="https://en.wikipedia.org/wiki/Elementary_symmetric_polynomial">elementary symmetric polynomial</a> of order $k$ in variables $x_1, \ldots, x_N$ is denoted by $e_k(x_1, \ldots, x_N)$, and is defined as the sum of all products of all size-$k$ subsets of $\{x_1, \ldots, x_N\}$. For a given monic polynomial $p$, the coefficients of $p$ are the results of substituting the roots of $p$ in the elementary symmetric polynomials, so we have the following clear relationship between $p$'s coefficients and roots (also known as <a href="https://en.wikipedia.org/wiki/Vieta%27s_formulas">Vieta's formulas</a>):
$$e_1(\lambda_1, \ldots, \lambda_N) = -c_{N-1}$$
$$e_2(\lambda_1, \ldots, \lambda_N) = +c_{N-2}$$
$$e_3(\lambda_1, \ldots, \lambda_N) = -c_{N-3}$$
$$e_4(\lambda_1, \ldots, \lambda_N) = +c_{N-4}$$
$$\ldots$$
$$e_N(\lambda_1, \ldots, \lambda_N) = (-1)^Nc_0$$</p>
<p>Thus, we wish to compute these values $e_i(\lambda_1, \ldots, \lambda_N)$.</p>
<p>The key insight is to use the values $\text{tr}[I], \text{tr}[H], \text{tr}[H^2], \ldots, \text{tr}[H^N]$ (which we have already computed earlier). Remember that $\text{tr}[H^k]$ is just the sum of the $k$th powers of the eigenvalues of $H$, and there are relationships between elementary symmetric polynomials and power sums! They're known as <a href="https://en.wikipedia.org/wiki/Newton%27s_identities">Newton's identities</a> and can be stated as:
$$ke_k(x_1, \ldots x_N) = \sum_{i=1}^k (-1)^{i-1} e_{k-i}(x_1, \ldots, x_N)p_i(x_1, \ldots, x_N)$$
for all $k$. Here, $p_i(x_1, \ldots, x_N)$ is defined as $x_1^i + x_2^i + \cdots + x_N^i$. Substituting the $\lambda_i$s, we have $p_i(\lambda_1, \ldots, \lambda_N) = \text{tr}[H^i]$.</p>
<p>This allows us to compute the $e_k(\ldots)$s recursively, in $O(N^2)$ time!</p>
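<p>A Python 3 sketch of this recursion, checked against a polynomial with known roots (exact rationals here; the actual solution performs the division by $k$ with a modular inverse):</p>

```python
from fractions import Fraction

def elementary_from_power_sums(p, N):
    # p[i] = p_i = sum of i-th powers of the roots, for i = 0..N.
    # Newton's identities: k e_k = sum_{i=1}^{k} (-1)^(i-1) e_{k-i} p_i.
    e = [Fraction(1)]
    for k in range(1, N + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * p[i] for i in range(1, k + 1))
        e.append(s / k)
    return e

# roots 1, 2, 3: e_1 = 6, e_2 = 11, e_3 = 6, so the monic polynomial
# is x^3 - 6x^2 + 11x - 6
roots = [1, 2, 3]
p = [sum(r ** i for r in roots) for i in range(4)]
e = elementary_from_power_sums(p, 3)
assert e == [1, 6, 11, 6]
```
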
<h1>Optimizations</h1>
<p>Computing the necessary traces for each graph requires $O(N^4 + NK)$ time, thus doing it for all $T$ graphs requires $O(TN^4 + TNK)$ time. The second step (of using FFT to compute the $\text{ans}(l,r,k)$s) runs in $O(T^2K \log K)$. Finally, answering $Q$ queries simply require $Q$ lookups in the table for $\text{ans}$, so it takes $O(Q)$ time. Therefore, the overall running time is $O(TN^4 + TNK + T^2K\log K + Q)$, and the memory complexity is $O(T^2K)$ (due to storing the table $\text{ans}$).</p>
<p>However, we can actually improve the memory complexity by using the fact that <strong>the queries can be processed offline</strong>: we can process the queries by increasing $L_i$, and then by increasing $R_i$, so that we only store values of $\text{ans}(l,r,k)$ for two pairs $(l,r)$ values: $(l,r)$ and $(l,r-1)$. This reduces our memory requirements to just $O(TK)$ (remember that we still need to store the $O(TK)$ traces of the graphs).</p>
<h1>Time Complexity:</h1>
<p>$O(TN^4 + TNK + T^2K\log K + Q)$<br>
where $N = \max N_i$ and $K = \max K_i$</p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p><a href="http://www.codechef.com/download/Solutions/AUG15/Setter/CLOWAY.cpp">setter</a><br>
<a href="http://www.codechef.com/download/Solutions/AUG15/Tester/CLOWAY.cpp">tester</a> </p>kevinsogoMon, 17 Aug 2015 02:17:49 +0530https://discuss.codechef.com/questions/73733/cloway-editorialspectrallinearalgebragraphaug15super-hardeditorialfourierGRGUY - Editorialhttps://discuss.codechef.com/questions/73721/grguy-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/AUG15/problems/GRGUY">Contest</a><br>
<a href="http://www.codechef.com/problems/GRGUY">Practice</a> </p>
<p><strong>Author:</strong> <a href="http://www.codechef.com/users/ma5termind">Sunny Agarwal</a><br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/laycurse">Hiroto Sekido</a><br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/kevinsogo">Kevin Atienza</a></p>
<h1>DIFFICULTY:</h1>
<p>Simple</p>
<h1>PREREQUISITES:</h1>
<p>Greedy, dynamic programming</p>
<h1>PROBLEM:</h1>
<p>There are two lanes $L_1$ and $L_2$, each containing $N$ blocks, and $L_1$ is on top of $L_2$. Chef starts running from the beginning of (any) one of the lanes and must reach the end of any lane. To do so, Chef can use the following jumps (assuming he's currently at the $x$th block of some lane):</p>
<ul>
<li>Go to the $(x+1)$th block of the same lane.</li>
<li>Switch gravity and go to the $x$th block of the other lane.</li>
<li>Switch gravity and go to the $(x+1)$th block of the other lane.</li>
</ul>
<p>Some of the blocks are dirty, which means Chef cannot step over them.</p>
<p>You need to know whether Chef can reach the end and if so, what is the minimum number of <strong>gravity switches</strong> necessary.</p>
<h1>QUICK EXPLANATION:</h1>
<p>The second kind of jump is not helpful, so we only use the first and third. It's also not helpful to switch lanes unless there is an obstacle to avoid. With this in mind, there is one obvious optimal path that we should consider (i.e. switch lanes only if necessary):</p>
<ul>
<li>There is no path to the end if and only if there is a column where both lanes contain dirty blocks.</li>
<li>If there is a lane containing no dirty blocks, then the minimum number of gravity switches is $0$.</li>
<li>Otherwise, start at the lane whose first dirty block appears later. Only switch lanes if the next block in the current lane is dirty. The number of steps taken is the answer.</li>
</ul>
<p>There's also a dynamic programming approach: let $D_1(x)$ and $D_2(x)$ be the fastest way to reach the $x$th block of lanes $L_1$ and $L_2$, respectively. Then:</p>
<ul>
<li>The answer is $\min(D_1(N), D_2(N))$ </li>
<li>We have the recurrences:
$$D_1(x) = \begin{cases}
\infty & \text{if $L_1(x)$ is dirty} \\\
\min(D_1(x-1), D_2(x-1)+1) & \text{otherwise}
\end{cases}$$
$$D_2(x) = \begin{cases}
\infty & \text{if $L_2(x)$ is dirty} \\\
\min(D_2(x-1), D_1(x-1)+1) & \text{otherwise}
\end{cases}$$</li>
</ul>
<p>We define $D_1(0) = D_2(0) = 0$ as base cases.</p>
<p>By making two arrays $D_1$ and $D_2$ of length $N$, we can compute these values by increasing $x$ in $O(N)$ time.</p>
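<p>The recurrences above translate directly into a few lines of Python 3 (a sketch; we represent each lane as a string of <code>'.'</code> for clean and <code>'#'</code> for dirty blocks, which may differ from the actual input format):</p>

```python
INF = float('inf')

def min_switches(lane1, lane2):
    # Lanes as strings: '.' = clean, '#' = dirty (this encoding is ours).
    # Returns the minimum number of gravity switches, or -1 if unreachable.
    d1, d2 = 0, 0                      # base cases D_1(0) = D_2(0) = 0
    for c1, c2 in zip(lane1, lane2):
        nd1 = INF if c1 == '#' else min(d1, d2 + 1)
        nd2 = INF if c2 == '#' else min(d2, d1 + 1)
        d1, d2 = nd1, nd2
    best = min(d1, d2)
    return -1 if best == INF else best
```

<p>Only the two previous values are kept, so this uses $O(1)$ extra memory instead of two full arrays.</p>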
<h1>EXPLANATION:</h1>
<p>Let's give some names to the different kinds of jumps.</p>
<ul>
<li><strong>Forward</strong>: Go to the $(x+1)$th block of the same lane.</li>
<li><strong>Quick switch</strong>: Switch gravity and go to the $x$th block of the other lane.</li>
<li><strong>Slow switch</strong>: Switch gravity and go to the $(x+1)$th block of the other lane.</li>
</ul>
<p>The first thing to notice is that "quick switch" is not helpful. Let's see why this is so, by considering the various possible jumps immediately after doing a quick switch. Let's assume that Chef is at position $L_1(x)$ (the case $L_2(x)$ is essentially the same).</p>
<ul>
<li>If the next jump is a "forward", then you went from $L_1(x)$ to $L_2(x+1)$ with one gravity switch. But this is just like doing a "slow switch"! If we do a single "slow switch" instead, the number of gravity switches stays the same, and we skipped passing through the block $L_2(x)$ (which is actually better in case $L_2(x)$ is dirty).</li>
<li>If the next jump is a "slow switch", then you went from $L_1(x)$ to $L_1(x+1)$ with two gravity switches. But we can instead do a single "forward" move without doing any gravity switches! (and we skipped passing through the block $L_2(x)$ which is helpful as explained in the previous bullet)</li>
<li>If the next jump is another "quick switch", then you just went back to your original position while incurring two gravity switches. Clearly, doing two quick switches in a row is just a waste of effort.</li>
</ul>
<p>Thus, we have shown that forwards and slow switches are the only kinds of jumps we have to consider.</p>
<h1>A greedy approach</h1>
<p>The remaining kinds of jumps have the property that each jump takes Chef one column to the right. In fact, we can derive a few more properties of optimal paths:</p>
<ul>
<li>It's not optimal to slow switch if there is no obstacle to avoid. More specifically, if you did a slow switch, then a sequence of $k$ forwards, and another slow switch, but no dirty blocks were avoided in the process, then you can replace that whole sequence with $k+2$ forwards. This saves you two gravity switches :)</li>
<li>It's always better to start at the lane whose first dirty block appears later, or doesn't appear at all. This is because if you start at the other lane, you are forced to do a slow switch which you could have avoided by simply starting at the other lane.</li>
</ul>
<p>So we now have the following <strong>greedy</strong> solution to the problem:</p>
<ul>
<li>There is no path to the end if and only if some column has dirty blocks in both lanes.</li>
<li>If there is a lane containing no dirty blocks, then the minimum number of gravity switches is $0$.</li>
<li>Otherwise, start at the lane whose first dirty block appears later, and simulate the path by trying to do only "forward" steps. Only use "slow switch" if the next block is dirty. The number of slow switches performed is the answer.</li>
</ul>
<h1>A dynamic programming approach</h1>
<p>Another standard way to approach the problem is to use the ever-powerful <strong>dynamic programming</strong>. For some people this is simpler, because it involves less thinking about the shape of the optimal path.</p>
<p>Let's define the <strong>distance</strong> of a block to be the minimum number of gravity switches needed to reach that block (in case the block is unreachable, we say the distance is $\infty$). Then let $D_1(x)$ and $D_2(x)$ be the distances of blocks $L_1(x)$ and $L_2(x)$, respectively.</p>
<p>Notice that the final answer is simply $\min(D_1(N), D_2(N))$, because we want to reach either $L_1(N)$ or $L_2(N)$ as fast as possible. Now, let's focus on $D_1(x)$.</p>
<p>If $L_1(x)$ is dirty, then of course there's no way to reach that cell (you can't even step on it!), so we immediately know that $D_1(x) = \infty$. Otherwise, the last move must have been either a "forward" or a "slow switch".</p>
<ul>
<li>If it was a forward, then you arrived at $L_1(x-1)$ to get to $L_1(x)$. The minimum number of gravity switches required for this is $D_1(x-1)$.</li>
<li>If it was a slow switch, then you arrived at $L_2(x-1)$ to get to $L_1(x)$. The minimum number of gravity switches required for this is $D_2(x-1) + 1$ (the $+1$ is due to the last move which uses one gravity switch).</li>
</ul>
<p>Therefore, we have the following:
$$D_1(x) = \begin{cases}
\infty & \text{if $L_1(x)$ is dirty} \\\
\min(D_1(x-1), D_2(x-1)+1) & \text{otherwise}
\end{cases}$$</p>
<p>The case is very similar for $D_2(x)$, i.e.:
$$D_2(x) = \begin{cases}
\infty & \text{if $L_2(x)$ is dirty} \\\
\min(D_2(x-1), D_1(x-1)+1) & \text{otherwise}
\end{cases}$$</p>
<p>These formulas enable us to compute $D_1(x)$ and $D_2(x)$ from $D_1(x-1)$ and $D_2(x-1)$ recursively!</p>
<p>But what about the base cases? We can compute $D_1(1)$ and $D_2(1)$ easily by considering that Chef can start at either $L_1(1)$ or $L_2(1)$ without doing any move (as long as the block is not dirty of course!). Thus:</p>
<p>$$D_1(1) = \begin{cases}
\infty & \text{if $L_1(1)$ is dirty} \\\
0 & \text{otherwise}
\end{cases}$$
$$D_2(1) = \begin{cases}
\infty & \text{if $L_2(1)$ is dirty} \\\
0 & \text{otherwise}
\end{cases}$$</p>
<p>But we can make the base case simpler by <strong>adding a clean block at the beginning of both lanes</strong>! This doesn't worsen the solution, because you can just start at the same lane you would have started originally, and do a "forward" move, without incurring additional gravity switches. Thus, we define two new blocks $L_1(0)$ and $L_2(0)$ as clean blocks, and define $D_1(0) = D_2(0) = 0$ as the base cases!</p>
<p>We now have all the ingredients for our solution. First, we build two ($0$-indexed) arrays, $D_1$ and $D_2$, each of length $N+1$. Then, set $D_1[0] = D_2[0] = 0$. Next, compute $D_1[x]$ and $D_2[x]$ for $1 \le x \le N$ in increasing order, based on the recurrences above. Finally, the answer is now $\min(D_1[N], D_2[N])$! (If this value is infinite, then the answer is <code>No</code>)</p>
<h1>Implementation details / optimizations</h1>
<p><strong>Handling $\infty$</strong></p>
<p>Notice that the value $\infty$ is used in our arrays $D_1$ and $D_2$. But most builtin types (such as <code>int</code> or <code>long</code>) do not have infinite values. Here are possible ways to handle that:</p>
<ul>
<li>Use a dummy value such as "$-1$" in place of infinity. I don't recommend this though, because you'll have to check for the value "$-1$" all the time (for example, you need to modify the <code>min</code> function because infinity should be larger than all other elements).</li>
<li>Define "infinity" to be some large value. Some possibilities are $10^9$, $2^{30}$, $2^{60}$ or the data type's upper limit. This has the advantage that comparisons of infinity against normal values are correct. However, one should be careful in comparing "infinite" values against each other. Also, be careful in doing arithmetic with this "infinity", because you might incur overflow!</li>
<li>Use the "float" or "double" data types, which have a built-in infinity.</li>
</ul>
<p>I prefer the second solution.</p>
<p><strong>Memory-efficient DP</strong></p>
<p>The solution above requires us to create two arrays, $D_1$ and $D_2$. However, notice that when computing $D_1[x]$ and $D_2[x]$, we only need the values of $D_1[x-1]$ and $D_2[x-1]$. Therefore, we only need to store the current and previous values of $D_1$ and $D_2$, reducing the memory requirements from $O(N)$ to $O(1)$ (disregarding the memory where the input is stored)!</p>
<h1>Sample implementations</h1>
<p>Finally, here we provide some implementations of the solution. The DP solutions use a large number for "infinity", and also use $O(1)$ additional memory.</p>
<p>Python (Greedy approach):</p>
<pre><code>INF = 1<<30 # some really large number
for cas in xrange(input()):
L1 = raw_input()
L2 = raw_input()
ans = 0
curr = 0
for L1x, L2x in zip(L1, L2):
if L1x == L2x == '#':
ans = INF
break
if L1x == '#':
if curr == 1:
ans += 1
curr = 2
if L2x == '#':
if curr == 2:
ans += 1
curr = 1
if ans >= INF:
print "No"
else:
print "Yes"
print ans
</code></pre>
<p>C++ (Greedy approach):</p>
<pre><code>#include <stdio.h>
#include <string.h>
#include <algorithm>
using namespace std;
#define LIM 200011
#define INF LIM<<3
char words[2][LIM];
int main() {
int z;
scanf("%d", &z);
while (z--) {
scanf("%s%s", words[0], words[1]);
int n = strlen(words[0]);
int curr = -1, ans = 0;
for (int i = 0; i < n; i++) {
bool dirty0 = words[0][i] == '#';
bool dirty1 = words[1][i] == '#';
if (dirty0 && dirty1) {
ans = INF;
break;
}
if (dirty0) {
if (curr == 0) ans++;
curr = 1;
}
if (dirty1) {
if (curr == 1) ans++;
curr = 0;
}
}
if (ans >= INF) {
printf("No\n");
} else {
printf("Yes\n%d\n", ans);
}
}
}
</code></pre>
<p>Python (DP approach):</p>
<pre><code>INF = 1<<30 # some really large number
for cas in xrange(input()):
L1 = raw_input()
L2 = raw_input()
D1 = D2 = 0
for L1x, L2x in zip(L1, L2):
D1, D2 = (
INF if L1x == '#' else min(D1, D2 + 1),
INF if L2x == '#' else min(D2, D1 + 1),
)
ans = min(D1, D2)
if ans >= INF:
print "No"
else:
print "Yes"
print ans
</code></pre>
<p>C++ (DP approach):</p>
<pre><code>#include <stdio.h>
#include <string.h>
#include <algorithm>
using namespace std;
#define LIM 200011
#define INF (LIM<<3)
char L[2][LIM];
int D[2];
int nD[2];
int main() {
int z;
scanf("%d", &z);
while (z--) {
scanf("%s%s", L[0], L[1]);
int n = strlen(L[0]);
D[0] = D[1] = 0;
for (int i = 0; i < n; i++) {
nD[0] = L[0][i] == '#' ? INF : min(D[0], D[1] + 1);
nD[1] = L[1][i] == '#' ? INF : min(D[1], D[0] + 1);
D[0] = nD[0];
D[1] = nD[1];
}
int ans = min(D[0], D[1]);
if (ans >= INF) {
printf("No\n");
} else {
printf("Yes\n%d\n", ans);
}
}
}
</code></pre>
<h1>Time Complexity:</h1>
<p>$O(N)$ </p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p><a href="http://www.codechef.com/download/Solutions/AUG15/Setter/GRGUY.cpp">setter</a><br>
<a href="http://www.codechef.com/download/Solutions/AUG15/Tester/GRGUY.cpp">tester</a> </p>kevinsogoSun, 16 Aug 2015 13:05:27 +0530https://discuss.codechef.com/questions/73721/grguy-editorialsimpledynamic-programmingaug15greedyeditorialDISTNUM - Editorialhttps://discuss.codechef.com/questions/74081/distnum-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/AUG15/problems/DISTNUM">Contest</a><br>
<a href="http://www.codechef.com/problems/DISTNUM">Practice</a> </p>
<p><strong>Author:</strong> <a href="http://www.codechef.com/users/mgch">Misha Chorniy</a><br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/laycurse">Hiroto Sekido</a><br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/kevinsogo">Kevin Atienza</a></p>
<h1>DIFFICULTY:</h1>
<p>Super Hard</p>
<h1>PREREQUISITES:</h1>
<p>Segment trees, Range trees, Self-balancing binary trees, 2D Fenwick trees, sqrt decomposition</p>
<h1>PROBLEM:</h1>
<p>Given an array $A$ consisting of $N$ positive integers, you have to process $Q$ queries. There are five types of queries:</p>
<ul>
<li><code>1 l r</code>: Let $S$ be the sorted <em>set</em> of elements with indices between $l$ and $r$. You need to find:
$$\left(\sum_{1 \le i < j < k \le |S|} S_iS_jS_k\right) \pmod{10^9+7}$$
where $S_i$ is the $i$th smallest element of $S$.</li>
<li><code>2 x y</code>: Assign the value $y$ to element at position $x$.</li>
<li><code>3 x</code>: Delete the element at position $x$ from the array.</li>
<li><code>4 x y</code>: Insert $y$ after element at position $x$. If $x = 0$, then $y$ is inserted before the first element.</li>
<li><code>5 l r</code>: Count the number of distinct numbers in array between indices $l$ and $r$.</li>
</ul>
<h1>QUICK EXPLANATION:</h1>
<ul>
<li>Instead of actually deleting elements, replace deleted elements with dummy "empty" or "null" values.</li>
<li>Preprocess the queries so that we can preallocate the whole array, and newly inserted values are inserted in the correct positions already, anticipating future insertions and deletions.</li>
<li>Preprocess the array so that range "power sum" queries for the exponents $0$ to $3$ can be computed quickly. For a given $l$, $r$ and $e$, the power sum $S(l,r,e)$ is the sum of the $e$th powers of distinct elements from $l$ to $r$. This should also take into account the dummy "empty" values. This part is somewhat well-known and is described in more detail below. Each such query can be answered in $O(\sqrt{N})$ or $O(\log^2 N)$ time.</li>
<li>The answer to a type 1 query <code>1 l r</code> is the following:
$$\frac{S_1^3 - 3S_1S_2 + 2S_3}{6}$$
where $S_e = S(l,r,e)$. $1/6$ can be computed using modular arithmetic.</li>
<li>The answer to a type 5 query <code>5 l r</code> is simply $S_0$.</li>
</ul>
<h1>EXPLANATION:</h1>
<p>There are multiple parts to the solution to this problem. We first describe a solution for a simplified version of the problem, and then gradually support additional types of queries.</p>
<h1>No updates</h1>
<p>When answering "range query" problems like this, it's sometimes helpful to ignore the updates first, i.e. try to answer the problem assuming there are no update queries. In our case, let's assume that there are only type 1 and 5 queries. It's clear that type 5 queries are easier, so let's handle them first.</p>
<p>Given a subrange $[l,r]$ of an array $A$, how many distinct elements are there? The answer could be anywhere from $1$ to $r-l+1$, depending on the duplicated elements in the subrange. We wish to count each distinct value only once, i.e. any value that appears again in the subrange must not be counted again.</p>
<p>When is a particular value $A_i$ counted? Obviously, we must have $l \le i \le r$. But in order not to double-count, we must ensure that this value is a new one, i.e. it hasn't appeared before. In other words, let $p_i$ be the largest index such that $p_i < i$ and $A_{p_i} = A_i$, or $p_i = 0$ if such a $p_i$ doesn't exist. Then it must be the case that $p_i < l$. Extending this to every element in the subrange, we see that the number of distinct elements in $A[l\ldots r]$ is the number of indices $i$ such that:</p>
<ul>
<li>$l \le i \le r$</li>
<li>$0 \le p_i < l$</li>
</ul>
<p>If for every $1 \le i \le N$, we plot a point $(i,p_i)$ in the 2D plane, then this is simply <strong>the number of points in the rectangle $[l,r]\times [0,l-1]$</strong>. But this is the standard problem of <strong>2D range queries</strong>, which are solvable with <strong>sqrt-decomposition</strong> or <a href="https://en.wikipedia.org/wiki/Range_tree"><strong>range trees</strong></a>. (we'll discuss these techniques in more detail later) This is great, because we can now answer type 5 queries!</p>
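<p>To make the point formulation concrete, here is a small offline sketch for the static case (Python 3, unlike the Python 2 samples in this post): when there are no updates, the number of points in $[l,r]\times[0,l-1]$ can be counted with a plain 1D Fenwick tree by sorting the queries by $l$ and activating the points $(i,p_i)$ in order of increasing $p_i$. This is only meant to illustrate why counting such points gives the number of distinct elements; the dynamic version needs the heavier structures described below.</p>

```python
def distinct_counts(A, queries):
    """For each query (l, r), 1-based inclusive, count distinct values in
    A[l..r] by counting points (i, p_i) with l <= i <= r and p_i < l."""
    n = len(A)
    prev, last = [0] * (n + 1), {}
    for i in range(1, n + 1):
        prev[i] = last.get(A[i - 1], 0)   # p_i: previous index with the same value
        last[A[i - 1]] = i

    tree = [0] * (n + 1)                  # Fenwick tree over indices i

    def add(i):
        while i <= n:
            tree[i] += 1
            i += i & -i

    def prefix(i):
        s = 0
        while i > 0:
            s += tree[i]
            i -= i & -i
        return s

    by_p = sorted(range(1, n + 1), key=lambda i: prev[i])  # points by p_i
    ans, j = [0] * len(queries), 0
    for qi in sorted(range(len(queries)), key=lambda q: queries[q][0]):
        l, r = queries[qi]
        while j < n and prev[by_p[j]] < l:  # activate points with p_i < l
            add(by_p[j])
            j += 1
        ans[qi] = prefix(r) - prefix(l - 1)
    return ans
```

<p>Note that the activation condition $p_i < l$ is monotone in $l$, which is why a single left-to-right sweep over queries sorted by $l$ suffices.</p>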
<p>Now, what about type 1 queries? At first glance, the formula seems convoluted, and there seems no way to simplify the sum. However, upon closer inspection, we can actually understand that this is simply <strong>the sum of all products of triplets of distinct elements in the range</strong>. Such a sum can be simplified into the following:
$$\frac{S_1^3 - 3S_1S_2 + 2S_3}{6}$$
where $S_i$ is the sum of the $i$th powers of all distinct elements in the range.</p>
<p>To understand this, note that $S_1^3$ is the sum of products of all triples, but it includes every triple six times (once for each permutation), and includes some bad triples with duplicate values. The remaining terms are simply adjustments to accommodate for these bad triples. For example, the sum $S_1S_2$ is the sum of products of all triples containing at least one duplicated value, and $S_3$ is the sum of products of all triples containing a triplicate value.</p>
<p>Therefore, to answer a type 1 query, we must be able to answer range queries of the type: "what is the sum of the $e$th powers of all distinct elements in the range?" where $e$ can be anything from $1$ to $3$. But we can modify the above approach to answer this! (in fact, the previous approach solves this in the special case $e = 0$) For a fixed $e$, we assign the value $A_i^e$ in each point $(i,p_i)$, and now the query becomes: <strong>compute the sum of values in the rectangle $[l,r]\times [0,l-1]$</strong>. The techniques mentioned above will still work with simple modifications :)</p>
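<p>As a sanity check on the algebra, the closed form can be verified against a brute-force enumeration over a small set (a Python 3 sketch; the division by $6$ becomes multiplication by the modular inverse of $6$, obtained via Fermat's little theorem since $10^9+7$ is prime):</p>

```python
from itertools import combinations

MOD = 10**9 + 7

def triple_sum_formula(S):
    """(S1^3 - 3*S1*S2 + 2*S3) / 6 over the distinct values in S, mod MOD."""
    s1 = sum(S) % MOD
    s2 = sum(v * v for v in S) % MOD
    s3 = sum(v ** 3 for v in S) % MOD
    inv6 = pow(6, MOD - 2, MOD)      # modular inverse of 6 (MOD is prime)
    return (s1 ** 3 - 3 * s1 * s2 + 2 * s3) * inv6 % MOD

def triple_sum_brute(S):
    """Direct sum of S_i * S_j * S_k over all i < j < k."""
    return sum(a * b * c for a, b, c in combinations(S, 3)) % MOD

# e.g. for S = {1, 4, 5, 9}: 1*4*5 + 1*4*9 + 1*5*9 + 4*5*9 = 281
```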
<h1>Point updates</h1>
<p>Now let's tackle point updates, i.e. queries of type 2. Suppose we want to set $A_i$ to the value $v$, and that its previous value is $v_{\text{old}}$. To do so, we need to look deeper into the structure of the points $(i,p_i)$ in the 2D plane.</p>
<p>Fix a distinct value $v$, and let $i_1 < i_2 < \ldots < i_k$ be the distinct indices $i$ such that $A_i = v$ (let's call it <strong>$v$'s index sequence</strong>). Then we have the following points: $(i_1,0), (i_2,i_1), (i_3,i_2), \ldots, (i_k,i_{k-1})$.</p>
<p>When setting the value of $A_i$ from $v_{\text{old}}$ to $v$, we need to change some points. First, we need to find the index of $i$ in the index sequence of $v_{\text{old}}$, say $i_r$. Then when the value of $A_{i_r}$ is replaced, the points $(i_r,i_{r-1})$ and $(i_{r+1},i_r)$ will have to be removed. Furthermore, we also need to add the point $(i_{r+1},i_{r-1})$.</p>
<p>Now, when setting the value of $A_i$ to $v$, we need to find out where the index $i$ will be placed in the index sequence of $v$. Let's say $i_{r-1} < i < i_r$ for some $r$. Then we need to remove the point $(i_r, i_{r-1})$ and add the points $(i, i_{r-1})$ and $(i_r, i)$.</p>
<p>This means that there are only at most $6$ points that need to be updated: $3$ removed, $3$ added (maybe even less when the index $i$ is the first and last in the corresponding index sequence). Our sqrt-decomposition or 2D range tree approaches can be updated to accommodate these kinds of updates (which we'll also discuss later below).</p>
<p>There's also the question of how to maintain the index sequences of all distinct values. But this can be done by maintaining a <strong>binary search tree</strong> for every distinct value.</p>
<h1>Deletions</h1>
<p>Now, let's try handling deletions. Suppose we want to remove $A_i$ from the sequence. If we simply remove this value from the sequence, then the indices of all later elements of the sequence will need to be adjusted (for example $A_{i+1}$ becomes $A_i$, etc.). But this means <strong>updating up to $O(N)$ points</strong>, which would be too slow to pass the time limit. Thus, we need to find another way.</p>
<p>The key idea is to simply not remove $A_i$ from the sequence, and instead replace it with a dummy "null" value. We consider this "null" value to be just like any other value, but the difference is that it does not have its own index sequence (and thus no points). Using this, we don't have to update the indices of the later elements of the sequence!</p>
<p>Unfortunately, this means that the indices of range queries $[l,r]$ will need to be updated (because some of the indices are now null values). Specifically, each index $i$ needs to be replaced with a new index, which we denote by $f(i)$, which is the index of the $i$th non-null value of the sequence. Fortunately, such an index can be computed by maintaining a <strong>segment tree</strong> structure for the null values. Specifically, we construct a tree with $N$ leaves, where the $i$th leaf contains a $1$ if $A_i$ is not null, and $0$ otherwise. Each internal node contains the sum of the values in the leaves in the subtree rooted at the node. Replacing the $i$th index of the sequence with a null (or non-null) value corresponds to replacing the value of the $i$th leaf in this segment tree, and updating the corresponding internal nodes. To compute $f(i)$ given $i$, one only needs a single traversal down the tree to the appropriate leaf (utilizing the partial sums in the internal nodes).</p>
<p>Both operations run in $O(\log N)$ time, which means this structure is not the bottleneck of our solution.</p>
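<p>A Fenwick tree with binary lifting is one compact way to realize this structure (the editorial describes it as a segment tree; this variant gives the same $O(\log N)$ bounds). The Python 3 sketch below, with illustrative names, marks positions null/non-null and answers $f(i)$, the position of the $i$th non-null element:</p>

```python
class AliveIndex:
    """Fenwick tree over positions 1..n: leaf i holds 1 if position i is
    non-null. kth(k) returns the position of the k-th non-null element."""
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)

    def add(self, i, delta):           # delta = +1 (non-null) or -1 (mark null)
        while i <= self.n:
            self.t[i] += delta
            i += i & -i

    def kth(self, k):                  # binary lifting: 1-based rank -> position
        pos = 0
        step = 1 << self.n.bit_length()
        while step:
            nxt = pos + step
            if nxt <= self.n and self.t[nxt] < k:
                pos = nxt
                k -= self.t[nxt]
            step >>= 1
        return pos + 1
```

<p>For example, with positions $1\ldots 6$ alive and position $3$ marked null, the third non-null position is $4$.</p>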
<h1>Insertions</h1>
<p>Insertions have the same problems as deletions when done naïvely: if we insert a value at a certain position, we need to update the indices of the later elements, and potentially update up to $O(N)$ points. The key is to implement the insertions so that:</p>
<ul>
<li>the whole array is allocated beforehand</li>
<li>each inserted value is inserted in the correct position, anticipating future insertions and deletions.</li>
</ul>
<p>For simplicity, instead of starting with an array of size $O(N)$, we can simply start with an empty array, and perform $O(N)$ insertions before the $Q$ queries (so now we have $N+Q$ queries, where the first $N$ are insertions). We preallocate an array of size $N+Q$, initially all null, and the key is to determine for each insertion the correct index where it should be placed so that future insertions can be accommodated without requiring a reallocation of the array. These correct indices can be precomputed by first simulating all insertions and deletions with a <strong>self-balancing</strong> segment tree (for example, an AVL tree modified to become a segment tree), with the following guidelines:</p>
<ul>
<li>All deletions are performed by marking the corresponding leaf that it has been "deleted". (this is needed so we can compute the function $f(i)$ quickly)</li>
<li>During each insertion, we store the <strong>index of the insertion</strong> in the newly added leaf. (i.e. its index in the sequence of operations)</li>
<li>After finishing all operations, we perform an inorder traversal of the tree (though we only care about the order of the leaves). Let's say the $i$th leaf was created during the $j$th insertion. Then when the $j$th insertion is actually performed, we simply replace the $i$th value with the correct non-null value.</li>
</ul>
<p>This precomputation runs in $O((N+Q)\log(N+Q))$. After that, we simply allocate an array of size $N+Q$, initially all null, and no more reallocation is required during the processing of the actual queries!</p>
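<p>The idea can be illustrated with a deliberately naive list simulation (Python 3, quadratic time, exposition only; the operation encoding below is an assumption of this sketch, not the problem's input format). Because deletions only mark cells dead and never shift anything, each cell's final physical index is exactly the slot its insertion should write to in the preallocated array:</p>

```python
def precompute_slots(ops):
    """ops: list of ('ins', x) or ('del', x). For 'ins', x is how many live
    cells the new cell goes after (0 = front); for 'del', x is the 1-based
    rank of the live cell to kill. Returns {insertion op index: final slot}."""
    cells = []                                     # list of (op_id, alive)
    for op_id, (kind, x) in enumerate(ops):
        phys = live = 0
        if kind == 'ins':
            while live < x:                        # skip past the x-th live cell
                if cells[phys][1]:
                    live += 1
                phys += 1
            cells.insert(phys, (op_id, True))
        else:
            while True:                            # find the x-th live cell
                if cells[phys][1]:
                    live += 1
                    if live == x:
                        break
                phys += 1
            cells[phys] = (cells[phys][0], False)  # mark dead, never shift
    return {op_id: slot for slot, (op_id, _) in enumerate(cells)}
```

<p>The balanced segment tree replaces the linear scans with $O(\log(N+Q))$ descents, but the bookkeeping (and the final in-order readout of slots) is the same.</p>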
<h1>Rectangle queries</h1>
<p>Finally, all that remains is to actually perform the rectangle queries efficiently. We'll clarify our task here: starting with a set of points (initially empty), we need to perform the following operations:</p>
<ul>
<li>Insert a new point $(x,y)$ to the set, assigning a value $v$ to it.</li>
<li>Delete an existing point $(x,y)$ from the set.</li>
<li>Given a rectangle $[a,b]\times [c,d]$ and an exponent $0\le e\le 3$, find the sum of the $e$th powers of values of all points in the rectangle.</li>
</ul>
<p>There are multiple (standard) approaches for this problem.</p>
<p><strong>2D range tree</strong></p>
<p>Note that this can be done with an $O(N' \log N')$ preprocessing and $O(\log^2 N')$ query time (where $N' := O(N + Q)$ is the number of operations) by using a <a href="https://en.wikipedia.org/wiki/Range_tree">2D range tree</a> fitted for updates: Since a 2D range tree is just a tree of segment trees, we use segment trees that support point insertions/deletions and range sums. In each leaf of a segment tree, we store four values, $(1,v,v^2,v^3)$, and in each internal node, we store four values: the sums of the values of each node for every exponent.</p>
<p><strong>sqrt decomposition</strong></p>
<p>You can also use the almighty <strong>sqrt decomposition</strong> by noticing that each inserted point $(x,y)$ satisfies $0 \le x, y \le N$. Therefore, we can instead think of the space $[0,N]\times [0,N]$ as a 2D matrix. Now, we try to decompose this matrix into blocks of dimensions $K\times K$, so that there are $N/K \times N/K$ blocks (for some parameter $K$ which we'll determine later). We also store, for each exponent $0\le e\le 3$, an $(N/K+1)\times (N/K+1)$ table of <a href="https://en.wikipedia.org/wiki/Summed_area_table">2D partial sums</a>: $S_e(i,j)$ is the sum of all values from the first $i$ rows and first $j$ columns of blocks. Such a table can be computed in $O(N + (N/K)^2)$ time.</p>
<p>To implement the operations:</p>
<ul>
<li>
<p>To implement a rectangle query $(a,b,c,d,e)$, first we decompose our rectangle $[a,b]\times [c,d]$ into four "quadrants":
$$[a,b]\times[c,d] = [0,b]\times[0,d] - [0,a-1]\times[0,d] - [0,b]\times[0,c-1] + [0,a-1]\times[0,c-1]$$
Thus, we replace every rectangle query with four "quadrant queries", so we can assume from here on that $a = c = 0$.</p>
<p>Next, we let $b'$ and $d'$ be the largest multiples of $K$ that are $\le b$ and $\le d$, respectively. Then the answer would be $S_e(b'/K,d'/K)$ plus the values of the points in the thin strips $[b',b]\times [0,d]$ and $[0,b']\times [d',d]$. But the latter points can be enumerated efficiently by taking note of the fact that <strong>there is at most one point in every row and column at any moment in time</strong> (why?). Since each strip spans fewer than $K$ rows or columns, only $O(K)$ points need to be enumerated that way.</p>
<p>The running time of each query is $O(K)$.</p>
</li>
<li>
<p>To implement an insertion or deletion, we do not update anything; instead we simply lazily store the insertion and deletion operations we did so far in a list $L$ (so that an insertion/deletion runs in $O(1)$ time). When doing this, we need to update our "query" operation to look into this array and update the sum in $O(|L|)$ time, so the new running time of a query is $O(K + |L|)$.</p>
</li>
</ul>
<p>Unfortunately, $|L|$ can grow up to $N+Q$, so a query can become very slow. The trick is to <strong>rebuild the table $S_e$ every $T$ operations</strong> (for some parameter $T$). The table can be rebuilt in $O(N + (N/K)^2)$ time, and since the rebuild empties $L$, we always have $|L| \le T$, so a query now runs in $O(K + T)$ time. </p>
<p>Now all that remains is to choose the parameters $K$ and $T$. Note that there are $N' = N+Q$ queries, and after every $T$ insertions, we perform an $O(N+(N/K)^2)$ time "rebuild" operation. Therefore, the running time after $N'$ operations is:
$$O\left(\frac{N'}{T}\left(N+\left(\frac{N}{K}\right)^2\right) + N'\left(K+T\right)\right) = O\left(N'\left(\frac{1}{T}\left(N+\left(\frac{N}{K}\right)^2\right) + K + T\right)\right)$$
It can be shown that the optimal choice for the values $K$ and $T$ is $K = T = O(\sqrt{N})$ (hence "sqrt decomposition") (why?), so that the running time is $O(N'\sqrt{N}) = O((N+Q)\sqrt{N})$.</p>
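<p>A brief sketch of the balancing argument (assuming $Q = O(N)$ for simplicity, so that $N' = O(N)$): for a fixed $K$, the bracketed expression is minimized over $T$ at $T = \sqrt{N + (N/K)^2}$, giving
$$O\left(N'\left(\sqrt{N + \left(\frac{N}{K}\right)^2} + K\right)\right).$$
Choosing $K = \sqrt{N}$ makes $(N/K)^2 = N$, so both terms are $O(\sqrt{N})$; a smaller $K$ inflates the rebuild cost $(N/K)^2$, while a larger $K$ inflates the strip scan performed in every query.</p>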
<p><strong>2D Fenwick tree</strong></p>
<p>Finally, you can also use a <a href="https://www.topcoder.com/community/data-science/data-science-tutorials/binary-indexed-trees/#2d">2D Fenwick tree</a> to perform the queries in $O(\log^2 N)$ time. The problem is that the memory requirement becomes $O(N^2)$. There are ways to reduce the memory requirement, for example using a hash map for every 1D Fenwick tree (but this has a rather large constant behind the $O$ notation).</p>
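<p>A hash-map-backed 2D Fenwick tree might look like the following Python 3 sketch (one value per point for brevity; the power-sum queries of this problem would need four such trees, or a 4-tuple per node). Each row's inner Fenwick tree is a dict, so memory grows with the number of updates times $\log^2 N$ rather than with $N^2$:</p>

```python
class Fenwick2D:
    """Point update / rectangle sum over [1..n] x [1..n], with each row's
    inner Fenwick tree stored sparsely as a dict."""
    def __init__(self, n):
        self.n = n
        self.rows = [dict() for _ in range(n + 1)]

    def add(self, x, y, v):            # add v at point (x, y)
        i = x
        while i <= self.n:
            row, j = self.rows[i], y
            while j <= self.n:
                row[j] = row.get(j, 0) + v
                j += j & -j
            i += i & -i

    def prefix(self, x, y):            # sum over [1..x] x [1..y]
        s, i = 0, x
        while i > 0:
            row, j = self.rows[i], y
            while j > 0:
                s += row.get(j, 0)
                j -= j & -j
            i -= i & -i
        return s

    def rect(self, a, b, c, d):        # sum over [a..b] x [c..d]
        return (self.prefix(b, d) - self.prefix(a - 1, d)
                - self.prefix(b, c - 1) + self.prefix(a - 1, c - 1))
```

<p>Deleting a point is just <code>add(x, y, -v)</code>, which is all the dynamic point updates of this problem require.</p>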
<p>We leave it to the reader to discover the details on how to optimize this. If you need more hints, feel free to ask. Also, the author's solution uses this approach so you can check that solution too.</p>
<h1>Notes</h1>
<p>If you have some questions, or would like some parts of this editorial in more detail, please feel free to ask in the comments below, and I'll try to accommodate your request :)</p>
<h1>Time Complexity:</h1>
<p>$O((N + Q) \sqrt{N})$ or $O((N + Q) \log^2 N)$ </p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p><a href="http://www.codechef.com/download/Solutions/AUG15/Setter/DISTNUM.cpp">setter</a><br>
<a href="http://www.codechef.com/download/Solutions/AUG15/Tester/DISTNUM.cpp">tester</a> </p>kevinsogoSun, 23 Aug 2015 23:49:20 +0530https://discuss.codechef.com/questions/74081/distnum-editorial2daug15sqrt-decompsegment-treesuper-hardeditorialrange-treefenwick-tree