Questions Tagged With fastmodexphttps://discuss.codechef.com/tags/fastmodexp/?type=rssquestions tagged <span class="tag">fastmodexp</span>enSat, 15 Dec 2018 17:57:08 +0530XORTREEH - Editorialhttps://discuss.codechef.com/questions/114200/xortreeh-editorial<h1>PROBLEM LINK</h1>
<p><a href="https://www.codechef.com/problems/XORTREEH">Practice</a><br>
<a href="https://www.codechef.com/OCT17/problems/XORTREEH">Contest</a></p>
<p><strong>Author:</strong> <a href="https://www.codechef.com/users/adkroxx">Adarsh Kumar</a><br>
<strong>Tester:</strong> <a href="https://www.codechef.com/users/alex_2008">Alexey Zayakin</a><br>
<strong>Editorialist:</strong> <a href="https://www.codechef.com/users/xellos0">Jakub Safin</a></p>
<h1>DIFFICULTY</h1>
<p>HARD</p>
<h1>PREREQUISITIES</h1>
<p>Fourier transform, number theoretic transform (NTT), fast (modular) exponentiation and modular inverse</p>
<h1>PROBLEM</h1>
<p>You're given an array $A$ of $N$ non-negative integers and integers $K,X$. </p>
<p>Define the XOR of two numbers $a\oplus b = c$ in base $K$ this way: the $d$-th digit of $c$ in base $K$ is $c_d=a_d+b_d$ modulo $K$.</p>
<p>Compute the probabilities $p_i$ that the base $K$ XOR of <a href="https://en.wikipedia.org/wiki/Mex_(mathematics)">mex</a> values of $X$ randomly selected subsequences of $A$ is equal to $i$, for all possible $i \ge 0$, modulo 330301441.</p>
<h1>QUICK EXPLANATION</h1>
<p>The mex won't be too large, XORs won't be too large either. Compute the probabilities for getting all possible mex-s of one subsequence. The probabilities for XOR of $X$ subsequences can be computed using slow multidimensional number theoretic transform, where each dimension is a digit and array size is $K$ instead of a power of $2$, combined with fast exponentiation.</p>
<h1>EXPLANATION</h1>
<p>The problem statement mentions the answer should be a complicated sum $\sum (i^2 p_i^3)^i$ over all $p_i > 0$ (obviously, the terms with $p_i=0$ don't affect the sum since that only happens with $i > 0$). However, that's not important. The sum is just a hash value that's there to avoid having to print large numbers and we can compute it after finding all $p_i$. Let's just mention that the $i$-th powers can be computed using <a href="https://www.khanacademy.org/computing/computer-science/cryptography/modarithmetic/a/fast-modular-exponentiation">fast exponentiation</a>.</p>
<p>Since we need to compute the result modulo $MOD$, we need to work with fractions (all $p_i$ will be rational numbers) as their equivalents modulo -- dividing by $Q$ corresponds to multiplying by its modular inverse. Since the given modulo is a prime, the inverse $Q^{-1}=Q^{MOD-2}$ according to Fermat's little theorem, which can be computed using the above mentioned fast exponentiation.</p>
<p>Which values of $i$ give non-zero $p_i$? Obviously, the mex of an array of size $N$ can't be more than $N$, since the opposite would require all integers between $0$ and $N$ to be present in $A$. We can interpret numbers $\le N$ in base $K$ as numbers with at most $D$ digits, or exactly $D$ digits including leading zeroes, where $D=\left\lceil \log_K N \right\rceil$. Xor-ing $D$-digit numbers gives a $D$-digit number again, so the xor of $X$ numbers is $< K^D$. Therefore, it's sufficient to compute $p_i$ only for $i < K^D$, which makes $O(KN)$ numbers. That's not too much.</p>
<h2>From mex to probabilities</h2>
<p>Let's find the probabilities $P_1(i)$ that the mex of a random subsequence of $A$ will be equal to $i$, e.g. the probabilities $p_i$ if $X=1$. As mentioned above, we can limit ourselves to $i \le N$.</p>
<p>We can compute just the number of subsequences $S_1(i)$ that give mex equal to $i$ and then normalise those values -- divide them by $\sum S_1(i)$, or rather multiply by its multiplicative inverse -- to get $P_1(i)$.</p>
<p>If the mex of some subsequence is $i$, then all elements $A_j=i$ can't be in the subsequence. Any of the elements $A_j > i$ can be in there, but it doesn't matter; if there are $g$ such elements, that gives $2^g$ possibilities. Finally, for any $0 \le k < i$, there must be at least one element $A_j=k$ present in the subsequence. If there are $s_k$ such elements for a given $k$, then there are $2^{s_k}-1$ ways to choose them (any non-empty subset). We can express</p>
<p>$$S_1(i) = 2^g \prod_{k=0}^i \left(2^{s_k}-1\right) = \prod_{k=i+1}^{A_{max}} 2^{s_k} \prod_{k=0}^i \left(2^{s_k}-1\right)\,,$$</p>
<p>where $A_{max}$ is the maximum element in $A$, since $g$ is just the sum of $s_k$ for $k > i$.</p>
<p>Using fast exponentiation, we can precompute all $s_k$, $2^{s_k}$, their suffix products and prefix products of $2^{s_k}-1$ (similarly to prefix sums) and compute $S_1(i)$ and $P_1(i)$ for all $i \le N$ using the given formula in $O(A_{max}+N\log N)$; it doesn't even need to depend on $A_{max}$ if we notice that since the mex can't be greater than $N$, setting $A_i := min(N+1,A_i)$ doesn't affect the result.</p>
<p>This approach is fast with only $O(K^D)=O(KN)$ time complexity.</p>
<h2>Walsh-Hadamard transform</h2>
<p>Look at the straightforward way to compute probabilities $P_2(i)$ for $X=K=2$ from $P_1(i)$:</p>
<p>$$P_2(i) = \sum_{j=0}^N P_1(j) P_1(i\oplus j)\,.$$</p>
<p>It's very similar to convolution of two arrays, the only difference is that we're using $\oplus$ instead of $+$. The convolution of 2 arrays can be computed using fast Fourier transform by computing the FFT of both arrays extended to size $2^k \ge 2N$, multiplying their corresponding elements and computing the inverse FFT of the resulting array; for this xor-convolution, it's very similar, but we're using something called fast Walsh-Hadamard transform instead. You can read about it <a href="https://csacademy.com/blog/fast-fourier-transform-and-variations-of-it">here</a>.</p>
<p>For general $X$, the fast way to compute all $P_X(i)$ is to compute the Walsh-Hadamard transform $WH\lbrack P_1\rbrack(\nu)$, take $B(\nu)=WH\lbrack P_1\rbrack^X(\nu)$ (using fast exponentiation) and compute $P_X(i)$ as the inverse Walsh-Hadamard transform $P_X=WH^{-1}\lbrack B\rbrack$. However, this only works for $K=2$, where the conventional xor is defined.</p>
<p>The following is actually a generalised version of WHT for arbitrary $K \ge 2$.</p>
<h2>A better approach: number theoretic transform</h2>
<p>This approach uses the specific value of $MOD$. If we compute small factors of $MOD-1$, we can see that all numbers from $2$ to $10$ -- all possible $K$ -- divide it! That means we can use the number theoretic transform, which allows us to treat the base-$K$ xor as what it actually is: summation modulo $K$.</p>
<p>On the other hand, we're going to need the multidimensional version. We can look at an index $i$ as a vector of $D$ digits $(i_1,\dots,i_D)$; the xor of 2 vectors is actually just their vector sum and then taking the remainder mod $K$ in each digit. The formula for $P_2(i)$ then becomes $\sum_j P_1(j_1,\dots,j_D) P_1((i_1-j_1)\%K,\dots,(i_D-j_D)\%K)$, which is just multidimensional convolution with indices modulo $K$.</p>
<p>So how do we do convolution with indices modulo $K$? We don't need to do anything -- turns out convolution using DFT or NTT is already done with indices modulo array size! That's why we need the trick with padding the array with zeroes at the end to at least twice the size (it's a power of 2 just so that it'd run fast): when we're doing $C_{i+j} += A_i B_j$, we're only adding a non-zero number if $i+j < 2N$, so taking it modulo $2N$ does nothing.</p>
<p>The reason why this happens is apparent if we look at how DFT or NTT works. For an array of size $N$, we choose a number $w$ such that $w^k \neq 1$ for $0 < k < N$ and $w^N=1$ and compute $F\lbrack A\rbrack(j) = \sum w^{jk} A_k$ for each $0 \le j < N$. The inverse transformation uses $1/w$ instead of $w$. DFT uses $w=e^{2\pi i / N}$; for NTT, it's a so-called primitive root -- a number for which the required conditions hold modulo $MOD$. There's no easy way to pick a primitive root, but since we have a fixed modulo and array sizes $N$ ($N=K$ here) are small, we can compute them e.g. by bruteforcing locally for all possible values of $K$ and hardwire them into the code.</p>
<p>Anyway, since $w^N=1$, there's no difference in what we're computing if we take some index modulo $N$. We can run DFT or NTT to get $F\lbrack P_1\rbrack$ without increasing its size to a power of $2$, then take $F\lbrack P_X\rbrack(j)=F\lbrack P_1\rbrack(j)^X$ for each $j$ and finally compute $P_X$ using an inverse transform and it gives us the probabilities we need.</p>
<p>There's a small drawback here: we're transforming arrays of size $K$, so there's no way to use the "butterfly scheme" of classic FFT. However, we don't need that. In order to stay in integers and get the required precision, we're going to use multidimensional NTT modulo $MOD$ by simply computing the required sums directly. Small array sizes help a lot, since this bruteforce approach runs in $O(K)$ per element per dimension.</p>
<p>We have $O(K^D)=O(KN)$ elements (the former bound is tighter and we need to work with fixed $D$ anyway, so let's use that) in $P_1$ and $P_X$, so each NTT (direct and reverse) runs in $O(K^{D+1}D)$. Between them, there are $O(K^D)$ fast exponentiations in $O(\log MOD)$. The total time complexity is therefore $O(K^D(KD+\log{MOD}))$; memory complexity $O(K^D)$.</p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS</h1>
<p><a href="http://www.codechef.com/download/Solutions/OCT17/Setter/XORTREEH.cpp">Setter's solution</a><br>
<a href="http://www.codechef.com/download/Solutions/OCT17/Tester/XORTREEH.cpp">Tester's solution</a><br>
<a href="https://ideone.com/pksASp">Editorialist's solution</a></p>xellos0Mon, 09 Oct 2017 15:54:48 +0530https://discuss.codechef.com/questions/114200/xortreeh-editorialhardfftnttfastmodexpoct17xortreeheditorialfast-expoGOLDMINE - Editorialhttps://discuss.codechef.com/questions/142120/goldmine-editorial<h1>PROBLEM LINK:</h1>
<p><a href="https://www.codechef.com/BULE2018/problems/GOLDMINE">Contest</a></p>
<p><strong>Author:</strong> <a href="http://www.codechef.com/users/aman_58472">Aman Kumar Gupta</a></p>
<p><strong>Editorialist:</strong> <a href="http://www.codechef.com/users/aman_58472">Aman Kumar Gupta</a></p>
<h1>DIFFICULTY:</h1>
<p>EASY-MEDIUM.</p>
<h1>PREREQUISITES:</h1>
<p>DP, Math, Matrix, Fast exponentiation.</p>
<h1>PROBLEM:</h1>
<p>The number of ways you can mine a row of size <b>1 * n</b> using given types of bombs modulo <b>10^9+7</b>.</p>
<h1>EXPLANATION:</h1>
<p>The first method which can be used is DP of course. It seems to be a classical DP problem.</p>
<p>So when the size of the row is 1 * 1, dp[1] = 0. When the size of the row is 1 * 2, dp[2] = 1 and similarly dp[3] = 1, where dp[i] refers to the ways in which the row of size <b>i</b> can be mined.</p>
<p>And from here upon, build the dp array as: dp[i] = (dp[i - 2] + dp[i - 3]) % 1000000007;</p>
<p>Finally dp[n] would be our answer.</p>
<p>But this will fail here because of the large constraint.</p>
<p>Now, suppose we have an equation like this: <b>[dp[0], dp[1], dp[2]] * M = [dp[1], dp[2], dp[3]]</b>, where <b>M</b> is a matrix of size 3 * 3, then we can say <b>[dp[0], dp[1], dp[2]] * M^2 = [dp[2], dp[3], dp[4]]</b>.</p>
<p>On generalizing, <b>[dp[0], dp[1], dp[2]] * M^(N - 2) = [dp[n - 2], dp[n - 1], dp[n]]</b> for <b>N > 2</b></p>
<p>To find the matrix, consider the matrix cells unknown and multiply it with the dp array on the left and equate it with the RHS. Leaving the task upon you, the matrix will be:<br>
0 0 1<br>
1 0 1<br>
0 1 0<br></p>
<p>Now <b>M^(n - 2)</b> can be found in <b>27 * log n</b> and after all the multiplications of the above equation, our answer will simply be <b>FinalArray[2] (the last element of our resulting array on the RHS).</b><br><br>
<b>Note:</b> You need to take care of taking modulo on every operation to avoid the overflow. See the solution for more references.</p>
<h1>AUTHOR'S SOLUTION:</h1>
<p>Author's solution can be found <a href="https://ideone.com/2HOmP3">here</a>.</p>aman_58472Sat, 15 Dec 2018 17:57:08 +0530https://discuss.codechef.com/questions/142120/goldmine-editorialdynamic-programmingbule18fastmodexpeditorialeasy-mediumBIPIN3 - editorialhttps://discuss.codechef.com/questions/80653/bipin3-editorial<h1>PROBLEM LINK:</h1>
<p><a href="http://www.codechef.com/problems/BIPIN3">Practice</a><br>
<a href="http://www.codechef.com/APRIL16/problems/BIPIN3">Contest</a></p>
<p><strong>Author:</strong> <a href="http://www.codechef.com/users/bipin2">Bipin Baburaj</a> <br>
<strong>Tester:</strong> <a href="http://www.codechef.com/users/xcwgf666">Sergey Kulik</a> <br>
<strong>Editorialist:</strong> <a href="http://www.codechef.com/users/mugurelionut">Mugurel Ionut Andreica</a></p>
<h1>DIFFICULTY:</h1>
<p>SIMPLE</p>
<h1>PREREQUISITES:</h1>
<p><a href="https://en.wikipedia.org/wiki/Modular_exponentiation">Fast modular exponentiation</a></p>
<h1>PROBLEM:</h1>
<p>A rooted tree with <strong>N</strong> nodes is considered. We need to assign a color from <strong>1</strong> to <strong>K</strong> to each node of the tree in such a way that every pair of nodes <strong>(A,B)</strong>, where <strong>A</strong> is the parent of <strong>B</strong>, has different colors.</p>
<h1>QUICK EXPLANATION:</h1>
<p>The answer is $K*(K-1)^{N-1}$. Note that the actual tree is not given. This is because there is the same answer for every tree with <strong>N</strong> nodes (and for the given <strong>K</strong>).</p>
<h1>EXPLANATION:</h1>
<p>We can choose any of the <strong>K</strong> colors for the root of the tree. Then we can choose any color except the root's color for any of the root's children. Then, for every child of a child of the root we can choose any color except that of its parent, and so on. Thus, for each of the other <strong>N-1</strong> nodes we have <strong>K-1</strong> choices for its color. The overall answer is $K*(K-1)^{N-1}$.</p>
<p>For subtasks 1 and 2 one can compute the answer in <strong>O(N)</strong> time. For subtask 3 one needs to compute $(K-1)^{N-1}$ (modulo $10^9+7$) using fast modular exponentiation, in <strong>O(log(N))</strong> time.</p>
<h1>AUTHOR'S AND TESTER'S SOLUTIONS:</h1>
<p>Author's solution can be found <a href="http://www.codechef.com/download/Solutions/APRIL16/Setter/BIPIN3.cpp">here</a>.<br>
Tester's solution can be found <a href="http://www.codechef.com/download/Solutions/APRIL16/Tester/BIPIN3.cpp">here</a>. </p>mugurelionutSat, 02 Apr 2016 02:48:29 +0530https://discuss.codechef.com/questions/80653/bipin3-editorialsimplebipin3april16editorialfastmodexpFast exponentiaion and multiplicationhttps://discuss.codechef.com/questions/134352/fast-exponentiaion-and-multiplication<p>Hi,</p>
<p>I am unable to understand the iterative version of fast modular exponentiation code.</p>
<pre><code>long long prod(long long a, long long b, long long mod = MOD)
{
long long res = 0;
while(b)
{
if(b & 1)
res = (res + a) % mod;
a = (a + a) % mod;
b >>= 1;
}
return res;
}
long long bpow(long long a, long long b, long long mod = MOD)
{
long long res = 1;
while(b)
{
if(b & 1)
res = prod(res, a, mod);
a = prod(a, a, mod);
b >>= 1;
}
return res;
}
</code></pre>
<p>The prod function does overflow safe multiplication, again similar to fast exponentiation code, but hard to get.</p>rainaswayamSat, 25 Aug 2018 14:16:57 +0530https://discuss.codechef.com/questions/134352/fast-exponentiaion-and-multiplicationfast-expoexponentiationfastmodexpfastmul