PROBLEM LINK:Setter: Bogdan Ciobanu DIFFICULTY:Medium PREREQUISITES:Xor operation, Precomputation, Meet in the middle. PROBLEM:Given a sequence $A_0$ of length $N$ indexed 1based, we define an operation $f$ on array as $A_x[i] = A_{x1}[i] \oplus A_{x1}[i+1] \forall i \in [1, N1]$ and $A_{x}[N] = A_{x1}[N]$. We have to answer queries: Given T and P, answering the value of $A_T[p]$. Note that $T \leq 10^9$. SUPER QUICK EXPLANATION
EXPLANATIONFor this problem I'll be discussing two solutions, one slow one to illustrate how this process actually works, followed by another slow solution, which we will, at last, optimize by precomputation. First of all, let's make an observation, Sequence $A_x$ is same as $A_0$ if $x$ is a multiple of power of two $ > N$, that is, period of sequence A is $M = 2^{\lfloor \log_2 N \rfloor +1}$. Proof: Reason is that, after $M$ operations, Each elements effect on elements to their left gets neutralised because of property $A \oplus B \oplus B = A$. Due to this property, the value $A[j]$ is xored in $A_x[i]$ if, after x operations, A[j] is xored with A[i] odd number of times. See example array 1 2 4. 0: 1 2 4 8 16 See, last element never changes, last 2 element changes with period 2, last 3 as well as last 4 elements changes with period 4 while if $N = 5$, sequence repeat after 8 operations which is same as $M = 2^{\lfloor \log_2 N \rfloor +1}$. First slow solution We make a table B[x][i] where B[x][i] represent A[x][i]. We can calculate this table for x upto $M$ where $M = 2^{\lceil \log_2 N \rceil}$. Since after $M$, the sequnce repeat, we can always take $T \bmod M$ and answer the query. It is easy to calculate the whole matrix in $O(N*M)$ time, which will time out. Second slow solution Consider a query $(T, P)$. Lemma: Let us consider all submasks of T when written in binary form, say mask. Now, the Final answer will be Xor sum of A[p+mask] for all submasks of T, for all valid positions ($p+mask \leq N$). Proof: Consider the following table. Here ith column in Tth row being 1 means, that $B[T][pos]$ include xorsum of $A[pos+i]$. (Columns are numbered from 0). 0: 1 The first column has 1 always, Becuase $A[X][P]$ will always include contribution of $A[0][P]$. After 0 operations, $A[0][P]$ is just as given in input. After 1 operation, $A[1][P] = A[0][P] \oplus A[0][P+1]$. After 2 operations, $A[2][P] = A[1][P] \oplus A[1][P+1] = A[0][P] \oplus A[0][P+1] \oplus A[0][P+1] \oplus A[0][P+2] = A[0][P] \oplus A[0][P+2]$ After 3 operations, $A[3][P] = A[2][P] \oplus A[2][P+1] = A[0][P] \oplus A[0][P+2] \oplus A[0][P+1] \oplus A[0][P+3]$ If we simulate this process forward, we can see, that in next step, $A[4][P] = A[0][P] \oplus A[0][P+4]$. This works the same way as submasks of T. At T = 1, Submasks of T are 0 and 1, which means, $A[1][P] = A[0][P] \oplus A[0][P+1]$ At T = 2, Submasks of T are 0 and 2, which means, $A[2][P] = A[0][P] \oplus A[0][P+2]$ At T = 3, Submasks of T are 0, 1, 2 and 3, which means, $A[3][P] = A[0][P] \oplus A[0][P+1] \oplus A[0][P+2] \oplus A[0][P+3]$ We can see, how the contribution of $A[0][P+1]$ disappeared, just the same way, 1 is a submask of 1, while 1 is not a submasks of 2. We can use this fact to make another slow solution, where for a given query $(T, P)$ we check for all positions $X \in [P, N]$ and if $XP$ is a submask of $T$, we take its xorsum. This solution requires no preprocessing, but takes $O(N)$ time per query, making the solution run in $O(Q*N)$ time, once again exceeding the time limit. Faster solution We will combine both solutions. First of all, Observe the following in above table. 0: 1 0 0 0 We can see, that for every 4 values of $T$, the positions included in the range $[P, P+4)$ remain the same because these values have a cycle of 4. Now, If we precompute the table for the first 4 values, we can skip directly over 4 positions. See, Suppose we have $T = 5$ for some position $P$. It will imply $A[5][P] = A[0][P] \oplus A[0][P+1] \oplus A[0][P+4] \oplus A[0][P+5]$. Now, see last two bits of T, when written in binary form. They are 01 which is 1. The thing to understand here is, that $A[1][P] = A[0][P]+A[0][P+1]$ and in same way, $A[1][P+4] = A[0][P+4] \oplus A[0][P+5]$. Hence, $A[5][P]$ can also be written as $A[1][P] \oplus A[1][P+4]$. Hence, $A[5][P] = A[1][P] \oplus A[1][P+4]$. The reason it works is, that if we have calculated the table till values X, $X = 2^B$ for some $B$,We have actually divided the array into blocks of size X. Lower B bits of $T$ decide, which values to be included in the answer from every block while upper bits decide, which blocks to be included. Suppose we have removed the lower $B$ bits of $T$, now, once again, by similar reasoning, we see that values included in xorsum will xor value from blocks which are submasks of $T$. The lower bits automatically choose the right value from blocks. See, Let $T = 14 = 1110_2$. Lower $B$ bits of $T$ is 10 which is 2, and $T$ after removing lower $B$ bits is 11 which is 3. Now, We can rewrite $A[14][P]$ as $A[2][P] \oplus A[2][P+1*(2^B)] \oplus A[2][P+2*(2^B)] \oplus A[2][P+3*(2^B)]$. The reason for multiplying by $2^B$ is that we skip directly to the next block which is at $2^B$ steps from the current block. For every block, $A[lb][P]$ includes xorsum of all values from the block of size $2^B$. Here too, we iterate over submasks of $T$ after removing lower $B$ bits of $T$, and for the submask, we include the required value from the block corresponding to that submask. Hence, we manage to solve the problem in $O(N*X + Q*N/X)$ for $X = 2^B$. By choosing a suitable value of $B$, we can solve this problem fast enough. For quickly iterating over a submask of a value, see the following pseudocode
Refer more about iterating over submasks here. Related problemsA problem on very similar idea can be found here. There was another problem on very same idea in one of the past contests at codechef, i cannot find now. If anyone has link, please post in comments. Another recent problem, which required iterating over submasks of a number along with dynamic programming, can be found here. Time ComplexityTime complexity is $O(N*(2^B))$ for precomputation and $O(Q*N/(2^B))$ for answering queries. Choosing Value of $B = 7$, we can get the solution to run fast enough to fit inside the time limit. Memory Complexity of this solution is $O(N*(2^B))$ for storing the precomputed table. AUTHOR'S AND TESTER'S SOLUTIONS:Setter's solution Feel free to Share your approach, If it differs. Suggestions are always welcomed. :)
This question is marked "community wiki".
asked 04 Nov '18, 14:36

https://cpalgorithms.com/algebra/allsubmasks.html this link should be i guess for submasks iterating @taran_1407
Thanks for pointing out. The link had got typed twice. :)