Problem LinkSetter: Trung Nguyen Tester: Hasan Jaddouh Editorialist: Bhuvnesh Jain DifficultyMEDIUM  HARD PrerequisitesSqrt Decomposition, Hashing, Online Queries ProblemYou are given an array of size $N$ and you want to process $Q$ online queries on it. For each query, you need to find whether the multiset formed from 2 subarrays in each query are isomorphic or not. ExplanationFor the editorial, assume that $f(Z)$ corresponds to the mapping of $Z$ from the first subarray to the second subarray as defined in the problem statement. Let us first understand when would the multisets formed from the 2 subarrays be isomorphic. It is clear that the number of occurrences of $Z$ in first subarray and the number of occurrences of $f(Z)$ should be same. For exmaple  If the 2 multisets are {$1, 2, 3, 3$} and {$4, 2, 5, 5$}. Then we have a mapping of $f(1) = 4, f(2) = 2, f(3) = 5$. i.e. the multisets are isomorphic. But none of the above subarrays are isomorphic to {$1, 3, 5, 6$}. This is the necessary and sufficient condition for checking the isomorphism. Thus, if we are able to build the frequency array of the numbers present in the subarray and compare the 2 frequency lists efficiently, we can solve the problem. The first thing to notice is that the size of frequency list of the subarrays can be the order of length of the subarray (as the numbers can be unique in the subarray). Thus, even building the frequency list even in order length of subarray would be costly per query and timeout. This is where hashing comes into play. Before proceeding forward, I recommend you to read this blog by "rng_58" and understand it thoroughly. The blog clearly mentions a way to check with high probability whether 2 multisets (here we are considering the multisets of the frequencies) are isomorphic or not. So, assume the frequency multiset of the subarray looks like {$f_1, f_2, .. f_x$}. We need to be efficiently calculate the function $\prod_{i=1}^{i=x}(f_i + R)$, where $R$ is a randomly chosen number in the range $[0, Mod)$. We also need to calculate the above function online and then just check whether the 2 hashed values of the 2 subarrays are same or not. To efficiently calculate the above function, we will rely on the technique of sqrt decomposition. The basic idea of sqrt decomposition of arrays is to split the array into sqrt blocks, calculate the function of each sqrt block efficiently and iterate over the edges points (if they exist) in each query. But here the function we want to evaluate deals on the frequency of numbers which is difficult to update with each passing block as in normal sqrt decomposition. The algorithm for the sqrt decomposition part will be as follows:
Some caveats to the above solution:
Last and most important part of this problem: Choosing the value of K. See as in normal questions, generally $\text{N ~ Q}$. But here the constraints are bit different. Also, another step computing the pairwise function for contiguous blocks is also done. So, the value of $K$ must be chosen carefully. We can see that following are the complexities: PreComputation = $2 * N * \frac{N}{K} + {(\frac{N}{K})}^{2} \approx 2 * N * \frac{N}{K}$ Query part = $2 * Q * K$ Thus, we should chose $K$, such that both the above parts are equal i.e $K = \frac{N}{sqrt(Q)}$. Not choosing $K$ carefully can lead many solutions to TLE. Another thing to note that in hashing problems, sometimes the number of mod operations can also increase the constant factor in your code, sometimes leading to TLE. This is also the case here. So, it is important that you try to keep the number of Mod operations $(\%)$ to $O(N + Q)$. For more details, refer to the author's or tester's approach. The author has the same approach but used a different hashing mechanism. If you had some other approach, feel free to share the same. Time Complexity$O((N + Q) * Block)$, per test case, where $Block$ = size of block chosen. Space Complexity$O(N * Block)$ Solution Links
This question is marked "community wiki".
asked 22 Dec '17, 01:24

"Let P(x_1, ..., x_k) be a nonzero (multivariable) polynomial on a finite field F_MOD. If the variables are chosen independently and uniformly at random, the probability that P(x_1, ..., x_k) = 0 is at most D/MOD, where D is the total degree of the polynomial." Assumption, two multisets are a_{1}, a_{2},..., a_{k} and b_{1}, b_{2},..., b_{d}. Let P(x(a_{1}), x(a_{2}), ..., x(a_{k}), x(b_{1}), x(b_{2}),..., x(b_{d})) = x(a_{1}) + x(a_{2}) + ... + x(a_{k})  x(b_{1})  x(b_{2})  ...  x(b_{d}). Also, I see mine is not much different from normal string hashing :) answered 26 Dec '17, 02:21

Thanks for an editorial! I got a question which isn't exactly related to the idea of solution, but to a difference in hashing used by setter and editorialist. Idea by setter/tester sounds to me like "every number has random key given, two multisets are most likely equal if sums of keys for these multisets are equal". I'm actually more used to this kind of hashing for such problems; and it also doesn't require you to precompute modulo inverses :) Now the question is  since this hashing is quite different from the one described in blog by Makoto, can somebody share some proofs/statements about how good it is? I'm used to thinking about it as "value of the sum is pretty much random, so let's say collision chance is 2^(64)", but it is far from being strict proof. answered 26 Dec '17, 01:41
