 # CLONEME Editorial (Unofficial)

Prerequisites

Binary Search

Quick Overview

Given an array and two ranges you need to find whether the subarray lying in these two ranges have mismatch of less than 1 or not when elements are compared correspondingly in a sorted order

Solution

For 10 Points

It’s quite easy you can just copy the two subarrays in the new array and sort it in O(N log N) and check the corresponding values for mismatch.if mismatch is more than 1 output NO else YES.

For 30 Points

I used a simple approach in O(n) with some optimization.

What I did was made a frequency array for all values.Since A[i]<=100000 it’s feasible.

When u will create this frequency array we can easily maintain that values inserted in the vector are in sorted order only.

Now you can simply iterate over all values of given array from a to b and check the frequency of that value in two range [a b] and [c d] using Binary search.Two cases arise then:

IF diff in freq==1

we store the index of that value which is more in either range and increment our counter.

IF diff in freq>1

it means that 2 or more elements are different & hence no need for further search we can break the operation and print NO.

Now if till end diff in freq ==0

but if diff in freq ==1

You need to check that if there is some element which lies between these two different elements or correspondingly you can found the rank of those different element in both arrays.

If rank is same:

They will be at same position & hence output YES

else

They will lie at different position & hence output NO

Optimization

1. I did was avoid checking same values by maintaining a bitset. It compensates for the time taken in the binary search to count frequency.

2. I changed a,b,c,d for overlapping ranges to avoid searching for the same element.

Here is the solution to my code implemented that way.

4 Likes

I tried running your solution on extreme case:

t = 1

n = 10^5

q = 10^5

A_{i} = 1 \forall i

Then for each test; a, b, c, d are choosen randomly such that b - a = d - c.

Now your solution takes nearly 36 seconds to run.

Test link: input.txt (nearly 2.5 MB)

UPD1 - Log:

\$ /usr/bin/time -f "Elapsed time: %E seconds\n" ./a.out < in > o
Elapsed time: 0:36.60 seconds

14 Likes

No, it can’t take O(N log N) bcoz if all arrays elements are same then Binary Search range is large which can be at max O(log n) but then the search is only for 1 element.But in worst case, if all array elements are different then binary search is across one element and it takes at most O(N) thus worst case is O(QN/2)=O(QN)
Divide by 2 bcoz of modified search range

A very intuitive and nice solution is as follow:

First let us make a prefix sum , sum of squares, sum of 3/2 powers and denote them by sum,sum2,sum3/2 respectively.
Now when these three are equal for a given l1,r1,l2,r2 i.e (sum[r1]-sum[l1-1]==sum[r2]-sum[l2-1] )&&(sum2[r1]-sum2[l1-1]==sum2[r2]-sum2[l2-1] )&&(sum3/2[r1]-sum3/2[l1-1]==sum3/2[r2]-sum3/2[l2-1] ) then ans is YES.

Else,
Let us assume EXACTLY one mismatch.Now find the values of the mismatched elements(let them be a in [l1,r1] and b in [l2,r2]).These can be calculated using the sum and sum of square equations.Now if our assumption is corrct(of exactly one mismatched element) then a and b must come out to be integers and also they must satisfy the sum3/2 equation.Also they must share atleast one common index in both the ranges(Think why!).This can be checked using any suitable method.If these conditions are ture ans is YES.Else ans is NO.

6 Likes

@anushi @fallandrise this indicates poor test cases. @fallandrise is correct about the complexity. @anushi you are correct Codechef doesn’t have n = 10^5 and q = 10^5, but that doesn’t it shouldn’t.

I believe that swapping ranges for overlapping segments is incorrect. If the array is 1 2 3 and we ask about ranges [1,2] and [2,3] then the answer should be NO but shrinking it to [1,1] and [3,3] yields YES.

@anushi: Why does this range trimming fails? https://www.diffchecker.com/mTNjMBYA
The left diff passes for first subcases, while the right diff fails.

I solved it with sqrt-decomposition in O((n+q) \cdot \sqrt n). The idea is to clone (yeah, why not) the input array B = \sqrt n number of times In particular, the i-th copy contains only numbers from range [i \cdot B, (i+1) \cdot B)

Next, for each such copy of the array, we compute its prefix hashes, where h(i, j) is the hash of the j-th prefix of the i-th array, where integer v occurring c times in such a prefix adds value c \cdot P^v to this hash, and P is a prime greater than 10^5. This allows us to compute hashes of any range of every copy of the array.

Next, for each query, we compare hashes corresponding to ranges in the query one by one and there are few cases to consider. For implementation details refer to my solution here: https://www.codechef.com/viewsolution/14105123

1 Like

This is an inefficient solution for given constraints…

However i need some help on my one of the solutions if anyone could. I did this by a q*(logn)^2 logic but that gave me tle in 3rd subtask, i tried to optimize the solution a lot but it didnt pass

Logic : I did it using persistent segment trees. i stored count of elements as well as hash value of them in the tree. Then using binary search and qry function(which returns cumulative hash uptil required position)i found 1st xth position for a,b c,d such that cumulative hash uptil that xth position doesnt match. and checked the cumulative hash of the values on the right side of this position. If they are equal then yes else no.

Even after loads of submissions i coudnt remove the tle by this logic but i really want to know reason for tle considering my friends q*(logn)^3 logic passed.
If anyone could tell me.

Later on i combined bs with qry and did in q*logn which passed

1 Like

Your solution will run quite fast on a randomised test case . But we can deliberately create test cases where your solution would take forever to run.

One of the possible hacks is

N = 10**5
Q = 10**5
hack = range(1 , N - 3 + 1)
hack += [N]*3
hack = hack = 1
print "1"
print N , Q
print ' '.join(map(str , hack))
for _ in range(Q):
print 1 , N - 3 , 2 , N - 2


It is a test case in which , the first range freq = 3 , freq[2…99997] = 1 and for the second range freq = 2 , freq[2…99997] = 1, freq = 1 . I guess we can hack a lot of solutions using this test case.

Input file

2 Likes

How every one guessed this technique of hashing…??
I mean most of the solutions are based on the technique of raising the sum of elements to some power like 2,3/2,even 4…i mean is there any standard method of doing such question…Any one knows any links to learn about this hashing and all…cz i didnt heard about it until now…

Weak test cases are getting quite common here since last few long contests.

4 Likes

@abx_2109: Could you provide more insight of your solution.How is taking sum of power 3/2 helping us?

Can someone please tell me what is wrong with my solution - https://www.codechef.com/viewsolution/14259973

I created a hash value corresponding to each range and if it was different for both ranges then I calculated values x from first range and y from second range that were mismatched and then checked their position

I just want to know what is wrong in my approach or possibly in my implementation

Hey guys,

I wrote an editorial of this problem on my blog.

Please take a look, and give me a feedback. I really need it.

Thanks everyone and Codechef!

1 Like

Yes. Each query can take O(nlogn) time in worst case. So it is more of QN*logN, bad to see this pass.

3 Likes

Something might be wrong in your calculation bcoz bitset allows to make only one binary search in this case O(10^5) and it then directly goes to print answer.
I think it might be 0.36 sec @lohit_97 Yep, just wondering if there is a case similar to this in official test cases!

1 Like

@anushi Nope. It’s 36 seconds. Observe that in the worst case, you are using O(N*logN) per query. So It’s O(Q*N*logN). The only advantage in your solution is that you are breaking the loop if you find mismatches which are greater than 2. So that gives some optimization. But in the given test, both the sub arrays are same. So no mismatches so your code runs till the length of given subarrays. So O(N) per query.

You can check your code with given test to see your self.

1 Like

See the optimization is done using bitset and in this test case it for loop runs only once & it then prints the answer.
If you have any doubt in how many times the loop has run you can put a counter & check its value!