Query regarding Mo's Algorithm

dishant_18 · December 13, 2017, 1:09am

Hello guys… Recently I was studying Mo’s Algorithm from this blog and trying problems stated at the end. I was trying the problem Powerful Array. I saw that my solution barely passes the Time limit and that too I don’t know how as my same solution got TLE!
I saw that the blog clearly states that Mo’s Algorithm is the best way for this problem. It would be great if someone can point where I am going wrong. Any better approach or articles from which I can learn are also appreciated!! Thanks

Here’s a link to my AC code : LINK
Same code getting TLE : LINK

manaranjanfav · December 14, 2017, 12:10am

i submitted it with block size 500 it passes the tle

sdssudhu · December 14, 2017, 12:45am

The point is the previous code passed in 4990 ms which is very close to time limit and it depends on luck in these cases.

sdssudhu · December 14, 2017, 1:22pm

@dishant_18 I resubmitted your solution with changing the data type of input array as int instead of long long and passed in 4648ms which is an improvement. There are optimisations like these which can use to spped up your code.
I didn’t read through your code fully so there might be algorithmic optimizations in your logic.

My submission

Edit: I slightly modified your update functions(add and remove) and got running time of 3742 ms.

Submission 2

amulyagaur111 · March 30, 2018, 2:15pm

I have used min no. of long long variables and efficient comparator function.

This passes in 2526ms only!

dishant_18 · December 13, 2017, 11:53pm

Can anyone help pls??

dishant_18 · December 14, 2017, 12:44pm

Yea that’s what I mean… Is there a concrete solution with Mo’s Algorithm??

dishant_18 · December 14, 2017, 12:47pm

Thank you for the reply but pls check @sdssudhu’s comment… That’s exactly what I mean!

meooow · December 14, 2017, 3:12pm

Using left shift to multiply by 2 would probably further decrease the time.
Also @dishant_18 you might try varying the block size.

One more optimization: When sorting the queries, sort the right endpoints by ascending or descending depending on whether the left endpoint is in an odd or even block. This saves a lot of travel steps for the right pointer. I cannot say exactly where I found it, but I remember it was some Codeforces comment.

UPDATE: Solution using the above optimizations takes a little over 1 sec: 33239378

dishant_18 · December 14, 2017, 9:54pm

Thanks @meooow and @sdssudhu
Btw @meooow I learnt about this implementation from this blog(Mo's algorithm | HackerEarth).
Is the sorting operation you mentioned always better??
And what exactly do you mean by varying block size… Sorry if its too trivial but I don’t get how much offset shall I give and how does that help?

vijju123 · December 14, 2017, 11:06pm

Varying block sizes is…well XD. Let me illustrate this with an example. This long, that Chef and Xor queries.

Block size - \sqrt{N} = Gave me TLE.

Block size - 500 = TLE
Block size - 1000= TLE
Block size 2000 = TLE
Block size 800= Accepted at ~2.2seconds (had 1 good second left).

So yeah…:/. You cant do that in a CF contest though, since only 1 submission is allowed. You should, instead, focus on optimizations as @meooow mentioned.

dishant_18 · December 15, 2017, 1:42am

@vijju123 can you explain how does the block size effect time limit??
I can’t get it somehow!

meooow · December 15, 2017, 2:09am

From the math you can work out that the block size should be in the order of \sqrt N, but that does not mean exactly \sqrt N is the best. The best block size practically is k \sqrt N for some k, usually not very far from 1.
But as @vijju123 said, tuning the block size is not practical for short contests. You would either have to make multiple submissions or locally generate trial test data.

meooow · December 15, 2017, 2:11am

However the sorting order optimization always works. If you sort query endpoint in ascending order, for the first block the right pointer moves from start to end, then for next block it moves back to start and then back to end, then again back to start and back to end.
But if you alternate the sorting order as ascending/descending, then the pointer goes start to end for one block, and end to start for the next, and again start to end for the next. So nearly half as many steps are required.

PS: I’m using the terms “start” and “end” loosely here, but hopefully you understand what I’m saying.

vijju123 · December 15, 2017, 12:39pm

@dishant_18 - have a look at 2 scenarios.

What if the test cases have all queries such that difference between L and R is huge? Which performs better, block size of 1000 or block size of 100 ? (Hint: Greater block size means I will waste lesser iterations in jumping blocks to reach final R)

What if test cases are like, difference between L and R is not that big? Large block size can be inefficient, as if difference is less than the block size, then you are back to brute force. So a smaller block size helps here.

dishant_18 · December 15, 2017, 5:55pm

Got it… Thanks @vijju123 and @meooow