That was the problem that I was facing. This is what I did :

Instead of choosing the block size ‘b’ as sqrt(q), I tried to find a value that will minimise the complexity (this is what I do usually for choosing the best block size for problems, if the TL is strict). First, we will write down the expression for the time complexity in terms of the block size ‘b’.

Lets see what are all the operations that we will be doing. We have to use SOS DP for each of the blocks. The complexity for this part is (17 * q * n)/b. Then, we need to go through all the blocks for each of the elements, and see in which block it gets killed. The complexity of this part is n * (n/b). And, if a monster gets killed in some block, you need to go through each query in the block to find exactly where it gets killed. The complexity for this part is n * b. Overall the expression for the time complexity is :

(17 * q * n)/b + n * (n/b) + n * b. If you differentiate this expression with respect to ‘b’, and equate it to 0, you will find that b = sqrt(17 * q + n). With this as the block size, the number of operations will be around 2^28.

One thing to keep in mind is that you can never be sure of a solution getting TLE based on the number of operations that you have calculated. It also depends on the type of the operations. In case of this problem, I think it was fast because there were bitwise operations involved, they are fast. I wasn’t sure that my solution was going to get accepted, but luckily it did.

Another thing is that sometimes, the worst case that you calculate might never be encountered (reasons might be you calculating the complexity incorrectly or the test cases being weak). This happened to me when I was trying to solve this problem:https://www.codechef.com/JAN17/problems/DIGITSEP/, The complexity I calculated should have given me a TLE for sure, but I was shocked to find that it was the second fastest submission for the problem. So, when in doubt, just submit and see.