Let’s decide bits of the answer going from the largest to the smallest one. We will try to check if we can set current bit equal to 1, for this we need to solve the following subproblem: can we divide our array into K subarrays such that AND of all maximums is supermask of some number X (X is equal to the current value of the answer)?

Actually, AND of all maximums is supermask of some number X iff each maximum is supermask of this number. So, our subproblem is equivalent to the following one:

Suppose that we have an array and some numbers are special. Can we divide our array into K subarrays so that maximum on each subarray is special?

We can notice that if answer is true for some value of K then it’s also true for all smaller values of K(since we can just merge adjacent subarrays, maximum in the resulting subarray still will be special).

This leads us to the solution with the dynamic programming: dp[r] will be equal to the maximum value of K, such that we can divide prefix of size r to K subarrays, such that each maximum is special(if we can’t do it for any value of K, then dp[r] = -\infty.

Naive solution will be to iterate over last subarray:

dp[r] = {max(dp[l - 1] + 1)} over all l such that max(a_l \ldots a_r) is special.

To optimize it, let’s maintain stack where we will store current suffix maximums, segments to which they correspond and maximum value of dp[l - 1] over all l that belong to this segment. Also, we need to store maximum over all value of dp[l - 1] such that maximums are special.

When we go from r to r + 1 we need to pop some values from the top(merging information for this segments).

We can notice that to quickly update information about global maximum value of dp[l - 1], such that maximum of a_l \ldots a_r is special, we don’t need to store maximums for segments in set(or something similar) ~— we are always deleting and merging segments from the top, so for each segment we can just store maximum on the prefix (structure is pretty similar to the problem of maintaining maximum and adding/deleting number from the top).

So in the end, we are able to solve subproblem in O(n), which results in O(Nlog(maxA)).

Since it was hard to cut O(Nlog(maxA) from O(Nlog(N)log(maxA), we decided to allow it, so you don’t need last observation(also you can recompute maximum of dp over some subsegment using segment tree).