I used binary search and a DSU to create another O(N^2 \log N) solution.
First, get all of the array elements array[]
.
Loop from i=0
to i=N-1
and make a struct with .content = array[i]
and .index = i
. Then, insert this struct into sortedArray[]
using binarySearch()
to figure out how to put it so that sortedArray[]
will be sorted by .content
and by .index
if two elements have the same .content
.
Now, loop from j = i+1
to j = N-1
. Using sortedArray[]
and binarySearch()
, find isRepeat[i][j]
, which is true if and only if array[j] == array[k]
for some k
in [0, i]
.
This is the first part and is done in O(N^2\log N) because the loop and inner loop has O(N^2) and the binary search is O(\log N).
For the second part, we have a global answer
which we will print. Create a new loop from i=0
to i=N-2
. Then, in an inner loop, go from j=0
to j=i
. Now, we have a local tempAnswer
which will be reset only when we’ve finished the inner loop of j
.
Now, in the case that j=0
, we need to initialize the DSU, which we’ll call safeIndicies
. We start with tempAnswer = 0
. We’re also going to build an array called safes[]
which breaks the interval [i+1, length-1]
into blocks of “safety zones” in which there are no repeat elements with the interval [0, i]
. There are safesLength
safety zones. The i
th element of safes[]
tells us the length of the i
th safety zones. Note that safety zones can have length 0
. Now, we have an inner loop from k=i+1
to k=length-1
. If isRepeat[i][k]
, then we know that the array[k]
is a repeat with [0, i]
, so the current safety zone ends here. This means we add an element to safes[]
and add the index of this new safety zone in safes[]
into the DSU safeIndicies
. We also store the index of this new safe in mappingToSafeIndex[k]
so we can find it through k
later on. Otherwise, in the case that isRepeat[i][k]
is false, we know there is no repeat here, so we simply increment the length of the current safety zone. Lastly, whenever we create a new safety zone with length numSafes
, there are numSafes*(numSafes+1)/2
subarrays of that safety zone, all of which are disjoint with [0, i]
. Therefore, we increase tempAnswer
by numSafes*(numSafes+1)/2
when we create a new safety zone.
Notice how this inner loop of k
only runs for j=0
. If it ran for all j=0
to j=i
, then this would be a O(N^3) solution, but we do something different for j=1
to j=i
, so this loop of k
only runs from i=0
to i=N-2
and j=0
, meaning it is O(N^2).
At this point, you’re likely wondering what the DSU is for. The DSU will allow us to join two safety zones together once we’ve taken an element out of [0, i]
and thus some array[k]
are no longer repeats with the intervals, so we don’t need to break those safety zones up anymore.
In the case that j
is greater than 0
, we’re taking array[j-1]
out of the interval so we just have the interval [j, i]
. Now, we run binarySearch()
to find the element in sortedArray[]
that comes after .content = array[j-1]
and .index = j-1
. If this element has a different .content
than array[j-1]
, then we know that there are no elements in the array that is equal to array[j-1]
, so array[j-1]
is not causing any repeats. Therefore, tempAnswer
does not change and we move on. If this element has the same .content
, but .index
less than i
, then we know that there is an element equal to array[j-1]
inside the interval [j, i]
, so any repeats caused by array[j-1]
will still persist by that other element in [j, i]
. Otherwise, we are in the case that when we take out array[j-1]
, tempAnswer
changes because there are repeats with array[j-1]
which won’t apply to
In order to account for the changes of taking out array[j-1]
, we need to join the two safety zones blocked by all of the elements that are equal to array[j-1]
. In order to do this, we loop through all of the elements in sortedArray[]
starting from the one after .content=array[j-1]
and .index=j-1
and then stopping when .content
is no longer array[j-1]
. We will refer the index of this element in sortedArray[]
as k
. Since sortedArray[]
is sorted by .index
and because of the checking we did above, we can be sure that all of these elements have .index
greater than i
and thus are blocking a safety zone. We find this safety zone with mappingToSafeIndex[sortedArray[k].index]
. Now, we need to join the safety zones mappingToSafeIndex[sortedArray[k].index]
, which we will say has length oldSafe
, and mappingToSafeIndex[sortedArray[k].index]+1
, which we will say has length newSafe
. However, these safety zones may have been joined with other safety zones before, so we use the find()
function of the DSU safeIndicies
in order to find the parent safety zones of both of the concerning safety zones. Then, for both safety zones, we do tempAnswer -= numSafes*(numSafes+1)/2
because we are getting rid of the safety zones. After that, we join them with dsuUnion()
and then we say that this new safety zone has length oldSafe+newSafe+1
(the +1
is there to account for sortedArray[k]
itself, as this blocker was previously not counted in either safety zone). Then, we update the corresponding element in safes[]
with this new length and do tempAnswer += (oldSafe+newSafe+1)*(oldSafe+newSafe+2)/2
to account for all of the possible intervals of the new safety zone.
Notice how this different inner loop of k
only runs at most N-1
times throughout the whole j=1
to j=i
loop. This is because in one j=1
to j=i
loop, we can only join the two same safety zones once and there are at most N
safety zones, so this loop can only run N-1
times since that’s the number of pairs of consecutive safety zones. Thus, because this loop only runs O(N) times for j=1
to j=i
, for all i=0
to i=N-2
, it is only O(N^2) instead of O(N^3).
In both cases here, tempAnswer
represents the number of intervals. Finally, when we’re all done, we print answer
.
The second part here also takes O(N^2\log N) because the loop and inner loop has O(N^2) and the binary search takes O(\log N). Any DSU operations can effectively be ignored because with the DSU I used, they take O(\alpha(N)) which is very, very slow.
Thus, overall, this algorithm is O(N^2\log N) time using only binary search and a DSU, without a segment tree.