UNQMODE - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: grayhathacker
Testers: iceknight1093, rivalq
Editorialist: iceknight1093

DIFFICULTY:

2804

PREREQUISITES:

Observation

PROBLEM:

You are given an array A, each of whose elements has frequency \leq 100.

Count the number of subarrays whose mode is unique.

EXPLANATION:

Each element has frequency \leq 100, which means the mode of any subarray also has a frequency of at most 100.

Let’s fix the right endpoint R of our subarray, and the frequency of the mode k. Let’s try to compute the number of L such that subarray [L, R] has a unique mode with frequency k.

The major observation to be made is that the set of valid L form a contiguous sequence, i.e, the set of valid L is \{x, x+1, \ldots, y-1, y\} for some x and y; so let’s instead try to find x and y.

Proof

Suppose we start L at R and keep decreasing it. As we move from subarray [L, R] to [L-1, R], the frequency of the mode can never decrease; and for a fixed frequency, the number of modes also can never decrease.

So, y is the first time the frequency of the mode is k, and x is the last time the frequency is k and there is only a single mode.

Finding x and y in \mathcal{O}(N) each is not too hard (but of course, too slow). When iterating in reverse down from R:

  • y is the first index at which some element occurs exactly K times.
  • x-1 is the higher of:
    • The first index at which some element occurs exactly K+1 times (increasing the frequency of the mode); or
    • The second index at which some element occurs exactly K times (increasing the number of modes)

To improve this, look at what information exactly we need.
We need to know the first time something occurs K times, the first time something occurs K+1 times, and the second time something occurs K times. We need this information for every 1 \leq K \leq 100.

So, let’s try to maintain this information. The array is treated as 1-indexed here.

  • Let \text{first}[K] denote the first time \leq R that an element occurs K times.
  • Let \text{second}[K] denote the second time \leq R that an element occurs K times.
  • For convenience, let’s say \text{first}[K] and \text{second}[K] are 0 if no valid position exists.

When we move from R to R+1, this information changes as follows:

  • For elements other than A_{R+1}, there is of course no change in their closest positions; so it suffices to look at the last 100 occurrences of A_{R+1} alone.
  • Consider the K-th previous occurrence of A_{R+1}. Suppose this is position u.
    • If u \gt \text{first}[K], then set \text{second}[K] \gets \text{first}[K] and \text{first}[K] \gets u
    • Otherwise, set \text{second}[K] \gets \max(\text{second}[K], u)
  • This allows us to update the \text{first} and \text{second} arrays in 100 operations.
  • Now, for a fixed K, we have y = \text{first}[K] and x = 1 + \max(\text{first}[K+1], \text{second}[K]), after which we simply add y-x+1 to the answer.

The above algorithm only needs us to find the previous 100 occurrences of A_{R+1} quickly, which is easy if we create a list of positions corresponding to each value.

This gives us a solution in \mathcal{O}(N\log N + 100\cdot N) or \mathcal{O}(100\cdot N \log N), depending on implementation.

TIME COMPLEXITY:

\mathcal{O}(N\cdot 100) per testcase.

CODE:

Setter's code (C++)
#include<bits/stdc++.h>
using namespace std;
 
#define mod 1000000007
typedef set<string> ss;
typedef vector<int> vs;
typedef map<int, char> msi;
typedef pair<int, int> pa;
typedef long long int ll;
 
ll n, i, last_pos[100005], lp, a[100005], j, ans, k=100;
map<ll, ll> pos;
pair<ll, ll> v[100005];
int main()
{
	ios_base::sync_with_stdio(false);
	cin.tie(0);
 
	int t;
	cin >> t;
	while (t--)
	{
		cin >> n;
		for (i = 0; i < n; i++)
		{
			cin >> a[i];
			pos[a[i]] = -1;
		}
		for (i = 2; i <= k + 1; i++)
		{
			v[i] = { -1, -1};
		}
		for ( i = 0; i < n; i++)
		{
			last_pos[i] = pos[a[i]];
			pos[a[i]] = i;
		}
		ans = n;
		for (i = 0; i < n; i++)
		{
			lp = last_pos[i];
			for (j = 2; j <= k; j++)
			{
				if (lp == -1)
					break;
				if (lp > v[j].second)
				{
					v[j].first = v[j].second;
					v[j].second = lp;
				}
				else if (lp > v[j].first)
				{
					v[j].first = lp;
				}
				lp = last_pos[lp];
			}
			for (j = 2; j <= k; j++)
			{
				ans += v[j].second - max(v[j].first, v[j + 1].second);
			}
		}
		cout << ans << "\n";
	}
 
	return 0;
}
Editorialist's code (Python)
for _ in range(int(input())):
	n = int(input())
	a = list(map(int, input().split()))

	ans = 0
	pos = {}
	nearest = [-1]*105
	nearest2 = [-1]*105
	for i in range(n):
		if a[i] not in pos: pos[a[i]] = [i]
		else: pos[a[i]].append(i)
		cur = pos[a[i]]
		for j in range(1, 101):
			if len(cur) >= j:
				u = cur[-j]
				if u >= nearest[j]:
					nearest2[j] = nearest[j]
					nearest[j] = u
				else: nearest2[j] = max(u, nearest2[j])
			if nearest[j] != -1:
				L = max(nearest2[j], nearest[j+1])+1
				ans += nearest[j] - L + 1
	print(ans)

Was O(100NlogN) solution meant to pass? If so, the time constraints are extremely strict, because I could not get it to pass even after optimisation. In fact, I still cannot get it to pass.

Edit: Finally got it to pass after removing one instance of set :: find with a visited array. TCs should be have been less strict if this solution was meant to pass.

1 Like

There exists O(N*logN+100*N) solutions which take 0.1s, so the 1s time limit is quite loose.

Yeah but my solution had an extra log(n) factor.

There is nothing wrong with the equations above make sure to carry the x in the algebra equation

The intended solution is \mathcal{O}(N\log N + 100N). I think it should be clear that \mathcal{O}(100N\log N) is already more than 10^8 operations, so you should only be coding it if it has a low constant factor, for example if the extra log factor comes from something like a simple binary search (and I don’t mean set::lower_bound)

map and set operations are rather expensive, so your code is more than 10^8 operations with bad constant factor; I certainly wouldn’t expect that to pass within a second.

Lowering constraints or increasing TL were also unfortunately not options because there’s a bruteforce solution in \mathcal{O}(N^2) which would end up passing instead.

2 Likes