PROBLEM LINK:
Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4
Author: grayhathacker
Testers: iceknight1093, rivalq
Editorialist: iceknight1093
DIFFICULTY:
2804
PREREQUISITES:
Observation
PROBLEM:
You are given an array A, each of whose elements has frequency \leq 100.
Count the number of subarrays whose mode is unique.
EXPLANATION:
Each element has frequency \leq 100, which means the mode of any subarray also has a frequency of at most 100.
Let’s fix the right endpoint R of our subarray, and the frequency of the mode k. Let’s try to compute the number of L such that subarray [L, R] has a unique mode with frequency k.
The major observation to be made is that the set of valid L form a contiguous sequence, i.e, the set of valid L is \{x, x+1, \ldots, y-1, y\} for some x and y; so let’s instead try to find x and y.
Proof
Suppose we start L at R and keep decreasing it. As we move from subarray [L, R] to [L-1, R], the frequency of the mode can never decrease; and for a fixed frequency, the number of modes also can never decrease.
So, y is the first time the frequency of the mode is k, and x is the last time the frequency is k and there is only a single mode.
Finding x and y in \mathcal{O}(N) each is not too hard (but of course, too slow). When iterating in reverse down from R:
- y is the first index at which some element occurs exactly K times.
-
x-1 is the higher of:
- The first index at which some element occurs exactly K+1 times (increasing the frequency of the mode); or
- The second index at which some element occurs exactly K times (increasing the number of modes)
To improve this, look at what information exactly we need.
We need to know the first time something occurs K times, the first time something occurs K+1 times, and the second time something occurs K times. We need this information for every 1 \leq K \leq 100.
So, let’s try to maintain this information. The array is treated as 1-indexed here.
- Let \text{first}[K] denote the first time \leq R that an element occurs K times.
- Let \text{second}[K] denote the second time \leq R that an element occurs K times.
- For convenience, let’s say \text{first}[K] and \text{second}[K] are 0 if no valid position exists.
When we move from R to R+1, this information changes as follows:
- For elements other than A_{R+1}, there is of course no change in their closest positions; so it suffices to look at the last 100 occurrences of A_{R+1} alone.
- Consider the K-th previous occurrence of A_{R+1}. Suppose this is position u.
- If u \gt \text{first}[K], then set \text{second}[K] \gets \text{first}[K] and \text{first}[K] \gets u
- Otherwise, set \text{second}[K] \gets \max(\text{second}[K], u)
- This allows us to update the \text{first} and \text{second} arrays in 100 operations.
- Now, for a fixed K, we have y = \text{first}[K] and x = 1 + \max(\text{first}[K+1], \text{second}[K]), after which we simply add y-x+1 to the answer.
The above algorithm only needs us to find the previous 100 occurrences of A_{R+1} quickly, which is easy if we create a list of positions corresponding to each value.
This gives us a solution in \mathcal{O}(N\log N + 100\cdot N) or \mathcal{O}(100\cdot N \log N), depending on implementation.
TIME COMPLEXITY:
\mathcal{O}(N\cdot 100) per testcase.
CODE:
Setter's code (C++)
#include<bits/stdc++.h>
using namespace std;
#define mod 1000000007
typedef set<string> ss;
typedef vector<int> vs;
typedef map<int, char> msi;
typedef pair<int, int> pa;
typedef long long int ll;
ll n, i, last_pos[100005], lp, a[100005], j, ans, k=100;
map<ll, ll> pos;
pair<ll, ll> v[100005];
int main()
{
ios_base::sync_with_stdio(false);
cin.tie(0);
int t;
cin >> t;
while (t--)
{
cin >> n;
for (i = 0; i < n; i++)
{
cin >> a[i];
pos[a[i]] = -1;
}
for (i = 2; i <= k + 1; i++)
{
v[i] = { -1, -1};
}
for ( i = 0; i < n; i++)
{
last_pos[i] = pos[a[i]];
pos[a[i]] = i;
}
ans = n;
for (i = 0; i < n; i++)
{
lp = last_pos[i];
for (j = 2; j <= k; j++)
{
if (lp == -1)
break;
if (lp > v[j].second)
{
v[j].first = v[j].second;
v[j].second = lp;
}
else if (lp > v[j].first)
{
v[j].first = lp;
}
lp = last_pos[lp];
}
for (j = 2; j <= k; j++)
{
ans += v[j].second - max(v[j].first, v[j + 1].second);
}
}
cout << ans << "\n";
}
return 0;
}
Editorialist's code (Python)
for _ in range(int(input())):
n = int(input())
a = list(map(int, input().split()))
ans = 0
pos = {}
nearest = [-1]*105
nearest2 = [-1]*105
for i in range(n):
if a[i] not in pos: pos[a[i]] = [i]
else: pos[a[i]].append(i)
cur = pos[a[i]]
for j in range(1, 101):
if len(cur) >= j:
u = cur[-j]
if u >= nearest[j]:
nearest2[j] = nearest[j]
nearest[j] = u
else: nearest2[j] = max(u, nearest2[j])
if nearest[j] != -1:
L = max(nearest2[j], nearest[j+1])+1
ans += nearest[j] - L + 1
print(ans)