PROBLEM LINK:
Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4
Author: pols_agyi_pols
Tester: kingmessi
Editorialist: iceknight1093
DIFFICULTY:
Simple
PREREQUISITES:
None
PROBLEM:
You’re given an array A.
Define the score of a subarray to be the maximum frequency of an element within it.
The score of a partition of A into subarrays equals the sum of scores of each subarray in the partition.
Compute the number of distinct possible scores across all partitions of A into subarrays.
EXPLANATION:
Let’s first try to find some bounds on what the score of a partition can possibly be.
The score of a subarray cannot exceed the length of the subarray; because the maximum frequency inside it is bounded above by its length.
So, the score of any partition of A into subarrays, is bounded above by the sum of lengths of the subarrays - which is just N.
Thus, we see that the score of any partition must be \le N.
In particular, note that it’s always possible to achieve a score of exactly N, by simply splitting the array into N subarrays of length 1, i.e. each element is its own subarray.
Now that we have an upper bound, let’s try to find a lower bound.
Since the upper bound was obtained by taking each element as a singleton, we can try the opposite instead: what if the entire array is taken as a single subarray?
If we do this, the score of the partition is simply the maximum frequency of some element in the array.
Let this value be M.
It can be proved, fairly easily, that any partition has a score of at least M.
Proof
Let x be an element with maximum frequency (if there are multiple, just choose any one of them.)
Now, for any partition, observe that the score of any subarray is at least the number of occurrences of x within it.
This in turn means that the total score is at least the total number of occurrences of M in the array, which is just M.Thus, M is a lower bound on the score.
We now know that the minimum possible score is M, and the maximum possible score is N.
The natural thing to do at this point is to check if any score between M and N can be attained - and indeed, it turns out that it can!
Proof
Let’s start with the entire array as a single subarray.
This has a score of M.Now, consider what happens when we separate out the first element into its own subarray, while the remaining N-1 elements form a single subarray.
There are two possibilities:
- The largest frequency in the remaining N-1 elements is still M.
In this case, the new partition has a score of 1+M, since the singleton subarray containing the first element has a score of exactly 1 always.- The largest frequency in the remaining N-1 elements is \lt M.
In this case, since we removed only one element, the largest frequency must be M-1.
So, the score of the partition is 1 + (M-1) = M.Thus, no matter what, the score of the newly created partition is either M or M+1, i.e. it increased by at most one.
Now simply repeat this process by separating out the second element into its own subarray, then the third element, and so on.
Each such change will cause the score to increase by at most 1.
However, we eventually reach a state where all N elements form singleton subarrays, making the score N.Since the score started at M and reached N, while increasing by at most 1 each step, it’s impossible to skip any value in [M, N]. Thus, we would’ve had a configuration with each such score at some point or another, thus proving our claim.
So, the number of distinct possible scores is just the length of the interval [M, N], which equals N-M+1.
TIME COMPLEXITY:
\mathcal{O}(N\log N) per testcase.
CODE:
Editorialist's code (PyPy3)
for _ in range(int(input())):
n = int(input())
a = list(map(int, input().split()))
import collections
freq = collections.Counter(a)
mx = max(freq.values())
print(n-mx+1)