PROBLEM LINK:
Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4
Author: ddpigeon, kiwi26
Tester: sushil2006
Editorialist: iceknight1093
DIFFICULTY:
Easy
PREREQUISITES:
None
PROBLEM:
There’s an array A of length N, which you can rearrange however you like.
After rearrangement, you can perform the following operation:
- Select a subarray of A that appears at least twice and contains only positive elements.
Then, replace some (at least two) disjoint occurrences of this subarray in A with -1.
Find the minimum possible final size of A.
EXPLANATION:
Since we can freely rearrange elements beforehand, the only thing that matters is the frequency of each element.
Let \text{freq}[x] denote the number of occurrences of x we have.
Note that if \text{freq}[x] = 1, then x can never be “compressed” - after all, it’s impossible for there to be a subarray containing x that appears multiple times (this would imply multiple occurrences of x).
So, let’s ignore all elements with a frequency of 1, and compress only the remaining elements (this is doable by, say, placing all frequency 1 elements at the back of the array).
If there are K such elements, we just add K to the answer after compressing everything else.
It turns out that the compression scheme with us is quite efficient: given that we’re dealing with elements having a frequency of at least 2, it’s always possible to compress the array down to a size of at most 5.
Proof
The construction is fairly simple.
First, look at only the elements with odd frequency - there are \geq 3 of each of them, so take three occurrences of each and form three equal subarrays out of these occurrences.
These will be compressed into three elements.Now, all elements have a frequency that’s either 1 or even; and all the even numbers can be paired up and placed into two subarrays - which lets us compress them into two elements.
This gives us five elements plus everything with frequency 1, as claimed.
We now need to check if doing better than 5 is possible, meaning anything from 1 to 4.
Since the answer is \leq 5, there will also be at most 5 occurrences of #
in the final string. Let’s try each possibility:
- One occurrence of -1.
This is not possible, each time a compression operation is performed we create at least two of these. - Two occurrences of -1.
This means some subarray must’ve appeared exactly twice.
If x has even frequency, all occurrences of x can be placed into these two subarrays and compressed; while if x has odd frequency, one occurrence will be remaining.
So, the final size will be 2 plus the number of elements with odd frequencies. - Three occurrences of -1.
The only way to attain this is by compressing a single subarray that appears thrice.
Just as in the previous case, elements can be taken in triples till there are less than three left, at which point these can’t be compressed anymore and must be left as-is.
So, the final size will be three, plus the sum of frequencies modulo 3. - Four occurrences of -1.
To attain this, we either compress one subarray that occurs four times, or two different subarrays that occur twice each.
In either case, it’s not hard to see that a bit of rearrangement lets us make such a situation into a single larger subarray that appears twice (encompassing the same set of elements), and compressing that to obtain two#
's is strictly better.
So, this case can be ignored entirely.
In the end, we have only four possible candidates for the answer:
- 5+K, where K is the number of elements with frequency 1.
- 2+\sum_{x} (\text{freq}[x] \bmod 2)
- 3+\sum_{x} (\text{freq}[x] \bmod 3)
- N itself, by not doing any operation.
The final answer is just the minimum of these four.
Note that the second and third expressions automatically take into account the frequency 1 elements.
TIME COMPLEXITY:
\mathcal{O}(N) per testcase.
CODE:
Author's code (C++)
#include <bits/stdc++.h>
#define ll long long
#define ull unsigned long long
#define ld long double
#define fast_io \
ios_base::sync_with_stdio(false); \
cin.tie(0); \
cout.tie(0);
#define mod 1000000007
#define inf 1e18
#define F first
#define S second
#define pb push_back
#define mp make_pair
#define vi vector<int>
#define vl vector<ll>
#define vii vector<pair<int, int>>
#define vll vector<pair<ll, ll>>
#define all(a) a.begin(), a.end()
#define nl '\n'
using namespace std;
//----------------------DEBUGGER----------------------------------------------------------
#define dbg(x) cout << #x <<" "; _print(x); cout << endl;
void _print(ll t) {cout << t;}
void _print(int t) {cout << t;}
void _print(string t) {cout << t;}
void _print(char t) {cout << t;}
void _print(ld t) {cout << t;}
void _print(double t) {cout << t;}
void _print(ull t) {cout << t;}
template <class T> void _print(vector<vector<T>> v) {for(int j =0;j<v.size();j++){cout <<nl<<"[ "; for (T i : v[j]){_print(i); cout << " ";} cout << "]";};}
template <class T, class V> void _print(pair <T, V> p);
template <class T> void _print(vector <T> v);
template <class T> void _print(set <T> v);
template <class T, class V> void _print(map <T, V> v);
template <class T> void _print(multiset <T> v);
template <class T, class V> void _print(pair <T, V> p) {cout << "{"; _print(p.F); cout << ","; _print(p.S); cout << "}";}
template <class T> void _print(vector <T> v) {cout << "[ "; for (T i : v) {_print(i); cout << " ";} cout << "]";}
template <class T> void _print(set <T> v) {cout << "[ "; for (T i : v) {_print(i); cout << " ";} cout << "]";}
template <class T> void _print(multiset <T> v) {cout << "[ "; for (T i : v) {_print(i); cout << " ";} cout << "]";}
template <class T, class V> void _print(map <T, V> v) {cout << "[ "; for (auto i : v) {_print(i); cout << " ";} cout << "]";}
template <class T>void _print(vector<vector<vector<T>>> v){for(int k =0;k<v.size();k++){_print(v[k]);}}
//----------------------------------------------------------------------------------------
bool compare(pair<ll, ll> p1, pair<ll, ll> p2)
{
return p1.second < p2.second;
}
/*
*/
void solve()
{
ll n;
cin>>n;
map<ll,ll> cnt;
for(int i =0;i<n;i++)
{
ll x;
cin>>x;
cnt[x]++;
}
// no threes only twos
ll ans1 = 0;
{
bool ok = false;
for(auto [it,freq] : cnt)
{
if(freq >= 2)
ok = true;
if(freq%2 == 1)
ans1++;
}
if(ok)
ans1 += 2;
}
// only threes no twos
ll ans2 = 0;
{
bool ok = false;
for(auto [it,freq] : cnt)
{
if(freq >= 3)
ok = true;
if(freq%3 != 0)
ans2+=freq%3;
}
if(ok)
ans2 += 3;
}
// both twos and threes
ll ans3 =0;
{
for(auto [it,freq] : cnt)
{
if(freq == 1)
{
ans3++;
}
}
ans3 += 5;
}
cout << min({ans1,ans2,ans3}) << nl;
}
int main()
{
fast_io;
cout << fixed;
cout << setprecision(10);
ll t;
cin>>t;
while(t--)
{
solve();
}
}
Editorialist's code (PyPy3)
for _ in range(int(input())):
n = int(input())
a = list(map(int, input().split()))
import collections
freq = collections.Counter(a)
ans = 5 + list(freq.values()).count(1)
x = 2
for y in freq.values(): x += y%2
ans = min(ans, x)
x = 3
for y in freq.values(): x += y%3
ans = min(ans, x)
ans = min(ans, n)
print(ans)