PROBLEM LINK:
DIFFICULTY:
Simple
PREREQUISITES:
Ad-hoc, Two-pointers
PROBLEM:
Given an array with values from 1 to K, find the maximum length of a subarray which does not contain all K values.
QUICK EXPLANATION 1
Iterate over the right index of the subarray. If we maintain the last index each value occurred while iterating, we can calculate the answer for each right index efficiently.
QUICK EXPLANATION 2
We can use two-pointers, where for each l, we maintain the maximum r of a subarray which satisfies the constraints, along with the frequencies of each value in the subarray and the number of values in the subarray.
EXPLANATION 1:
Let’s simplify the problem slightly: We want to find the maximum length of a subarray, with the additional constraint that the subarray has to end at index r.
What is the condition that a subarray [l, r] contains the element x? Let last_x be the index of the rightmost occurrence of x less than or equal to r. [l, r] will contain x if and only if l \le last_x.
So, in order to make sure [l, r] contains less than K values, we just needs to make sure that l > last_y for some y (then element y won’t be in the subarray). This condition can be rewritten as l > \min(last_i).
This means that we can set l to \min(last_i)+1. The maximum length of a subarray ending at r is then r-(\min(last_i)+1)+1=r-\min(last_i).
Now, in order to find the answer for the entire array, we just need to iterate through all r and find the maximum answer from all r. In order to make the algorithm efficient, we need to do the following two things efficiently:
- Updating last as we iterate over r.
- Finding \min(last_i).
Updating last is easy: at the start of each iteration, we set last_{A_r} to r, and the other values of last don’t change.
In order to find \min(last_i), we should store all last_i in a data structure which supports adding/removing elements and finding the minimum element (like set in C++).
This works in O(n \log n) or O(n) depending on implementation.
EXPLANATION 2:
Notice that if subarray [l, r] satisfies does not have all K values, then subarray [l+1, r] also does not have all K values. Thus, if we iterate over l from left to right, r will only increase. This is a sign that we can use two-pointers.
A general template for two-pointers looks like:
for(int l=0, r=0; l<n; ++l) {
while(we can add element r)
add element r;
ans=max(ans, r-l);
remove element l;
}
Our trouble lies in checking if we can add element r. We can do it naively in O(n) time (by checking the values in [l, r] directly), but we want something which will work in constant time.
In order to do that, we should maintain some information about our current subarray [l, r-1]. Specifically, we should maintain:
- c_i, which is the count of value i in the subarray.
- c2, which is the number of distinct values which appear in the subarray.
We can maintain c_i pretty easily when we add/remove an element, as shown below:
for(int l=0, r=0; l<n; ++l) {
while(we can add element r) {
//add element r
++c[r];
}
ans=max(ans, r-l);
//remove element l
--c[l];
}
For c2, there are different cases for adding or removing an element.
When we add an element, we check if that element has appeared before in the subarray, and if it hasn’t, we increment c2.
When we remove an element, we check if that element will disappear in the subarray, and if it will, we decrement c2.
The updated code is shown below:
for(int l=0, r=0; l<n; ++l) {
while(we can add element r) {
//add element r
if(c[r]==0)
++c2;
++c[r];
}
ans=max(ans, r-l);
//remove element l
--c[l];
if(c[l]==0)
--c2;
}
Finally, let’s see how we can check if we are able to add element r. As long as c2 < k after adding, element r can be added. The exact condition is included below:
for(int l=0, r=0; l<n; ++l) {
while(r+1<n&&(c[r]>0||c2<k-1)) {
//add element r
if(c[r]==0)
++c2;
++c[r];
}
ans=max(ans, r-l);
//remove element l
--c[l];
if(c[l]==0)
--c2;
}
This solution works in O(n).
SOLUTIONS:
Setter's Solution
#include <iostream>
#include <algorithm>
#include <string>
#include <assert.h>
using namespace std;
long long readInt(long long l,long long r,char endd){
long long x=0;
int cnt=0;
int fi=-1;
bool is_neg=false;
while(true){
char g=getchar();
if(g=='-'){
assert(fi==-1);
is_neg=true;
continue;
}
if('0'<=g && g<='9'){
x*=10;
x+=g-'0';
if(cnt==0){
fi=g-'0';
}
cnt++;
assert(fi!=0 || cnt==1);
assert(fi!=0 || is_neg==false);
assert(!(cnt>19 || ( cnt==19 && fi>1) ));
} else if(g==endd){
assert(cnt>0);
if(is_neg){
x= -x;
}
assert(l<=x && x<=r);
return x;
} else {
assert(false);
}
}
}
string readString(int l,int r,char endd){
string ret="";
int cnt=0;
while(true){
char g=getchar();
assert(g!=-1);
if(g==endd){
break;
}
cnt++;
ret+=g;
}
assert(l<=cnt && cnt<=r);
return ret;
}
long long readIntSp(long long l,long long r){
return readInt(l,r,' ');
}
long long readIntLn(long long l,long long r){
return readInt(l,r,'\n');
}
string readStringLn(int l,int r){
return readString(l,r,'\n');
}
string readStringSp(int l,int r){
return readString(l,r,' ');
}
int T;
int n,k;
int arr[100100];
int sm_n=0;
int col[100100];
int main(){
//freopen("01.txt","r",stdin);
//freopen("01o.txt","w",stdout);
T=readIntLn(1,1000);
while(T--){
n=readIntSp(1,100000);
k=readIntLn(2,100000);
sm_n+= n;
assert(sm_n<=1000000);
for(int i=0;i<n;i++){
if(i==n-1){
arr[i]=readIntLn(1,k);
} else{
arr[i]=readIntSp(1,k);
}
}
for(int i=1;i<=k;i++){
col[i] = -1;
}
int mx=1;
for(int i=0;i<n;i++){
mx = max(mx,i-col[arr[i]] - 1);
col[arr[i]]=i;
}
for(int i=1;i<=k;i++){
mx=max(mx,n-col[i]-1);
}
cout<<mx<<endl;
}
assert(getchar()==-1);
}
Tester's Solution
#include <bits/stdc++.h>
#include <vector>
#include <set>
#include <map>
#include <string>
#include <cstdio>
#include <cstdlib>
#include <climits>
#include <utility>
#include <algorithm>
#include <cmath>
#include <queue>
#include <stack>
#include <iomanip>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
//setbase - cout << setbase (16); cout << 100 << endl; Prints 64
//setfill - cout << setfill ('x') << setw (5); cout << 77 << endl; prints xxx77
//setprecision - cout << setprecision (14) << f << endl; Prints x.xxxx
//cout.precision(x) cout<<fixed<<val; // prints x digits after decimal in val
using namespace std;
using namespace __gnu_pbds;
#define f(i,a,b) for(i=a;i<b;i++)
#define rep(i,n) f(i,0,n)
#define fd(i,a,b) for(i=a;i>=b;i--)
#define pb push_back
#define mp make_pair
#define vi vector< int >
#define vl vector< ll >
#define ss second
#define ff first
#define ll long long
#define pii pair< int,int >
#define pll pair< ll,ll >
#define sz(a) a.size()
#define inf (1000*1000*1000+5)
#define all(a) a.begin(),a.end()
#define tri pair<int,pii>
#define vii vector<pii>
#define vll vector<pll>
#define viii vector<tri>
#define mod (1000*1000*1000+7)
#define pqueue priority_queue< int >
#define pdqueue priority_queue< int,vi ,greater< int > >
#define flush fflush(stdout)
#define primeDEN 727999983
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
// find_by_order() // order_of_key
typedef tree<
int,
null_type,
less<int>,
rb_tree_tag,
tree_order_statistics_node_update>
ordered_set;
int a[123456];
int main(){
std::ios::sync_with_stdio(false); cin.tie(NULL);
int t;
cin>>t;
while(t--){
int n,k;
int i;
cin>>n>>k;
rep(i,n){
cin>>a[i];
}
i=0;
int j=0,maxi=0;
int cnt=0;
map<int,int> mapi;
while(j<n){
if(mapi[a[j]]==0 && cnt==k-1){
mapi[a[i]]--;
if(mapi[a[i]]==0)
cnt--;
i++;
continue;
}
mapi[a[j]]++;
if(mapi[a[j]]==1)
cnt++;
j++;
maxi=max(j-i,maxi);
}
cout<<maxi<<endl;
}
return 0;
}
Editorialist's Solution 1
#include <bits/stdc++.h>
using namespace std;
int n, k, a[100000];
void solve() {
//input
cin >> n >> k;
map<int, int> last;
for(int i=0; i<n; ++i) {
cin >> a[i];
//at first all elements have not appeared
last[a[i]]=-1;
}
//check if all k values appear
if(last.size()<k) {
cout << n << "\n";
return;
}
//store all last occurrences in set
set<array<int, 2>> s;
for(auto p : last)
s.insert({p.second, p.first});
//fix right end of subarray
int ans=0;
for(int i=0; i<n; ++i) {
//update last occurrence for a[i]
s.erase({last[a[i]], a[i]});
last[a[i]]=i;
s.insert({last[a[i]], a[i]});
//max subarray
ans=max(i-(*s.begin())[0], ans);
}
cout << ans << "\n";
}
int main() {
ios::sync_with_stdio(0);
cin.tie(0);
int t;
cin >> t;
while(t--)
solve();
}
Editorialst's Solution 2
#include <bits/stdc++.h>
using namespace std;
int n, k, a[100000], c[100000];
void solve() {
//input
cin >> n >> k;
for(int i=0; i<n; ++i)
cin >> a[i], --a[i];
//make sure k is not too large
if(k>n) {
cout << n << "\n";
return;
}
int ans=0;
//count of values in subarray
int c2=0;
for(int l=0, r=0; l<n; ++l) {
//check if we can extend r
//make sure that c2 < k after extending
while(r<n&&(c[a[r]]||c2<k-1)) {
if(!c[a[r]])
++c2;
++c[a[r]];
++r;
}
//update ans
ans=max(r-l, ans);
//element l is removed
--c[a[l]];
if(!c[a[l]])
--c2;
}
cout << ans << "\n";
}
int main() {
ios::sync_with_stdio(0);
cin.tie(0);
int t;
cin >> t;
while(t--)
solve();
}
HIDDEN TRAPS:
- It is not guaranteed that sum of K over all test cases is small, so your code could TLE (depending on implementation) if it works in O(N+K) per test case.
Please give me suggestions if anything is unclear so that I can improve. Thanks