BSTR - Editorial

iceknight1093 · November 26, 2025, 4:45pm

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: satyam_343
Tester: kingmessi
Editorialist: iceknight1093

DIFFICULTY:

Medium

PREREQUISITES:

Segment trees/fenwick trees

PROBLEM:

You’re given an array A of length 2N containing each element from 1 to N twice each.
A subsequence (x_1, x_2, \ldots, x_M) is valid if for each 1 \le i \le M-N+1, the values
\{A_{x_i}, A_{x_{i+1}}, \ldots, A_{x_{i+N-1}}\} are pairwise distinct.

Find the maximum possible length of a valid subsequence of A.

EXPLANATION:

First, it’s obvious that we can obtain a valid subsequence of length N, by just taking one copy of each element from 1 to N.
So, we only need to concern ourselves with valid subsequences of length \gt N.
In particular, such subsequences must have repeated elements (by the pigeonhole principle).

Now, consider some valid subsequence (x_1, x_2, \ldots, x_M).
Since it must satisfy the pairwise distinct condition for every sliding window of size N, we obtain the following:

A_{x_1} = A_{x_{N+1}}
A_{x_2} = A_{x_{N+2}}
A_{x_3} = A_{x_{N+3}}
\vdots

In general, A_{x_i} = A_{x_{i+N}} must hold, i.e. the elements of the subsequence must be periodic with a period of N.

Consider some element x.
Let its two occurrences be at L_x and R_x (where L_x \lt R_x).

Observe that if there exists some y such that y doesn’t appear between the occurrences of x, then it’s impossible for there to exist a valid subsequence containing both occurrences of x.
This is because for any subsequence containing both L_x and R_x, the length-N segment starting at L_x either cannot include y at all, or cannot include it without including the second copy of x, breaking distinctness.

In simpler words: it’s possible to choose both occurrences of x only if every other element has at least one appearance between the occurrences of x.

Let’s call values of x such that all other y appear at least once between L_x and R_x, good elements.
Observe that our goal is now to choose a subsequence that includes both occurrences of as many good elements as possible.

Note that finding all good elements is a classical problem of checking segment intersections, and can be done in \mathcal{O}(N\log N) using for example a segment tree/fenwick tree.

Let the good elements be g_1, g_2, \ldots, g_k.
Without loss of generality, suppose these are sorted in increasing order of L_{g_i}.
Then, observe that the following inequality must hold:

L_{g_1} \lt L_{g_2} \lt\ldots\lt L_{g_k} \lt R_{g_1} \lt R_{g_2} \lt\ldots \lt R_{g_k}

That is, the right endpoints will also be sorted in the same order; and every left endpoint will come before every right endpoint.

Now, it’s obvious that we can always choose both occurrences of one good element (if there are any good elements at all, of course); so the interesting case is when we want to pick more than one.

Suppose we want to pick both occurrences of g_{i_1}, g_{i_2}, \ldots, g_{i_r} where i_1 \lt i_2 \lt\ldots \lt i_r
Observe that only i_1 and i_r really matter here.
This is because, for every value other than these chosen r, it must have an occurrence in the range [L_{i_r}, R_{i_1}] - otherwise, a valid subsequence containing all these cannot exist.

In particular, observe that this is automatically satisfied for indices \lt i_1 (their right endpoints belong to the range), as well as indices \gt i_r (their left endpoints).
Indices in [i_1, i_r] other than the endpoints don’t really affect things; so we can just take all of them since we’re trying to maximize the count anyway.
Thus, we only need to care about consecutive segments of good indices.

Now, let’s fix the leftmost chosen good index i_1.
It’s not hard to see that the set of ‘valid’ rightmost good indices will itself form a contiguous range; so we only care about the largest among them.
This can be found using binary search along with a range query data structure to perform the check, giving \mathcal{O}(\log^2 N) time per left index.
For more detail on the implementation, see the author’s code below.

TIME COMPLEXITY:

\mathcal{O}(N\log^2 N) per testcase.

CODE:

Author's code (C++)

#pragma GCC optimize("Ofast")

#include <iostream>
#include <vector>
#include <array>
#include <string>
#include <algorithm>
#include <numeric>
#include <functional>
#include <cmath>
#include <limits>
#include <queue>
#include <stack>
#include <deque>
#include <set>
#include <map>
#include <unordered_set>
#include <unordered_map>
#include <iomanip>
#include <cassert>
#include <random>
#include <array>
#include <chrono>
using namespace std;

#define ll long long
#define nline "\n"
#define f first
#define s second
#define sz(x) (ll)x.size()
#define vl vector<ll>
#define all(x) x.begin(),x.end()
#define rall(x) x.rbegin(),x.rend() 
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
//--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------     
const ll MOD=998244353;
const ll MAX=100100; 
struct info{
  ll x=-1,y=-1,bad=0,freq=0;
};
 
class ST {
public:
    vector<info> segs;
    ll size = 0;
    info ID;
 
    ST (ll sz) {
        segs.assign(2 * sz, ID);
        size = sz;  
    }   
   
    info comb(info l,info r) {
        if(l.x==-1){
            return r;
        }
        if(r.x==-1){
            return l;
        }
        info cur;
        cur.x=l.x;
        cur.y=r.y;
        cur.bad=(l.bad or r.bad or (l.y > r.x));
        cur.freq=l.freq+r.freq;
        return cur;
    }
 
    void set_upd(ll idx, ll val) {
        info cur;
        cur.x=val;
        cur.y=val;
        cur.bad=0;
        cur.freq=(val!=-1);
        segs[idx += size] = cur;
        for(idx /= 2; idx; idx /= 2) segs[idx] = comb(segs[2 * idx], segs[2 * idx + 1]);
    }
 
    info query(ll l, ll r) {
        info lans = ID, rans = ID;
        for(l += size, r += size + 1; l < r; l /= 2, r /= 2) {
            if(l & 1) lans = comb(lans, segs[l++]);
            if(r & 1) rans = comb(segs[--r], rans);
        }
        return comb(lans, rans);
    }
}; 
struct FenwickTree{
    vector<ll> bit; 
    ll n;
    FenwickTree(ll n){
        this->n = n;
        bit.assign(n, 0);
    }
    FenwickTree(vector<ll> a):FenwickTree(a.size()){  
        ll x=a.size();
        for(size_t i=0;i<x;i++)
            add(i,a[i]);
    }
    ll sum(ll r) {
        ll ret=0;
        for(;r>=0;r=(r&(r+1))-1){
            ret+=bit[r];
        }
        return ret;
    }
    ll sum(ll l,ll r) {
        if(l>r)
            return 0;
        return sum(r)-sum(l-1);
    }
    void add(ll idx,ll delta) {
        for(;idx<n;idx=idx|(idx+1)){
            bit[idx]+=delta;
        }
    }
};
void solve(){
    ll n; cin>>n;
    ll len=2*n;
    vector<ll> a(len+5),l(n+5,-1),r(n+5,-1);
    for(ll i=1;i<=len;i++){
        cin>>a[i];
        if(l[a[i]]==-1){
            l[a[i]]=i;
        }
        else{
            r[a[i]]=i;
        }
    }
    vector<ll> b(len+5,-1);
    ST track(len+5);
    ll two=n,score=0;
    set<ll> consider;
    FenwickTree left_occ(len+5),right_occ(len+5),bruh(len+5);
    for(ll i=1;i<=n;i++){
        left_occ.add(l[i],1);
        right_occ.add(r[i],1);
        bruh.add(r[i],1);
    }
    ll lft=1,rght=2*n;
    for(ll i=len;i>=1;i--){
        ll val=a[i];
        if(r[val]==i){
            two--;
            track.set_upd(l[val],i);
            consider.insert(l[val]);
            bruh.add(i,-1);
            bruh.add(l[val],1);
            ll low=l[val],high=i-1,till=0;
            while(low<=high){
                ll mid=(low+high)/2;
                ll use=*(--consider.upper_bound(mid));
                auto now=track.query(1,use);
                ll valid=1;
                if(now.bad){
                    valid=0;
                }
                ll need=n-now.freq;
                if(bruh.sum(use+1,i-1)!=need){
                    valid=0;
                }
                if(valid){
                    till=mid;
                    low=mid+1;
                }
                else{
                    high=mid-1;
                }
            }
            if(till==0){
                continue;
            }
            till=*(--consider.upper_bound(till));
            ll freq=track.query(1,till).freq;
            score=max(score,freq);
        }
        else{
            break;
        }
    }
    cout<<n+score<<nline;
    return;
}
int main(){  
    ios_base::sync_with_stdio(false);                         
    cin.tie(NULL);                                    
    ll test_cases=1;
    cin>>test_cases;
    while(test_cases--){
      solve();
    }
    cerr<<"Time:"<<1000*((double)clock())/(double)CLOCKS_PER_SEC<<"ms\n"; 
}

ashwanth2003 · November 26, 2025, 4:51pm

This can also be solved in O(NlogN) with 2 pointers
submission: CodeChef: Practical coding for everyone

Almost the exact same idea as yours, but since only consecutive good indices matter, we shall do 2 pointers, just like a trivial problem of finding maximal good subsegment.

Say for a given good index L, we have [L....R] as maximal good subsegment, for this L. (i am iterating backwards in my 2 pointer approach)

To extend this for finding maximal good subsegment starting at L-1, we shall do this:

When observed, some new elements in the original array would have moved outside the first interval, I mean to say,the elements from (endpoint [goodidx[L-1]] to endpoint[goodidx[L]])
These newly moved out elements only would affect the good subsegment. (reduce the good subsegment size)
These newly moved out elements (not part of good indexes) corrsesponding mapped starting positions can be found out, and those good indexes < starting positions only satisfy.

ashwanth2003 · November 26, 2025, 6:31pm

This can be solved in O(N), Submission