GSEQ - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Authors: still_me, iceknight1093
Testers: the_hyp0cr1t3, rivalq
Editorialist: iceknight1093

DIFFICULTY:

TBD

PREREQUISITES:

Prefix sums, dynamic programming

PROBLEM:

Given a binary array A, find the longest good sequence x_1, x_2, \ldots, x_k such that it satisfies:

  • 1 \leq x_1 \lt x_2 \lt \ldots \lt x_k \leq N+1
  • A[x_i:x_j-1] contains an equal number of zeros and ones for each i \lt j.

EXPLANATION:

First, note that it’s enough to ensure that A[x_i:x_{i+1}-1] (i.e, the subarray between consecutive elements) satisfies the condition, i.e, contains an extra one.

Proof

This is not hard to see.
The subarray A[x_i:x_j-1] can be thought of as the concatenation of A[x_i:x_{i+1}-1], A[x_{i+1}:x_{i+2}-1], \ldots, A[x_{j-1}, x_j-1]. Note that there are j-i such subarrays.

If each of these contains an extra one, of course all together the number of extra ones will be exactly j-i.

Now, let’s use a small trick: replace every 0 in the array with -1. Let’s call the array obtained this way as B.
For example, if A = [0, 1, 1, 0, 1] then B = [-1, 1, 1, -1, 1].
Notice that "A[L:R] contains an extra one" translates to "B[L:R] has a sum of 1" in the new array, which is much easier to deal with!

Since we’re dealing with subarray sums, it’s natural to think of prefix sums.
So, let P_i = B_1 + B_2 + \ldots + B_i (with P_0 = 0) denote the prefix sum array of B.

Now, for B[L:R] to have a sum of 1, we must have P_R - P_{L-1} = 1, or P_R = P_{L-1} + 1.

Applying this to the definition of a good sequence, we see that we must have P_{x_{i+1}-1} = P_{x_i-1} + 1 for every 1 \leq i \lt k.
In other words, we need to choose a sequence of indices whose prefix sums are k, k+1, k+2, \ldots for some value of k.

The longest good sequence thus corresponds to the longest sequence of prefix sums of this form.

This can be computed using dynamic programming.

Let dp_i denote the length of the longest sequence of prefix sums ending at position i that satisfies the above condition.
Transitions for a given index are quite easy to compute in \mathcal{O}(N). We simply have dp_i = 1 + \max (dp_j) across all j \lt i such that P_j+1 = P_i.

However, doing this \mathcal{O}(N) computation for each index is too slow, we need to speed it up a bit.
This can be done by noting that we only care about those indices whose P_j values equal P_i - 1; in fact, we only care about the maximum dp_j value among these indices.

So, let’s keep another array \text{mx}, where \text{mx}[x] is the largest value of dp_i across all indices i such that P_i = x. Initially, initialize it to all zeros.
Then, for each i from 0 to N,

  • dp_i = 1 + \text{mx}[P_i]
  • \text{mx}[P_i] = \max(dp_i, \text{mx}[P_i])

The maximum element of dp is the final answer.
Reconstructing a valid sequence is not hard, when performing the dp keep a link to the previous element in the sequence, then follow these in reverse at the end.

TIME COMPLEXITY:

\mathcal{O}(N) per testcase.

CODE:

Editorialist's code (C++)
#include "bits/stdc++.h"
// #pragma GCC optimize("O3,unroll-loops")
// #pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
using namespace std;
using ll = long long int;
mt19937_64 rng(chrono::high_resolution_clock::now().time_since_epoch().count());

int main()
{
	ios::sync_with_stdio(false); cin.tie(0);

	int t; cin >> t;
	while (t--) {
		int n; cin >> n;
		vector<int> a(n);
		for (int i = 0; i < n; ++i) cin >> a[i];
		vector<int> difs = {0};
		for (int i = 0; i < n; ++i) {
			if (a[i] == 1) difs.push_back(difs[i] + 1);
			else difs.push_back(difs[i] - 1);
		}
		
		map<int, int> len, last;
		vector<int> link(n+1, -1), ending_at(n+1);
		for (int i = 0; i <= n; ++i) {
			int cur = len[difs[i]-1] + 1;
			ending_at[i] = cur;
			if (cur > 1) link[i] = last[difs[i] - 1];
			
			len[difs[i]] = cur;
			last[difs[i]] = i;
		}
		int ans = *max_element(begin(ending_at), end(ending_at));
		cout << ans << '\n';
		for (int i = n; i >= 0; --i) {
			if (ending_at[i] != ans) continue;
			vector<int> pos;
			int cur = i;
			while (1) {
				pos.push_back(cur);
				if (link[cur] == -1) break;
				cur = link[cur];
			}
			reverse(begin(pos), end(pos));
			for (int x : pos) cout << x+1 << ' ';
			cout << '\n';
			break;
		}
	}
}

Wow. A very nice problem. At first glance it seemed so hard and complicated, but after some observation, it was solvable.

I’ll be honest I think that the problem statement in this question was extremely unclear. It should’ve said “output an array of indexes” but it just said output an array “A”. Like what does that mean???

After reading the editorial I managed to understand how to solve it. I couldn’t solve it though :’(

can you tell me how to develop intuition for these type of problems

By practicing to solve on your own.

1 Like