ONECHANGE - Editorial

yahor_adm · December 5, 2021, 12:09pm

Let A=26 be the size of the alphabet.

Given two characters x,y (not necessarily distinct), a string has the form [x,y] if it is good and the first character is x and the last character is y.

Let us collect some simple observations:

In any optimal partition into good subsequences there are not two subsequences of the for [x,x] for the same character x.
Consider an optimal partition which contains a subsequence of the form [x,y] and a subsequence of the form [x,z]. We can replace those two subsequences with two subsequences on the same characters but with forms [y,y], [x,z] or [z,z], [x,y] (depending on whether the first occurrence of y comes before or after the first occurrence of z).
Consider an optimal partition which contains q subsequences with forms [x_1, x_2], [x_2,x_3],\dots, [x_{q-1}, x_q], [x_q, x_1] (in a cyclic fashion). Then such q subsequences can be replaced by q subsequences with forms [x_1,x_1], [x_2,x_2],\dots, [x_q, x_q].

From these three observations we can deduce that there is an optimal partition into good subsequences which consists of chains of subsequences with forms [x_1, x_2], [x_2, x_3], \ldots [x_{q-1}, x_q] so that two distint chains do not share any character. Notice that any such chain must cover all the occurrences of the characters x_1,x_2,\dots, x_q in the string s. We say that an ordered sequence of characters x_1, x_2, \dots, x_q is good if there exists a chain of subsequences [x_1,x_2],[x_2,x_3],\dots, [x_{q-1}, x_q] which covers all occurrences of such characters in the string s.

Let us now determine whether an ordered sequence of distinct characters x_1, x_2, \dots, x_q is good. We use a greedy approach. The subsequences [x_1,x_2] must contain all characters x_1 and it is convenient for it to contain all characters x_2 which come after any character x_1, then the subsequence [x_2,x_3] must contain all the remaining x_2 characters and so on. If the last subsequence [x_{q-1}, x_q] covers all the characters x_q in s then the sequence is good, otherwise it is not. In particular, let \alpha=\alpha(x_1, x_2,\dots, x_q) be the smallest occurrence of x_q not contained in [x_{q-1}, x_q]. Then \alpha is n if and only if the sequence is good. Notice that (this will be important later on) that \alpha(x_1, x_2,\dots, x_q) depends only on x_q and \alpha(x_1, x_2,\dots, x_{q-1}).

These observations are sufficient to produce an O(3^A) solution to the problem. Indeed, with a DP on subsets with complexity O(A2^A), for each subset of characters we can compute the largest \alpha value obtained by any of its permutations and then decide whether it can be ordered to be a good sequence (iff \alpha is n). Then the problem is equivalent to computing the largest number of disjoint subsets of of the alphabet which can form a good subsequence and this can be done in O(3^A) (for each subset we have to iterate over its subsets).

Let us now see how to speed up the solution.
For each subset of the characters S, let f(S) be the maximum pair (h, \beta) such that the subset S contains h+1 disjoint ordered sequences of characters so that:

The first h sequences are good (i.e., their \alpha value is n).
The last sequence has \alpha value equal to \beta.

The answer to the problem is given by A-f(\{1,2,\dots, A\}.first.

With a DP on subsets, we can compute f(S) for each subset S. Indeed, given S, the value of S can be computed by iterating on the characters x\in S and trying to improve f(S\setminus{x}) by adding x at the end of the partial chain. The complexity becomes O(A2^A) which is sufficient to get accepted because the time limit is huge.