COLARR7 - Editorial

iceknight1093 · June 18, 2025, 4:30pm

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: raysh07
Tester: sushil2006
Editorialist: iceknight1093

DIFFICULTY:

Easy

PREREQUISITES:

DSU/DFS/BFS

PROBLEM:

You’re given an array A of length 2N, with elements between 1 and N.
Find the minimum number of changes needed to make it satisfy the following:

1 \leq A_i \leq N
A_{2i-1} = A_{2i}
A_{2i} \neq A_{2j} for i \neq j

EXPLANATION:

The given conditions essentially say that the final array must consist of N pairs of adjacent equal elements; with each integer from 1 to N being part of a pair.

Let’s separate the initial array into its N index pairs (with indices 2i-1 and 2i forming a pair), and see what needs to change.
For each pair, we have a few possibilities.

A_{2i-1} = A_{2i}.
Here, we can either leave this pair untouched (which costs 0 operations), or we must change both elements to something else (which costs 2 operations).
Obviously, the first is preferred as much as possible.
A_{2i-1} \neq A_{2i}.
In this case, there are a couple more options: we can turn A_{2i-1} to A_{2i} (which costs one operation), or turn A_{2i} into A_{2i-1} (also costs one operation), or turn both A_{2i-1} and A_{2i} into different elements (which costs 2 operations).
So we always need at least one operation, and sometimes might need 2.

Consider some element x; suppose there are p_x pairs of indices (2i-1, 2i) such that both elements are x.
When p_x \gt 0, we can only have one of these pairs remain with both elements being x: all the other pairs must be changed to something other than x.
It doesn’t really matter what the other pairs are changed to; so we can just keep these extra pairs aside and use them up in the end however we need to - the cost will remain the same.

That leaves us with only needing to solve for pairs with unequal elements.
Recall that each of these has a cost of at least 1, but sometimes might need 2.
We need to figure out how many of them can be given a cost of 1.

Note that we’re now dealing with several pairs of elements: and in many such cases, it’s useful to think of these pairs as being edges of a graph.
So, we do exactly that: consider a graph on the vertex set \{1, 2, \ldots, N\} such that for each pair (x, y) with us, we add the undirected edge (x, y) to the graph.

We now need to bring in the costs to this model.
That can be done by assigning directions to the edges.
Specifically, for each pair (and hence edge) (x, y), we have three options:

Convert x to y.
This can be thought of as directing the edge as x \to y.
Convert y to x.
This can be thought of as directing the edge as y \to x.
Convert both x and y to a different value.
This can be thought of as deleting the edge entirely.

Our goal is hence to direct as many of the edges as possible (while deleting the rest).
We have a couple of constraints however:

No vertex can have two (or more) edges directed into it; because each element from 1 to N will appear in exactly one pair in the end.
If p_y \gt 0 (so that at least one (y, y) pair exists), there’s no use in directing an edge as x \to y, because doing so would create two separate (y, y) pairs instead which is not what we want.

From the above graph construction, it’s obvious that connected components of the graph are entirely independent of each other; so we can solve for them separately.

Consider any one connected component.
Suppose it has V vertices and E edges.
Our goal is to direct as many of the edges as possible, while also ensuring that every vertex has indegree \leq 1.

Clearly, this means that no more than V edges can be directed (at best, one per vertex).
The next step should of course be to figure out when V edges can be directed.

For this to happen, we must have \geq V edges in this component in the first place: meaning the component contains a cycle.
Using this cycle, it turns out we can always direct V edges (if there are multiple cycles, choose any of them).
A simple construction is as follows: direct the edges of the cycle along a single direction of the cycle itself so all of them are taken care of; then start a DFS/BFS from the cycle to visit all the other vertices. Whenever an edge gives you a new vertex, direct that edge to the new vertex.
This ensures that you end up with all vertices having indegree 1, as required.

Note that the true number of edges directed this way should be V - K instead; where K is the number of vertices u in the component with p_u \gt 0.
This is because, as noted earlier, these vertices shouldn’t have any incoming edges; and our construction gives one incoming edge to every vertex so we can just delete the edges corresponding to these ones.

Next, what happens if we don’t have at least V edges?
Well, in that case we must have V-1 edges exactly, i.e. the connected component is a tree.

It’s now pretty easy to see that all V-1 edges can be directed: just choose any vertex as the root and direct every edge away from it.
Of course, once again here we need to consider what happens to “forbidden” vertices, i.e. those u with p_u \gt 0.
It’s not hard to see that, if there are K forbidden vertices,

If K = 0, all V-1 edges will be directed.
If K \gt 0, V-K edges can be directed; this is doable by rooting the tree at a forbidden vertex.

In summary, for a connected component of the graph that has V vertices, E edges, and K forbidden vertices,

If K \gt 0, then V-K of the edges can be directed.
If K = 0, then V edges can be directed if E \geq V, and V-1 edges can be directed otherwise.

Summing this up across all components gives the maximum number of pairs that require cost 1.
We’ve also found the number of pairs that require cost 0, namely the number of positive elements of the p array.
That means every remaining pair needs cost 2; and computing their count is trivial since we know the number of cost 0 and 1 pairs.
This gives us the final answer.

We only needed information about each connected component: its number of vertices and forbidden vertices, as well as whether it’s a tree or not.
This information can be found by performing a DFS/BFS, or using a DSU, all of which are introductory graph algorithms.

TIME COMPLEXITY:

\mathcal{O}(N) or \mathcal{O}(N\alpha(N)) per testcase.

CODE:

Editorialist's code (PyPy3)

# https://github.com/cheran-senthil/PyRival/blob/master/pyrival/data_structures/DisjointSetUnion.py
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, a):
        acopy = a
        while a != self.parent[a]:
            a = self.parent[a]
        while acopy != a:
            self.parent[acopy], acopy = a, self.parent[acopy]
        return a

    def union(self, a, b):
        self.parent[self.find(b)] = self.find(a)
        
for _ in range(int(input())):
    n = int(input())
    a = list(map(int, input().split()))
    pairs = [0]*(n+1)
    for i in range(0, 2*n, 2):
        if a[i] == a[i+1]: pairs[a[i]] += 1
    
    dsu = UnionFind(n+1)
    for i in range(0, 2*n, 2):
        if a[i] != a[i+1]:
            dsu.union(a[i], a[i+1])
    
    vertices, edges, have = [0]*(n+1), [0]*(n+1), [0]*(n+1)
    for i in range(1, n+1):
        vertices[dsu.find(i)] += 1
        if pairs[i] > 0: have[dsu.find(i)] += 1
    for i in range(0, 2*n, 2):
        if a[i] != a[i+1]:
            edges[dsu.find(a[i])] += 1
    
    ans = 0
    for i in range(1, n+1):
        if vertices[i] == 0: continue

        if have[i] > 0: ans += vertices[i] - have[i]
        else:
            if edges[i] >= vertices[i]: ans += vertices[i]
            else: ans += vertices[i] - 1
    rem = n - ans
    for i in range(1, n+1):
        if pairs[i] > 0: rem -= 1
    print(ans + 2*rem)