LMP8H - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: kingmessi
Tester: pols_agyi_pols
Editorialist: iceknight1093

DIFFICULTY:

Medium

PREREQUISITES:

Prefix sums, sorting

PROBLEM:

For binary strings P, Q, R, let f(P, Q, R) denote the minimum cost of making Q = R using the following operations:

  • First, flip any single character in either P or Q, with a cost of 1.
  • After all flips, you may swap Q_i and Q_j, but only if P_i = P_j.

Given binary strings A, B, C, compute

\sum_{L=1}^N \sum_{R=L}^N f(A[L, R], B[L, R], C[L, R])

EXPLANATION:

From the easy version, we have the following solution to a single triple (A, B, C):

  • Compute x_1, x_2, x_3, x_4 to be the number of indices with A_iB_iC_i pattern 001, 010, 101, 110 respectively.
  • Then, repeatedly reduce values by 1 till x_1+x_3 = x_2+x_4 is reached.
    These reductions must be done in such a way that the value |x_1 - x_2| is made as small as possible.
  • In the end, add |x_1 - x_2| to the answer.

First, let’s reduce the solution to the easy version a bit.
After knowing the values x_1, x_2, x_3, x_4 it’s actually possible to solve the problem in constant time.

To see how, let’s look at what exactly we’re doing.
First, we perform |(x_1 + x_3) - (x_2 + x_4)| subtractions, and then we use some transfers if needed.
Suppose x_1+x_3 \ge x_2 + x_4. Then, it can be seen that:

  • If x_1-x_2 \ge 0 and x_3-x_4 \ge 0, the total number of operations we perform is exactly (x_1-x_2) + (x_3-x_4).
    This is because we need this many subtraction operations to make the odd-even difference 0, and no more transfer operations are needed.
  • If x_1 - x_2 \ge 0 and x_3 - x_4 \le 0, the total number of operations we’ll perform is exactly
    (x_1 - x_2).
    This is because we’ll perform (x_1-x_2) + (x_3-x_4) subtractions at which point the differences become equal (and equal to x_4 - x_3), so the transfer operations will cancel the second term.
  • If x_1 - x_2 \le 0 and x_3 - x_4 \ge 0 instead, similar reasoning shows that we perform (x_3 - x_4) operations instead.

Putting the cases together, we obtain the single expression \max(0, x_1-x_2) + \max(0, x_3-x_4).
Dealing with \max(\ldots) is a bit awkward, so we instead convert it to pure arithmetic by noting that \max(a, b) = \frac{a+b}{2} + \frac{|a-b|}{2}.

This gives us the expression

\frac{x_1-x_2}{2} + \frac{x_3-x_4}{2} + \frac{|x_1-x_2|}{2} + \frac{|x_3-x_4|}{2}

for this case.

For the opposite case of x_1+x_3 \le x_2 + x_4, we instead obtain the expression

\frac{x_2-x_1}{2} + \frac{x_4-x_3}{2} + \frac{|x_1-x_2|}{2} + \frac{|x_3-x_4|}{2}

Both cases’ formulas can be combined to obtain

\frac{|x_1-x_2+x_3-x_4|}{2} + \frac{|x_1-x_2|}{2} + \frac{|x_3-x_4|}{2}

which needs no casework at all (other than the absolute value, technically).


With this expression in hand, we can now move to computing the sum of answers across all ranges.

The expression has three separate parts, so we’ll work on each of them separately.
Let’s just compute the sum of |x_1 - x_2| across all ranges (we can divide by 2 at the end).

Define P_i to be the sum of (x_1 - x_2) (importantly, we aren’t taking absolute value yet) for the prefix of length i.
This is easy to compute: initialize P_i to P_{i-1}, and then add 1 if this index is 001 or subtract 1 if it’s 010 (and do nothing if it’s neither).

Once the P array is known, the value for the range [l, r] equals |P_{r} - P_{l-1}|.
So, we want to compute

\sum_{l=1}^N \sum_{r=l}^N |P_r - P_{l-1}|

To do this easily, observe that rearranging the array P doesn’t change the answer (since we’re effectively just adding up all pairwise differences of P anyway).
So, we can just sort the array P, and the expression becomes

\sum_{l=1}^N \sum_{r=l}^N (P_r - P_{l-1})

(i.e. the absolute value of the difference is removed).

This is easy to compute in \mathcal{O}(N) time by looking at contributions: the element P_i is on the addition side for i indices, and on the subtraction side for N-i.
So, its overall contribution is P_i \cdot (2i - N).
Adding this up for all i gets us the value we need.

The exact same thing can be done to compute the sum of |x_3 - x_4| and the sum of |x_1+x_3-x_2-x_4| across all subarrays, completing the problem.

TIME COMPLEXITY:

\mathcal{O}(N\log N) per testcase.

CODE:

Editorialist's code (PyPy3)
for _ in range(int(input())):
    n = int(input())
    a = input()
    b = input()
    c = input()
    
    pref = [ [0 for _ in range(n+1)] for _ in range(3) ]
    for i in range(n):
        for j in range(3):
            pref[j][i+1] = pref[j][i]
        
        if b[i] == c[i]: continue
        if a[i] == '0':
            if b[i] == '0':
                pref[0][i+1] += 1
                pref[2][i+1] += 1
            else:
                pref[0][i+1] -= 1
                pref[2][i+1] -= 1
        else:
            if b[i] == '0':
                pref[1][i+1] += 1
                pref[2][i+1] += 1
            else:
                pref[1][i+1] -= 1
                pref[2][i+1] -= 1
    
    ans = 0
    for j in range(3):
        pref[j].sort()
        for i in range(n+1):
            ans += pref[j][i] * (2*i - n)
    print(ans//2)
1 Like