MXMNSSUM - Editorial

infinitepro · September 30, 2021, 10:59pm

PROBLEM LINK:

Contest - Division 3
Contest - Division 2
Contest - Division 1

DIFFICULTY:

MEDIUM

PREREQUISITES:

Dynamic Programming, Binary Search, Range Minimum Queries, Coordinate Compression

PROBLEM:

Given array A and integer K. Determine the largest value s such that, it is possible to partition A into K contiguous blocks, where the sum of elements of each block is \ge s.

EXPLANATION:

Let s be the largest integer such that it is possible to partition A into K contiguous blocks, where the sum of elements of each block is \ge s. This value is the answer we are looking for.

Step 1: Use binary searching to find the value of s.

Why? How?

Let solve(d) represent whether it is possible to partition A into K blocks, where the sum of elements of each block is \ge d. Let solve(d) equal 1 if it’s possible to partition, and 0 otherwise.

Observation 1: If solve(d)=1, then solve(d')=1 for all d\ge d'.
(The proof is trivial and left to the reader as an exercise)

Observation 2: If solve(d)=0, then solve(d')=0 for all d'\ge d.

Proof

We prove by contradiction.
Say there exists d'\ge d such that solve(d)=0 and solve(d')=1.

By the previous observation, solve(d')=1 implies solve(x)=1 for all d'\ge x. But solve(d)=0 - a contradiction and we are done.

The two observations imply the monotonic structure of the function solve, sufficiently allowing us to binary search to find the answer.

Let d be the currently considered value in the binary search. Now, we want to check if it’s possible to partition A into K blocks, each whose sum \ge d.

Definition: Call an array of elements good, if the sum of its elements \ge d.

Let dp_i be the set of all positive integers k, such that it is possible to partition the first i elements of A into k good subarrays. I claim that dp_i is a (possibly empty) set of consecutive integers, for all i.

Proof

Since the proof has a lot in common with the logic of the solution, reading the rest of the editorial first will make the proof much more clearer.

Let pref(x)=A_1+A_2+\dots+A_x.

Computing dp_i is straightforward, and can be described mathematically as:

dp_i = \{x+1\mid x\in dp_j \space\forall\space j<i\text{ and } pref(j)\le pref(i)-d\}

Let’s define G_i[v] as the set of integers x, such that it is possible to partition the first j elements of A into x good subarrays, where j<i and pref(j)=v.

G_{i+1}[v] = \begin{cases} G_i[v]\cup dp_i, &\text{if } pref(i)=v\\ G_i[v], &\text{otherwise} \end{cases}

This recurrence is derived from a simple (yet important!) observation, the proof of which is left to the reader as an exercise.

Finally (the last definition, bear with me), let’s define H_i[v] = G_i[v]\cup H_i[v-1]. That is, H_i[v] is the union of all sets G_i[v'] with v'\le v. It is easy to deduce that H_i[v'] is always subset of H_i[v].

Informally, H_i[v] can be described as the set of integers x, such that it is possible to partition the first j elements of A into x good subarrays, where j<i and pref(j)\le v.

Then, we can rewrite the set relation of dp_i (mentioned at the start) as:

dp_i=\{x+1\mid x\in H_i[pref(i)-d]\}

Thus, if we prove that H_i[v] is a (possibly empty) set of consecutive integers for all v, then dp_i is also a (possibly empty) set of consecutive integers, completing the proof.

We prove by induction.
Assume for a particular n, H_n[v] is a set of consecutive integers, for all v. Let us prove by inducting on n.

\begin{aligned} H_{n+1}[v]&=G_{n+1}[v]\cup G_{n+1}[v-1]\cup\dots\\ &=\begin{cases} (G_n[v]\cup G_n[v-1]\cup\dots)\cup dp_n, &\text{if } pref(n)\le v\\ (G_n[v]\cup G_n[v-1]\cup\dots), &\text{otherwise} \end{cases} \\\\ &=\begin{cases} H_n[v]\cup dp_n, &\text{if } pref(n)\le v\\ H_n[v], &\text{otherwise} \end{cases} \end{aligned}

H_{n+1}[v] = H_n[v] implies our proposition holds for H_{n+1}[v] (since by our assumption, it holds for H_n[v]). So, lets only focus on the former case, which can be expanded as:

H_{n+1}[v] = H_n[v]\cup \{x+1\mid x\in H_n[pref(n)-d]\}

One of H_{n+1}[v], H_n[pref(n)-d] is a subset of the other. A bit of case work shows that, the union of two overlapping sets of consecutive integers, after shifting all values of one set by 1, is still a set of consecutive integers. Therefore, the proposition holds true for H_{n+1}[v], establishing the inductive step.

The trivial case of n=0 has H_0[v]=\varnothing for all v, which matches our proposition.

Thus proved.

Why is this claim important? The claim implies that the set of values of dp_i is made up of all integers in some range [L_i,R_i]. Then, A can partitioned into K good subarrays only if L_N\le K\le R_N.

All that remains now is to calculate L_i and R_i efficiently.

Suboptimal solution

A straightforward DP recurrence relation gives:

L_i = 1+\min L_j\\ R_i = 1+\max R_j

where j<i and \sum_{x=j+1}^i A_x \ge d.

This can be calculated in O(N^2), enough for subtask 2, but not viable for the original constraints.

Optimised solution

Let pref(x)=A_1+A_2+\dots+A_x.

The suboptimal solution can be rewritten as:

L_i = 1+\min L_j\\ R_i = 1+\max R_j

where j<i and pref(i)-pref(j) \ge d\implies pref(j)\le pref(i)-d.

Let mini_x represent the set of all L_j such that, j<i and pref(j)=x. Similarly, let maxi_x represent the set of all R_j such that, j<i and pref(j)=x.
Then, from the above stated recurrence constraint, it is easy to see

\begin{aligned} L_i &= 1+\min_{x\le pref(i)-d}(\min mini_x)\\ R_i &= 1+\max_{x\le pref(i)-d}(\max maxi_x) \end{aligned}

This can be done easily using range minimum queries on mini and maxi, typically using a Segment Tree/Fenwick Tree with coordinate compression (since the range of x in mini_x/maxi_x is [-1e14, +1e14]).

The time complexity of this approach is O(N\log N), which is efficient enough to AC for the given constraints!

To summarise, binary search for the answer - let the midpoint value we are currently trying be d. For this particular d, calculate [L_i,R_i] for all i, in subquadratic time. Then, A can be partitioned into K subarrays, each with sum of elements \ge d, only if L_N\le K\le R_N. Return the result and continue the binary search until it converges.

TIME COMPLEXITY:

Binary searching for the answer takes O(\log S), where S is the size of the answer space (S\le 2\times10^{14}). Calculating [L_i, R_i] for each d (using the memory efficient method) takes O(N\log N).
The total time complexity is therefore

O(N\log N\log S)

per test case.

SOLUTIONS:

Editorialist’s solution can be found here.
Author’s solution can be found here.
Tester’s solution can be found here.

Experimental: For evaluation purposes, please rate the editorial (1 being poor and 5 excellent)

1
2
3
4
5

0 voters

granadierfc · July 15, 2023, 5:43am

well you don’t need to maintain L values. it suffices to know R values , cuz if R >= 1 , then L = 1 always.