DESTBRIDGE2 - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: raysh07
Tester: tabr
Editorialist: iceknight1093

DIFFICULTY:

TBD

PREREQUISITES:

Sorting

PROBLEM:

You’re given an array A of length N.
Consider a compete graph on N vertices, with the weight of the edge (i, j) being A_i\cdot A_j.
You can delete edges with a total cost of at most C.
Find the minimum possible size of the final connected component containing 1.

EXPLANATION:

Rather than starting out trying to decide which edges to delete, let’s instead look at the final state.
That is, let’s fix the set of vertices reachable from 1.
Suppose these are v_1, v_2, \ldots, v_k. (Note that one of them must be 1).
Let’s denote the other vertices as w_1, w_2, \ldots, w_{N-k}

Then, to ensure that only the v_i are reachable, we definitely need to delete every edge of the form (v_i, w_j) from the graph.
We also don’t really need to delete any other edges, which keeps the cost down.
So, the minimum cost to make exactly this set of vertices is reachable is

\sum_{i=1}^k \sum_{j=1}^{N-k} A_{v_i}\cdot A_{w_j}

Note that this can be simplified to

\left(\sum_{i=1}^k A_{v_i} \right) \cdot\left( \sum_{j=1}^{N-k} A_{w_j} \right)

Of course, trying every possible set of final vertices isn’t feasible, so we need to reduce the number of them that we check.

Let’s fix k, the size of the connected components.
If we are to choose k vertices, it can be seen that it’s optimal to choose the k of them with largest A_i values (more precisely, 1 and then k-1 of the remaining vertices with largest A_i values).

Why?

Let S_1 = A_{v_1} + A_{v_2} + \ldots + A_{v_k} and S_2 = A_{w_1} + A_{w_2} + \ldots + A_{w_{N-k}}.
The cost of choosing these vertices to be the final component is then S_1 \cdot S_2.
Note that S_1 + S_2 = sum(A) is a constant.

Now,

  • If S_1 \leq S_2, it’s better to discard every vertex other than 1i (and move them to the unchosen side).
    This will both reduce the cost and reduce the size of the connected component.
  • If S_1 \gt S_2, it’s optimal for S_1 to be as large as possible, since that’s what minimizes the product S_1\cdot S_2.
Proof

The proof of both facts above relies on simple algebra.
Consider two integers A and B, and an integer x\gt 0.
Then,

  • If A \leq B, we have (A-x)\cdot (B+x) = A\cdot B + Ax - Bx - x^2.
    This is \lt A\cdot B, since the difference is x\cdot (A-B) - x^2 and we have x\gt 0 and A\leq B.
    This proves our first case: when S_1 \lt S_2, it’s better to further lower S_1 as much as possible, and the lowest we can get is keeping vertex 1 alone.
  • If A \gt B, then similarly we have (A+x)\cdot (B-x) = A\cdot B + Bx - Ax - x^2.
    Again, B-A \lt 0 and x\gt 0 tell us that this is a lower cost; meaning it’s optimal for A to be as large as possible.
    This proves the second case: if we’re choosing k elements, we might as well pick the largest among them.

We’re consider the quantity (A+x)\cdot (B-x) because the sum of the numbers must remain a constant.


So, we either choose the singular vertex 1, or we choose 1 and k-1 of the largest remaining values.

This observation now gives us only N candidates to check.
For each candidate k, find the appropriate sums S_1 and S_2 (which is easy in linear time, assuming the array A has been sorted).
Then, if S_1 \cdot S_2 \leq C, it’s possible to end up with a component of size k, otherwise it isn’t.
Print the smallest k that satisfies this criterion, or -1 if none of them do.

Checking each candidate in linear time gives a solution in \mathcal{O}(N^2).
It’s also possible to solve the problem in linear time (after sorting) by utilizing the fact that moving from k to k+1 changes the sums S_1 and S_2 by only a single element each, so they don’t need to be recomputed in full.

TIME COMPLEXITY:

\mathcal{O}(N\log N) per testcase.

CODE:

Author's code (C++)
#include <bits/stdc++.h>
using namespace std;
#define int long long
#define INF (int)2e18

mt19937_64 RNG(chrono::steady_clock::now().time_since_epoch().count());

void Solve() 
{
    int n, c; cin >> n >> c;

    vector <int> a(n);
    for (auto &x : a) cin >> x;

    sort(a.begin() + 1, a.end(), greater<int>());
    int tot = accumulate(a.begin(), a.end(), 0LL);

    int sum = 0;
    for (int i = 0; i < n; i++){
        sum += a[i];
        if (sum != tot && sum > INF / (tot - sum)){
            continue;
        }
        if (sum * (tot - sum) <= c){
            cout << (i + 1) << "\n";
            break;
        }
    }
}

int32_t main() 
{
    auto begin = std::chrono::high_resolution_clock::now();
    ios_base::sync_with_stdio(0);
    cin.tie(0);
    int t = 1;
    // freopen("in",  "r", stdin);
    // freopen("out", "w", stdout);
    
    cin >> t;
    for(int i = 1; i <= t; i++) 
    {
        //cout << "Case #" << i << ": ";
        Solve();
    }
    auto end = std::chrono::high_resolution_clock::now();
    auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
    cerr << "Time measured: " << elapsed.count() * 1e-9 << " seconds.\n"; 
    return 0;
}
Editorialist's code (Python)
for _ in range(int(input())):
    n, c = map(int, input().split())
    a = list(map(int, input().split()))
    a[1:] = sorted(a[1:])[::-1]
    ans = n
    for i in range(n-1):
        c1 = sum(a[:i+1])
        c2 = sum(a[i+1:])
        if c1*c2 <= c: ans = min(ans, i+1)
    print(ans)
3 Likes
 N = 6  , C = 1275 
35 15 10 25 10 5
``` , when above is the test case , how the answer is 4 . anyone please explain !!!
3 Likes

@iceknight1093 Well N was kept 100 It really would have confused many people to think of something in N^2.

2 Likes

yeah, i also think that the answer shld be “5” . can anyone please explain whether am i wrong or not and if i am how so?

1 Like

Bro you can take a 10 and a 5 to get a total of exact 1275

Edit: You must be removing the edge from 10 to 5 as well which is not needed so that total cost will be a this term less. Let them be connected b/w themselves you just got to disconnect it from 1.

1 Like

The input gives the following graph:

1 2 525
1 3 350
1 4 875
1 5 350
1 6 175
2 3 150
2 4 375
2 5 150
2 6 75
3 4 250
3 5 100
3 6 50
4 5 250
4 6 75
5 6 50

You can get rid of the nodes 5 and 6 by removing the following edges:

1 5 350
1 6 175

2 5 150
2 6 75

3 5 100
3 6 50

4 5 250
4 6 75

The total cost doesn’t exceed 1275. Hence, the answer is 4.

2 Likes

Maybe this should be a lesson not to guess the complexity from the constraints then :slight_smile:

Constraints were kept low because the hard part of the problem was to notice that there are only \mathcal{O}(N) candidates for which set to check, so slower implementations that compute the cost for each of them in linear (or even quadratic) time were allowed to pass.

2 Likes

Haa , got it . I wasn’t checked that . Thanks !