COOLCHECK - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: proelectro
Editorialist: iceknight1093

DIFFICULTY:

Medium

PREREQUISITES:

None

PROBLEM:

You’re given two arrays A and B.
A subsequence of A is called legal if its average exists in B.

Find any subsequence of A that’s not legal.

EXPLANATION:

To get an idea of what’s going on, let’s look at subsequences of A in increasing order of their size.

First, we have subsequences of length 1 - namely the elements of A themselves. If any of these are not in B, we already have an illegal subsequence; otherwise we must look further.

Next, let’s look at subsequences of length 2 - for each (i, j), we want \frac{A_i + A_j}{2} to be present in B.
In particular, if A_i + A_j is odd, then \frac{A_i + A_j}{2} is not even an integer, and so certainly won’t be in B.
The sum of two numbers is odd if and only if one of them is odd and the other is even - so if A contains both even and odd values, we have an answer.

Otherwise, all elements of A have the same parity - without loss of generality, say they’re all even.
All pairwise averages are now integers, so we need to check if each of them belongs to B or not. This is a bit hard to do immediately, so let’s instead look at other subsequence sizes.

Looking at subsequences of size 3, let’s go back to what we first noticed about length 2: if we’re able to find a triple (A_i, A_j, A_k) such that A_i + A_j + A_k is not a multiple of 3, then their average won’t even be an integer and so certainly won’t be in B.
This leads to the question: when exactly can all sums of three elements be a multiple of 3?

Answer

All triplet sums of an array (of length \gt 3) can be divisible by 3 if and only if all the elements of the array are equal modulo 3, i.e. A_i \equiv A_j \pmod 3 for all i, j.

The sufficiency of this is obvious, and necessity can be proved as follows:
Suppose there are two elements A_i and A_j such that A_i \not\equiv A_j \pmod 3.
Take any two elements of the array other than these two. Then, keeping either A_i or A_j as the third element will lead to a sum that’s not a multiple of 3.

In fact, it’s not hard to see that the above argument works for not just 3, but any integer k (1 \lt k \lt N).
That is, all k-subsets can have integer averages if and only if all the elements are equal modulo k.
Note that k = N is an exception here, because we’re forced to take all N elements. But there’s only one such subset so it’s easy to check for.


Let’s go back to the original condition: all elements must be equal modulo k, otherwise we can find an illegal subset of size k.
In particular, note that if this condition holds for k = 2, 3, 4, \ldots, N-1, then all the array elements must be equal modulo 2, 3, 4, \ldots, N-1.
This is equivalent to all the elements being equal modulo \text{lcm}(2, 3, \ldots, N-1) (this follows from the Chinese Remainder Theorem).

However, \text{lcm}(2, 3, \ldots, N-1) grows extremely quickly as N gets larger.
In particular, for N \geq 18, this value exceeds 10^7 which is the largest value with us.
So, for N \geq 18, the only way elements can be equal modulo the LCM, is if they’re equal in the first place!

This observation, along with 2^{18} being small, gives us a pretty simple solution:

  • If N \geq 18, check if all the elements are equal. There are now two possibilities:
    1. All the elements are indeed equal, say to x. Then, the average of any subsequence is just going to be x, so if x \in B every subsequence will be legal, otherwise no subsequence is legal.
    2. Not all elements are equal. Then, take any 18 elements such that not all of them are equal, and simply bruteforce over all subsets of these 18 elements to find an illegal subset (which we know exists, from the earlier discussion).
  • If N \lt 17, the situation is even simpler: you can directly bruteforce across all subsets of A and check if the average is an integer that belongs to B.

Checking whether a given average lies in B can be done quickly by sorting B and binary searching on it, for instance.

The time complexity of this approach is \mathcal{O}(M\log M + 2^{\min(N, 18)}(\min(N, 18) + \log M)) per test which is fast enough for the given constraints (especially since there are no more than 100 tests in each file).

There are ways of implementing this faster (in particular, the entire (\min(N, 18) + \log M) term can be made \mathcal{O}(1)) but they shouldn’t be necessary to get AC.

TIME COMPLEXITY:

\mathcal{O}(M\log M + 2^{\min(N, 18)}(\min(N, 18) + \log M)) per testcase.

CODE:

Author's code (PyPy3)
import sys
input = sys.stdin.readline
 
t = int(input())
 
def ctz(x):
    c = 0
    while x & 1 == 0:
        x >>= 1
        c += 1
    return c
 
def popcount(x):
    c = 0
    while x:
        c += 1
        x &= x - 1
    return c
 
for _ in range(t):
    n, m = map(int, input().split())
    a = list(map(int, input().split()))
    b = list(map(str, input().split()))
    b = set(b)
    while len(a) > 18 and a[-1] == a[0]:
        a.pop()
    if len(a) > 18:
        a[17] = a[-1]
    a = a[:18]
    n = len(a)
    
    ans = 0    
    valid = True
    dp = [0] * (1 << n)
    for j in range(1, 1 << n):
        s = 0
        i = ctz(j)
        c = popcount(j)
        s = dp[j] = dp[j ^ (1 << i)] + a[i]
        if s % c != 0:
            ans = j
            valid = False
            break
        avg = s // c
        if str(avg) not in b:
            ans = j
            valid = False
            break
    if valid:
        print(-1)
    else:
        print(popcount(ans))
        for i in range(n):
            if ans & (1 << i):
                print(a[i], end = ' ')

Editorialist's code (C++)
// #include <bits/allocator.h>
// #pragma GCC optimize("O3,unroll-loops")
// #pragma GCC target("avx2,bmi,bmi2,lzcnt,popcnt")
#include "bits/stdc++.h"
using namespace std;
using ll = long long int;
mt19937_64 RNG(chrono::high_resolution_clock::now().time_since_epoch().count());

int main()
{
    ios::sync_with_stdio(false); cin.tie(0);

    int t; cin >> t;
    while (t--) {
        int n, m; cin >> n >> m;
        vector<int> a(n), b(m);
        for (int &x : a) cin >> x;
        for (int &x : b) cin >> x;

        ranges::sort(b);
        vector<int> ind;
        for (int i = 0; i < min(n, 18); ++i) ind.push_back(i);
        for (int i = 18; i < n; ++i) {
            if (a[i] != a[0]) {
                ind.pop_back();
                ind.push_back(i);
                break;
            }
        }
        n = size(ind);
        
        int ans = -1;
        for (int mask = 1; mask < 1 << n; ++mask) {
            int s = 0, len = 0;
            for (int i = 0; i < n; ++i) {
                if (mask & (1 << i)) {
                    s += a[ind[i]];
                    ++len;
                }
            }

            if (s%len == 0) {
                int av = s / len;
                if (ranges::binary_search(b, av)) continue;
            }
            ans = mask;
            break;
        }

        if (ans == -1) cout << -1 << '\n';
        else {
            cout << __builtin_popcount(ans) << '\n';
            for (int i = 0; i < n; ++i) if (ans & (1 << i))
                cout << a[ind[i]] << ' ';
            cout << '\n';
        }
    }
}

What was test 30? For some reason, I kept failing on it.

I also kept failing on 2. It should have been explicitly mentioned that the order of the subsequence cannot be changed. I lost 30 minutes because of it.

when exactly can all sums of three elements be a multiple of 3?

  • The condition you gave for this is incorrect it seems to me.
  • Counter example: 3 4 5
  • mod values: 0 1 2
  • sum of these numbers is divisible by 3 but according to the logic you gave, it won’t be considered.
    @ iceknight1093
1 Like

Here’s my (somewhat unproven) solution that gets AC:

  1. Randomly shuffle A.
  2. For every prefix of A, check if it is a valid subsequence.
  3. Repeat steps 1 and 2 for a total of 500 times.

Submission Link

During the contest, I thought of the intended solution but it looked like too much work, especially since I was convinced that this approach was correct.

2 Likes

Same here, just I tried 5000 times with a random prefix. 500 got WA.

test 30
1 1 … 1 26

Definition subseq: A subsequence of a given sequence is a sequence that can be derived from the given sequence by deleting some or no elements without changing the order of the remaining elements.

length > 3

1 Like

If it’s possible, could you share the task 24, my solution gets WA. My solution is the same as yours, however, in the case where N >= 18, instead of searching over all subsets I search for all possible sizes of a subsequence <= 20 and try to find a sum which is not divisible by its size.

Thank you for sharing the test. It appears I had a weird bug.

I understand the defintion of a subsequence :wink:
Usually (on Codeforces), the statement is written in the following format: output indices i1<i2<…<ik of the subsequence you choose. The part where the indices must be increasing is in bold. I believe it makes writing all of the following easier:

  • Checker

  • Cpp solution

  • Statement (in terms of clarity)

1 Like

whats the logic behind this algo to work ?

If we modify the problem to this- the average needs to be an integer- and we remove the constraint that the average needs to belong to B:

Let a_1 \equiv a_2 \equiv a_3 = \dots \equiv a_{n-1} \not\equiv a_n \pmod k for some 1 \lt k \lt n
and a_1 \equiv a_2 \equiv a_3 = \dots \equiv a_{n-1} \equiv a_n \pmod l for each 1 \le l \lt n, l \neq k

Then any a k-length subsequence is not legal iff a_n is part of this subsequence. Thus the probability of a randomly selected k-length subsequence not being legal is \frac{k}{n}.

For 500 subsequences, you get 1 - \big (1 - \frac{k}{n} \big ) ^ {500} as the probability of success.

So to minimize this value, we get k=2 and n=18, I guess…
This gives a probability of 1 - 2.6 \times 10^{-26} on my calculator…

But this is for the modified version of the problem; I think funny things related to membership in B can still happen…

1 Like

what is test 30?