INTEREQ - Editorial

PROBLEM LINK:

Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3

Author: Anshu Garg
Tester: Shubham Anand Jain
Editorialist: Nishank Suresh

DIFFICULTY:

Easy-Medium

PREREQUISITES:

Binary search

PROBLEM:

There is a hidden array A of size N. You need to find an array B which is equivalent to A, where B is considered equivalent to A if A_i = A_j \iff B_i = B_j for every 1\leq i, j\leq N.
To do this, you can ask upto 6N queries, where each query is as follows:
Provide any subset \{i_1, i_2, \dots, i_k\} of \{1, 2, \dots, N\} to the judge, which then returns the frequency of the mode of the multiset \{A_{i_1}, A_{i_2}, \dots, A_{i_k}\}.

QUICK EXPLANATION:

Build the array one index at a time. If the first i elements are known, the i+1-th can be found by binary searching on the distinct elements known so far.

EXPLANATION:

Suppose the original array A has d distinct elements. Then, it is possible to create an equivalent array B which only contains 1, 2, \dots, d, which is what we will attempt to do.

Let us analyze the given query operation first.
Playing around with it a bit should lead you to the following observations:

  • If you provide a set of any size, and the judge returns 1, all elements of the set are distinct.
  • As a special case of the above observation, if you provide a set of size 2, the elements of the set are distinct if and only if the judge returns 1.

We will use this to build a solution.
First, without loss of generality, we can assume that A_1 = 1.

Now let us try to find A_2.
With the information we have, the only reasonable query to ask is \{1, 2\}. If the response to this query is 2, A_2 must be equal to A_1; else it must be different. In case it is different, we can set A_2 = 2 w.l.o.g.

What about A_3?
There are two possibilities here: either A_3 equals one of A_1 or A_2, or it is distinct from them.
One way to check this is to use a similar strategy as we used to find A_2 - querying the set \{1, 3\} tells us whether A_1 = A_3, and querying the set \{1, 2\} tells us whether A_1 = A_2. This is enough information to determine A_3.

The above algorithm generalizes to a full solution, albeit one which uses too many queries.
For each index i, query each of \{1, i\}, \{2, i\}, \dots, \{i-1, i\}. This determines A_i.
Of course, this uses \mathcal{O}(N^2) queries, which is too much.

One minor optimization here is to not repeat queries: if A_x = A_y, there is no need to query for both \{x, i\} and \{y, i\} since the result is going to be the same.
In other words, it is enough to query one index corresponding to each distinct element found so far.
This by itself is not really enough to bring the number of queries down significantly - consider an array where all elements are distinct, for example. Keep the idea in mind though, it will help us later.

Reducing the number of queries

Currently, our solution uses i-1 queries to find A_i. How to do better?
A hint on how to accomplish this comes from the limit on the number of queries (and maybe even the fact that the problem is interactive…). 6 is about \log_2 100, which brings to mind binary search!

What to binary search on?
Looking back at our initial solution, it can be broken into two parts: figuring out whether A_i is a new element, and if it isn’t, figuring out what it is.
Remember the earlier optimization of only asking about one index for each distinct element? That will come into play here to help us.

Suppose we want to find A_i, and there have been k distinct elements found so far. Let I = \{i_1, i_2, \dots, i_k\} be a set of indices corresponding to these k elements.

  • To check whether A_i is new, it suffices to ask the query \{i_1, i_2, \dots, i_k, i\}
Why?

Note that, since A_{i_1}, A_{i_2}, \dots, A_{i_k} are all distinct, the result of this query is either 1 or 2. If it is 1, A_i is distinct from all of them and is hence new.

  • If A_i does happen to be new, we assign it the value k+1 and insert i into the set of representatives.
  • Now consider the case when A_i isn’t new. There will be exactly one x \in \{i_1, i_2, \dots, i_k\} such that A_x = A_i because the indices represent distinct elements.
    Finding this x can be done with a simple binary search. Once we know x, we can assign A_i = A_x
Details on the binary search

Split I into two halves - say I_1 and I_2.
Note that x is in one of these halves, but not both.
To check whether x \in I_1, query for I_1 \cup \{i\}. If the result of this query is 1, x \notin I_1. If it is 2, x \in I_1.
This reduces the search space by half, continue applying this process till the search space becomes a single element - that is x.

To make implementation simple, notice that it doesn’t really matter what I_1 and I_2 are chosen to be, as long as they are disjoint halves of I. In particular, we can choose them to be intervals and maintain lo and hi pointers as in a normal binary search.

Analyzing the number of queries

We ask one query for each index 2\leq i\leq N to find out whether A_i is new or not. If it isn’t, we run a binary search to find A_i.
If the array has d distinct elements, the binary search step runs N-d times. In the worst case, each one is over a set of size d, and so will require \lceil \log_2 d\rceil queries.
Thus, the total number of queries is bounded above by N-1 + (N-d)\log_2 d.
It can be verified (for example, by writing a program to compute this over all values of N and d) that this value is much less than 6N no matter what d is. For N = 100, the maximum value across all d comes out to be 514, so the limit on the number of queries is not tight at all.

TIME COMPLEXITY:

\mathcal{O}(N\log N) per testcase

SOLUTIONS:

Tester's Solution
//By TheOneYouWant
#include <bits/stdc++.h>
using namespace std;
#define fastio ios_base::sync_with_stdio(0);cin.tie(0)

vector<int> curr;

void ask(int l, int r, int ext){

	// curr (l to r), and ext

	vector<int> to_ask;
	for(int i = l; i <= r; i++){
		to_ask.push_back(curr[i]);
	}
	to_ask.push_back(ext);
	cout << "? " << to_ask.size() << " ";
	for(int i = 0; i < to_ask.size(); i++){
		cout << to_ask[i] << " ";
	}
	cout << endl;
}

int main(){

	fastio;

	int tests;
	cin >> tests;

	while(tests--){

		int n, q;
		cin >> n >> q;

		curr.clear();
		curr.push_back(1);
		int type[n+1];
		for(int i = 0; i <= n; i++) type[i] = 0;
		type[1] = 1;
		int num = 2;

		for(int i = 2; i <= n; i++){
			int l = 0, r = (int)curr.size() - 1;
			ask(l, r, i);
			int ans; 
			cin >> ans;

			if(ans == 1){
				type[i] = num;
				curr.push_back(i);
				num++;
				continue;
			}
			while(l<r){
				int mid = (l+r)/2;
				ask(l, mid, i);
				int ans;
				cin >> ans;
				if(ans == 1){
					l = mid + 1;
				}
				else{
					r = mid;
				}
			}
			type[i] = type[curr[l]];
		}
		cout << "! ";
		for(int i = 1; i <= n; i++){
			cout << type[i] << " ";
		}
		cout << endl;
		int is_correct;
		cin >> is_correct;
		assert(is_correct == 1);
	}

	return 0;
}
Editorialist's Solution
#include "bits/stdc++.h"
// #pragma GCC optimize("Ofast,unroll-loops")
// #pragma GCC target("sse,sse2,sse3,ssse3,sse4,popcnt,mmx,avx,avx2")

using namespace std;
using ll = long long;

int main()
{
    ios::sync_with_stdio(0); cin.tie(0);

    int t; cin >> t;
    while (t--) {
        int n, q; cin >> n >> q;
        vector<int> ans(n); ans[0] = 1;
        vector<array<int, 2>> active = {{1, 1}};
        auto query = [&] (int L, int R, int x) {
            cout << "? " << R-L+2 << ' ';
            for (int i = L; i <= R; ++i)
                cout << active[i][1] << ' ';
            cout << x << endl;
            int mode; cin >> mode;
            return mode;
        };

        int id = 2;
        for (int i = 1; i < n; ++i) {
            if (query(0, size(active) - 1, i+1) == 1) { // First occurrence of a[i]
                ans[i] = id++;
                active.push_back({ans[i], i+1});
                continue;
            }
            
            int L = 0, R = size(active) - 1;
            while (L < R) {
                int mid = (L + R)/2;
                if (query(L, mid, i+1) == 1) { // a[i] is not equivalent to anything in [L, mid], so it must be equivalent to something in [mid+1, R]
                    L = mid+1;
                }
                else { // a[i] is equivalent to something in [L, mid]
                    R = mid;
                }
            }
            ans[i] = active[L][0];
        }
        cout << "! ";
        for (auto x : ans)
            cout << x << ' ';
        cout << endl;

        int result; cin >> result;
    }
}
7 Likes

Really beautiful problem, the elegance and simplicity of the solution is utterly amazing.

5 Likes

Can anyone point out why my code CodeChef: Practical coding for everyone getting WA, though ik that it is printing the correct array but maybe queries I have asked more than 600.
But according to my calculations for every index I am doing log(n) queries which should be less than 600 I guess.

Intuition about my code is first I am finding the first index where we encounter an duplicate then I just assigned all index with unique which are less than duplicate index and have not assigned before, now to check what should be the duplicate value I traversed in the vector where I have stored the unique elements.

Solution: 50113086 | CodeChef check out my debugging code, it prints each step what i am doing and initializing. this will make things easier for u to understand the logic

Nice problem.

1 Like

You indeed use more than 600 queries.
I ran your code on a testcase with N = 100, Q = 600 and the array defined by

A[i] = i for 1 <= i <= 21
A[i] = 22 for 22 <= i <= 100

and it takes 745 queries.
From what I can see, you use ~9 queries at almost every position, which is a bit too high here.

Here is the code modified to run on this testcase, it also prints the number of queries at the end.

(Side note: in the future, when asking for help with debugging your code please try to minimize the use of macros if possible - they make the code pretty hard to read and parse)

1 Like

Ohh, somehow I was not able to get the upper bound of my program. Acha by the way how to run these large input N=100, like doing it manually is itself a task.
Aur, for next time I will try make the code more readable to the other person. Thanks for time.

I didn’t run your code manually on N = 100. If you look at the link I sent, I modified your code slightly so that the answer to a query is computed within the program itself.

In general you can write an interactor yourself which will provide the result given a query.

Very nice problem and awesome editorial !

2 Likes

I think there is a little typo in the last line of simplified problem statement.

which then returns the mode of the multiset

The judge doesn’t return the mode of the multiset but only the frequency of the mode.

You’re right, thanks for letting me know.
Fixed.

1 Like

My Solution is giving a wrong answer, please anybody help me with this.

Your link is wrong.

Looking at your latest submission, it seems you missed taking input after printing the answer to each testcase.
From the problem statement:

Now you must read an integer denoting that your answer is correct or not.
If your answer is correct, judge prints ‘1’. In this case, you should continue solving the remaining test
If your answer is incorrect, judge prints ‘-1’ and exits with wrong answer verdict. You must also terminate your program in this case otherwise, you may receive any verdict.

Ohh sorry updated now!!

thankyou so much, Latest one but still it giving the wrong answer, please help me out if possible

In your binary search, ans is not initialized and you don’t ever set its value so pre[ans] probably returns garbage (and running your code locally gives me a runtime error for this reason)

Thank you sir!! Accepted

this editorial became of my favourites of all time in codechef
thak you so much

1 Like