CHEFQUE - Editorial

bitset
cook65
easy-medium
editorial
hashing

#1

PROBLEM LINK:

Practice
Contest

Author: Misha Chorniy
Tester: Jingbo Shang
Editorialist: Pushkar Mishra

DIFFICULTY:

Easy-Medium

PREREQUISITES:

Hash-Table

PROBLEM:

Given two types of queries:

  1. Add the number to the set of numbers being maintained.
  2. Remove the given number from the set if it exists in it.

The numbers to be inserted/deleted are generated according to the following scheme:
S = (a*S_{i-1} + b) mod 2^{32},
where a and b are integers.

The sum of the numbers finally remaining in the set after all the operations is to be reported.

EXPLANATION:

Given the number of operations to be executed, i.e., q = 10^7, it should be clear that the insertions and deletions from the set are to be done in \mathcal{O}(1). Use of STL or pre-written libraries in C++/Java can’t be used since they will definitely time out because of associated overheads.

Hash Table Solution
One way is to write our own hash table. Editorialist’s solution provides a standard implementation of Hash-Table with chaining to handle collision. For chaining, linked list has been used in the given solution. You can read more about it over the internet.

Bitset Solution
A neater and more concise solution is using Bitset. Let’s think about the naive solution to this problem first. If we could simply keep an array of length 2^{32}-1, then direct insert/lookup in the array would have been possible. The the memory constraints don’t allow that big an array. However, let us look at the constraints on the number being inserted/deleted into the set. We know they lie between 0 and 2^{32}-1, i.e., they can be represented by 31 bits. Now, we can use a smart way to index numbers. Let us keep an array of length 2^{26}. This is perfectly fine with the memory limits. Now, for a 31 bit integer x, let the first 26 bits indicate the index in the array where x will reside. The rest 5 bits can be encoded in a 32 bit integer since only different 2^5 possibilities exist. Please read setter’s/tester’s solutions for this approach.

There is one more intricacy here: calculating S_i. If we use the normal mod operation present in some of the programming languages, the solution will exceed time limit. This is because mod is a complex operation and performing it 10^7 times will not work. The observation is that we have to take mod 2^{32}. If we keep the data type of S, a, b to be unsigned, in that case, we will not have to use mod operation because the range of unsigned int is 0 to 2^{32}-1. Thus, if S goes over 2^{32} it will automatically overflow around to greater than 0. In other words, the compiler will perform the mod function implicitly because of the overflow.

COMPLEXITY:

\mathcal{O}(1) per insert and delete operation.

SAMPLE SOLUTIONS:

Author
Tester
Editorialist


#2

Umm…I declared a bool vector of length 2^(31) and got AC. Is it possible to do so or I just got lucky here? :stuck_out_tongue:

My solution: https://www.codechef.com/viewsolution/8987936


#3

@pranavarora
vector < bool > <-> bitset in C++ in terms of memory, vector < bool > v(N) use N / 8 bytes.


#4

Did it using my own definition of hashtable by handling of collisions through chaining implemented through linked list.
Solution link:https://www.codechef.com/viewsolution/8988520


#5

I got segfault with the same idea as @pranavarora and tle with the same idea as editorialist.


#6

I got it accepted using hashtable approach mentioned above same as editorialist’s way…


#7

The timelimit wasn’t struct enough to prohibit all O(Nlog(N)) approaches. Sorting worked in time.
Put the ith queries in a array as triplet (no, i,b), where no is the number of element and b 1 for insertion and 0 for delete. After sorting counting is easy in a single pass.


#8

@ashish1610

Because ar.resize() allocate many memory, and stack not enough for it. You should allocate memory globally.
http://pastie.org/10643748


#9

Was really a tough question to crack. I had heard of bitset earliar but never used it. Tried to read it concept during the contest but couldn’t understand it properly. But then got a link to this https://discuss.codechef.com/questions/72572/marking-large-integers which helped a lot. I also tried to use unordered map (which is available in C++14 as C++4.9.2 gave compile error) but in some worst case in give O(n) complexity due to rehashing. So, I avoided it.

Finally, a very naive implementation of the idea given in the above link helped to get AC. (although the 2ns input provided- 10000000 777777777 777777777 777777777, took 2.16 sec on Codechef ide and 2.37 sec on my computer to run). So, just to try my luck, at the last moments of the contest, i made and submission and was happy to get AC on the first attempt.

Here is a link to my ACed solution https://www.codechef.com/viewsolution/8989357


#10

@mgch see my first submission. Link : https://www.codechef.com/viewsolution/8988268

And this is my solution with hashing : https://www.codechef.com/viewsolution/8988506


#11

@ashish1610
bool arr[N] and vector < bool > arv(N) it’s different things, arr use N bytes of memory, arv use N / 8 bytes.


#12

@mgch Yeah got that difference. But still no idea why hashing failed.


#13

@ashish1610

Because you have many operations with vector, if you firstly do v*.reserve for all i = 0…10^7.
I’ve added two optimizations, and your code got AC:

  1. First vector < ll > -> vector < int >, long long works very long
  2. Choose modulo as 2^N, then % mod <-> & (mod - 1), modulo operations works long too
    http://pastie.org/10643774

#14

when will the problems be made public for practice?


#15

When I run this code https://www.codechef.com/viewsolution/8987559 on my machine, codechef compile and run or ideone it is showing memory exhausted so how do I run this code?


#16

The problems are not available for practice :frowning:


#17

Can this question be solved using BitSet Class in Java?
If so anyone can tell its implementation?


#18

this problem shows how much u know about max size of arrays in programming .
we can make 2^30 size bool array in c++
but if we use vector then this size goes up to 2^33, in this question we need 2^31 bool values,
so this problem can be solved by using

vector a(1LL<<31L,false);

here is my solution
https://www.codechef.com/viewsolution/8990985


#19

Each solution link has been broken… :frowning:


#20

I’m getting wrong answer in this code. :frowning: I’m trying hashing with linked list. Would anybody please give a look?

ideone link