September challenge test generator please :)

andres96 · September 16, 2019, 4:57pm

Could somebody share their test generator for challenge? The 4-th test set is confusing me: the way it is described:

“After generating the triples, try again if any of them have equal elements.”

seems to run very long for me. Even checking right after each triple seems to take forever. So instead I did some weird search:

“After generating each triple, if the triple has equal elements, try again and add +1 to count_mistakes. Start again from scratch if count_mistakes > 64.”

Now it generated the test fast, but it felt like my local 4-th type tests were somehow very different still compared to the one visible test of 4th type. (sometimes very different scores)

My generator: H144pL - Online Python3 Interpreter & Debugging Tool - Ideone.com

amoghk · September 16, 2019, 5:21pm

How do you know there is a 4th type test included in the 3 visible test files. Following up, how do you separate the score for only the 4th type test file?

andres96 · September 16, 2019, 6:00pm

Well, for example I could submit such solution:

If the test case is of 4-th type, solve it as best as I can.
If the test case is of another type, first solve it, and then append random operations until a total of 10000 operations have been made.
Since such a solution scored 22700 points, I made the conclusion that 4-th type existed and I scored 2700 on it.
But this looked wrong, as all my local tests were making 4000+ operations (why only 2700?! how is this possible?! :D)

amoghk · September 16, 2019, 6:15pm

Oh that’s interesting. Maybe it’s a case of infinite monkeys typing a Shakespearean soliloquy It’s possible that the test generated by the problem setter is a rare case where your algorithm works pretty well. How many tests did you try?

amoghk · September 16, 2019, 6:18pm

Oh wait. Maybe I missed something. “1. If the test case is of 4-th type”, how do you find this?

andres96 · September 16, 2019, 9:54pm

“How do I know the test is of 4th type?” - Consider the set of all values present in input. If the size of this set is < 3*n-10, then the test is of 4th type. (The value 10 is not special, anything in [0,20] should work with high probability too.)

“How many tests did you try?” - How many do you want me to try? Let’s consider this submission (the one with ~22700 score during contest, now 85533: CodeChef: Practical coding for everyone).

This score 85533 means that both tests of 4th type scored ~ 5533/2 = 2766.5.

Now, if I run that solution through 100 tests, I get the following scores (sorted the order for convenience): fZjzxj - Online IDE & Debugging Tool - Ideone.com

The smallest score out of 100 tests is 3377, total score 466227. I think this proves that:

“With very high probability, my test generator is different from official generator.”

Hence the question - what am I doing differently?

@aryanc403 - thanks!
I modified your generator to not ignore that rule and to output 100 tests of type 4: GZnwcv - Online C++0x Compiler & Debugging Tool - Ideone.com (it also ran infinitely long initially - so I added a similar weird search)
And ran those 100 tests on my code too: z7uF5v - Online IDE & Debugging Tool - Ideone.com
Now these tests seem even harder on average - min score 3700, total score over 100 tests 491406.

The search continues: could someone provide a generator where this solution scores ~2700 on average like it does in official tests?

Are the official tests of 4th type wrong (by wrong I mean “not generated using the pseudocode described in problem statement”) or am I doing something dumb?

iscsi · September 17, 2019, 10:47am

Hi Andres96,

Congratulation for the nice result. I think the 2,4 and 5 test case was used in the competition.

Here is the 4th test case generator:

vector<vector<uint64_t> > testCase4(int seed, int N)
{
	std::mt19937 generator(seed);
	std::uniform_int_distribution<int> dist(0, 1);
	const int K = 3;
	vector<vector<uint64_t> > fres(N, vector<uint64_t>(3));
	do
	{
		vector<vector<uint64_t> > res(K*N, vector<uint64_t>(3));
		res[0] = { getRandom64(dist, generator),getRandom64(dist, generator),getRandom64(dist, generator) };
		for (uint32_t i = 1; i < K * N; ++i)
		{
			vector<uint64_t>& pr = res[i];
			pr = res[i - 1];
			random_shuffle(pr.begin(), pr.end());
			const uint32_t MAXT = 10;
			uint64_t tmp = getRandom64(dist, generator) % MAXT;
			switch (tmp)
			{
			case 0:
				pr[0] |= pr[1];
				break;
			case 1: 
				pr[0] &= pr[1];
				break;
			case 2:
				pr[0] ^= pr[1];
				break;
			case 3:
				pr[0] += pr[1];
				break;
			case 4:
				pr[0] -= pr[1];
				break;
			case 5:
				pr[0] = pr[0] - (pr[0] & 0xFFFFFFFF) + ((pr[0] & pr[1]) & 0xFFFFFFFF);
				break;
			case 6:
				pr[0] = pr[0] - (pr[0] & 0xFFFFFFFF) + ((pr[0] | pr[1]) & 0xFFFFFFFF);
				break;
			case 7:
				pr[0] = pr[0] - (pr[0] & 0xFFFFFFFF) + ((pr[0] ^ pr[1]) & 0xFFFFFFFF);
				break;
			case 8:
				pr[0] = pr[0] - (pr[0] & 0xFFFFFFFF) + ((pr[0] + pr[1]) & 0xFFFFFFFF);
				break;
			case 9:
				pr[0] = pr[0] - (pr[0] & 0xFFFFFFFF) + ((pr[0] - pr[1]) & 0xFFFFFFFF);
				break;
			default:
				break;
			}
		}
		for (uint32_t i = 0; i < N; ++i)
			fres[i] = res[i*K];
	} while (!isTripletsValid(fres, N));
	return fres;
}

the isTripletsValid is checking the triplets not contains the same item more than once, and also the triplets are different.

andres96 · September 17, 2019, 11:58am

Thanks!

Wait, random shuffle?! I thought “Add the values in the three registers (in a fixed order) to the list L.” would mean the registers don’t get shuffled and just the value of third register is used every time. (exactly the same as pwild: CodeChef September Long Challenge 2019 - Codeforces)

Well, the scores make perfect sense now, thanks! I just think this random shuffle should’ve been explicitly mentioned in the problem statement - very unexpected.

iscsi · September 17, 2019, 12:38pm

I am really sorry if the generation description was confusing.

Also it would be really interesting if you can share your approach in the editorial page or somewhere.

andres96 · September 17, 2019, 1:35pm

Wait, I think there is more difference besides the random shuffle: I think you add all 3 registers as one tuple, then make 3 moves, then add another tuple with all 3 registers?!

In this case, wouldn’t a solution be:

set the registers to be equal to the first tuple
for each next tuple, try all possibilities of the next 3 moves in O(60^3)

Wouldn’t that give a solution for 4th type that uses only ~300 operations?
How I filled in the blanks for your generator: LzGXTn - Online C++0x Compiler & Debugging Tool - Ideone.com
And here is above solution that seems to solve using only ~300 op: http://ideone.com/cxHcHH
But it fails when submitting online Anyway is this supposed to be a valid approach? I’m doing something different again?!

iscsi · September 17, 2019, 2:44pm

the isTripletsValid is little bit different at least.

bool isTripletsValid(vector<vector<uint64_t> > tr, int N)
{
	for (size_t i = 0; i < tr.size(); ++i)
	{
		sort(tr[i].begin(), tr[i].end());
		if (tr[i][0] == tr[i][1])
			return false;
		if (tr[i][0] == tr[i][2])
			return false;
		if (tr[i][1] == tr[i][2])
			return false;
	}
	tr.erase(unique(tr.begin(), tr.end()), tr.end());
	if (tr.size() != N)
		return false;
	return true;
}

narri · September 18, 2019, 2:53am

During the contest, I made the same mistaken interpretation as pwild as to what case 4 was; I thought it was the same register R over time (so first test triple is [R(t=0), R(t=1), R(t=2)], then the next triple is [R(t=3), R(t=4), R(t=5)] and so on). andres96 is correct that with the given generator, a simple 60**3 search would suffice for every triple after the first. In fact, because many possible operations within the 3 alloted could cancel each other out (like two xors), my local testing has many instances where only 1 or 2 operations are required. Thus, I get scores in the 230-260 range. I have submitted a modified version to the practice room version of the challenge problem to take advantage, but scores aren’t being calculated so that I can prove that this is working.