September challenge test generator please :)

Could somebody share their test generator for challenge? The 4-th test set is confusing me: the way it is described:

“After generating the triples, try again if any of them have equal elements.”

seems to run very long for me. Even checking right after each triple seems to take forever. So instead I did some weird search:

“After generating each triple, if the triple has equal elements, try again and add +1 to count_mistakes. Start again from scratch if count_mistakes > 64.”

Now it generated the test fast, but it felt like my local 4-th type tests were somehow very different still compared to the one visible test of 4th type. (sometimes very different scores)

My generator: http://ideone.com/H144pL

1 Like

How do you know there is a 4th type test included in the 3 visible test files. Following up, how do you separate the score for only the 4th type test file?

2 Likes

Well, for example I could submit such solution:

  1. If the test case is of 4-th type, solve it as best as I can.
  2. If the test case is of another type, first solve it, and then append random operations until a total of 10000 operations have been made.
    Since such a solution scored 22700 points, I made the conclusion that 4-th type existed and I scored 2700 on it.
    But this looked wrong, as all my local tests were making 4000+ operations (why only 2700?! how is this possible?! :D)
2 Likes

These are my Generator - https://www.codechef.com/viewsolution/26617066
(“testcase” namespace on line 217.)
I generated 4th test case wrong. Yeah its lot more confusing.

Reasons -

0. Lol

I ignored this “After generating the triples, try again if any of them have equal elements.”

  1. I failed to figure out that 4th test case is visible.
  2. My final score is 1.25 times my expected score.

I initially figured out that 1st, 3rd are hidden and 2nd, 5th are visible but since my score never increased so I thought 4th is hidden and 3rd is visible :stuck_out_tongue:

2 Likes

Oh that’s interesting. Maybe it’s a case of infinite monkeys typing a Shakespearean soliloquy :stuck_out_tongue: It’s possible that the test generated by the problem setter is a rare case where your algorithm works pretty well. How many tests did you try?

Oh wait. Maybe I missed something. “1. If the test case is of 4-th type”, how do you find this?

“How do I know the test is of 4th type?” - Consider the set of all values present in input. If the size of this set is < 3*n-10, then the test is of 4th type. (The value 10 is not special, anything in [0,20] should work with high probability too.)

“How many tests did you try?” - How many do you want me to try? :slight_smile: Let’s consider this submission (the one with ~22700 score during contest, now 85533: https://www.codechef.com/viewsolution/26589357).

This score 85533 means that both tests of 4th type scored ~ 5533/2 = 2766.5.

Now, if I run that solution through 100 tests, I get the following scores (sorted the order for convenience): http://ideone.com/fZjzxj

The smallest score out of 100 tests is 3377, total score 466227. I think this proves that:

“With very high probability, my test generator is different from official generator.”

Hence the question - what am I doing differently?

@aryanc403 - thanks!
I modified your generator to not ignore that rule and to output 100 tests of type 4: http://ideone.com/GZnwcv (it also ran infinitely long initially - so I added a similar weird search)
And ran those 100 tests on my code too: http://ideone.com/z7uF5v
Now these tests seem even harder on average - min score 3700, total score over 100 tests 491406.

The search continues: could someone provide a generator where this solution scores ~2700 on average like it does in official tests?

Are the official tests of 4th type wrong (by wrong I mean “not generated using the pseudocode described in problem statement”) or am I doing something dumb? :smiley:

2 Likes

Welcome :slight_smile:
Tagging @admin here maybe (s)he can help.
@alei @iscsi

1 Like

Hi Andres96,

Congratulation for the nice result. I think the 2,4 and 5 test case was used in the competition.

Here is the 4th test case generator:

vector<vector<uint64_t> > testCase4(int seed, int N)
{
	std::mt19937 generator(seed);
	std::uniform_int_distribution<int> dist(0, 1);
	const int K = 3;
	vector<vector<uint64_t> > fres(N, vector<uint64_t>(3));
	do
	{
		vector<vector<uint64_t> > res(K*N, vector<uint64_t>(3));
		res[0] = { getRandom64(dist, generator),getRandom64(dist, generator),getRandom64(dist, generator) };
		for (uint32_t i = 1; i < K * N; ++i)
		{
			vector<uint64_t>& pr = res[i];
			pr = res[i - 1];
			random_shuffle(pr.begin(), pr.end());
			const uint32_t MAXT = 10;
			uint64_t tmp = getRandom64(dist, generator) % MAXT;
			switch (tmp)
			{
			case 0:
				pr[0] |= pr[1];
				break;
			case 1: 
				pr[0] &= pr[1];
				break;
			case 2:
				pr[0] ^= pr[1];
				break;
			case 3:
				pr[0] += pr[1];
				break;
			case 4:
				pr[0] -= pr[1];
				break;
			case 5:
				pr[0] = pr[0] - (pr[0] & 0xFFFFFFFF) + ((pr[0] & pr[1]) & 0xFFFFFFFF);
				break;
			case 6:
				pr[0] = pr[0] - (pr[0] & 0xFFFFFFFF) + ((pr[0] | pr[1]) & 0xFFFFFFFF);
				break;
			case 7:
				pr[0] = pr[0] - (pr[0] & 0xFFFFFFFF) + ((pr[0] ^ pr[1]) & 0xFFFFFFFF);
				break;
			case 8:
				pr[0] = pr[0] - (pr[0] & 0xFFFFFFFF) + ((pr[0] + pr[1]) & 0xFFFFFFFF);
				break;
			case 9:
				pr[0] = pr[0] - (pr[0] & 0xFFFFFFFF) + ((pr[0] - pr[1]) & 0xFFFFFFFF);
				break;
			default:
				break;
			}
		}
		for (uint32_t i = 0; i < N; ++i)
			fres[i] = res[i*K];
	} while (!isTripletsValid(fres, N));
	return fres;
}

the isTripletsValid is checking the triplets not contains the same item more than once, and also the triplets are different.

2 Likes

Thanks!

Wait, random shuffle?! I thought “Add the values in the three registers (in a fixed order) to the list L.” would mean the registers don’t get shuffled and just the value of third register is used every time. (exactly the same as pwild: https://codeforces.com/blog/entry/69561?#comment-543257)

Well, the scores make perfect sense now, thanks! I just think this random shuffle should’ve been explicitly mentioned in the problem statement - very unexpected.

I am really sorry if the generation description was confusing.

Also it would be really interesting if you can share your approach in the editorial page or somewhere.

Wait, I think there is more difference besides the random shuffle: I think you add all 3 registers as one tuple, then make 3 moves, then add another tuple with all 3 registers?!

In this case, wouldn’t a solution be:

  1. set the registers to be equal to the first tuple
  2. for each next tuple, try all possibilities of the next 3 moves in O(60^3)

Wouldn’t that give a solution for 4th type that uses only ~300 operations?
How I filled in the blanks for your generator: http://ideone.com/LzGXTn
And here is above solution that seems to solve using only ~300 op: http://ideone.com/cxHcHH
But it fails when submitting online :confused: Anyway is this supposed to be a valid approach? I’m doing something different again?! :frowning:

the isTripletsValid is little bit different at least.

bool isTripletsValid(vector<vector<uint64_t> > tr, int N)
{
	for (size_t i = 0; i < tr.size(); ++i)
	{
		sort(tr[i].begin(), tr[i].end());
		if (tr[i][0] == tr[i][1])
			return false;
		if (tr[i][0] == tr[i][2])
			return false;
		if (tr[i][1] == tr[i][2])
			return false;
	}
	tr.erase(unique(tr.begin(), tr.end()), tr.end());
	if (tr.size() != N)
		return false;
	return true;
}

During the contest, I made the same mistaken interpretation as pwild as to what case 4 was; I thought it was the same register R over time (so first test triple is [R(t=0), R(t=1), R(t=2)], then the next triple is [R(t=3), R(t=4), R(t=5)] and so on). andres96 is correct that with the given generator, a simple 60**3 search would suffice for every triple after the first. In fact, because many possible operations within the 3 alloted could cancel each other out (like two xors), my local testing has many instances where only 1 or 2 operations are required. Thus, I get scores in the 230-260 range. I have submitted a modified version to the practice room version of the challenge problem to take advantage, but scores aren’t being calculated so that I can prove that this is working.

2 Likes