Challenge for those annoyed by weak test cases

carre · August 6, 2020, 6:49pm

I do not consider myself one of those users who usually complains about testcases but I found really interesting your challenge:

case

1 1
1 1

answer

0

spaanse · August 6, 2020, 6:56pm

WA; output=1
Although I should have stated that the problem from the course specified that it must be as simple graph; I will modify my first post. There is a test case for which my program fails without self-loops.

carre · August 6, 2020, 7:16pm

ok, let’s go on with edge cases without self loops then:

Complete graph

190 20
5 16
8 13
2 12
1 13
3 9
16 19
17 20
3 15
4 6
3 5
3 12
4 12
5 17
7 9
10 14
13 18
2 20
9 20
13 17
6 10
12 19
4 8
14 20
10 20
2 7
7 12
14 15
10 18
16 20
8 15
10 12
6 9
1 10
9 17
6 12
7 17
9 13
4 10
10 16
11 19
4 11
9 11
2 8
6 7
1 14
7 8
9 12
10 15
1 12
1 20
9 10
14 18
12 15
1 16
1 7
5 9
2 15
8 16
2 5
15 19
1 11
12 13
6 15
3 10
1 8
4 15
1 9
16 18
6 19
3 7
3 18
2 13
17 18
11 15
1 3
7 18
9 15
1 5
2 14
11 20
8 10
6 13
13 20
13 14
4 18
4 7
4 5
11 14
2 4
8 17
6 11
8 12
6 20
13 15
7 10
13 16
5 20
8 14
5 19
3 11
14 16
16 17
7 15
8 9
5 6
9 16
3 17
10 17
3 4
5 8
5 11
15 17
2 16
19 20
15 20
13 19
3 16
2 9
11 16
11 12
4 13
9 14
11 17
7 13
10 13
14 17
8 18
2 11
5 15
12 20
7 14
5 13
5 12
4 20
1 18
8 19
4 9
3 13
1 19
9 18
5 14
17 19
1 6
15 18
2 6
3 6
5 18
10 11
3 14
12 18
18 19
1 15
2 18
6 8
6 14
5 10
8 20
4 14
7 19
1 2
12 16
18 20
2 3
3 19
2 19
15 16
2 17
7 20
11 18
3 20
6 18
4 16
4 17
5 7
6 16
14 19
7 11
9 19
2 10
6 17
1 17
11 13
1 4
10 19
12 17
4 19
3 8
7 16
12 14
8 11

Answer=1

spaanse · August 6, 2020, 7:20pm

Not a valid test case:

Without that restriction the problem would have been NP-hard

carre · August 6, 2020, 7:22pm

ups sorry…yes!!
Anyway…check please in the post if N is the number of nodes or edges…because it’s said that it’s the number of edges but then in the input format it referes to the number of nodes

good_initiativ · August 6, 2020, 7:23pm

@ssrivastava990 , its a must to tag him here lol

spaanse · August 6, 2020, 7:36pm

the number of vertices and edges are indeed the wrong way around to what we would expect. From the original assignment:

The first line contains integers n, m and k, (1\leq n\leq 200, 1\leq m\leq 100, 1\leq k\leq 20) which denote the number of streets in Nijmegen, the number of intersections in Nijmegen and the number of bins which are already ordered. It is given that on every intersection a maximum of 4 streets meet.

I have removed the story behind the task; and removed the k value which was used to make it a decision problem instead of an optimization problem.

However I won’t mind if anyone accidentally swapped them.

carre · August 6, 2020, 7:52pm

Case

180 100
88 98
48 49
19 29
6 16
50 60
10 20
73 83
7 17
67 68
35 45
33 34
39 49
62 72
69 79
99 100
38 48
15 25
78 79
36 37
75 85
23 24
77 87
44 45
92 93
2 3
96 97
27 37
45 55
64 65
52 62
63 73
90 100
86 96
18 19
33 43
85 86
38 39
21 22
54 64
11 21
45 46
43 53
12 22
5 6
72 82
59 69
80 90
31 32
23 33
87 88
86 87
93 94
21 31
78 88
14 15
25 26
2 12
26 36
17 18
47 57
18 28
12 13
61 62
95 96
52 53
42 43
17 27
68 78
28 29
73 74
40 50
8 9
13 23
42 52
48 58
24 25
25 35
4 5
75 76
57 67
1 2
3 13
57 58
70 80
74 84
65 66
30 40
15 16
85 95
1 11
20 30
97 98
88 89
82 92
34 44
46 56
36 46
89 99
44 54
14 24
49 59
66 67
46 47
49 50
29 30
69 70
35 36
22 23
98 99
6 7
56 57
68 69
83 84
29 39
58 68
24 34
84 85
82 83
58 59
91 92
8 18
41 42
54 55
55 56
51 52
4 14
71 81
16 26
9 10
61 71
76 86
28 38
94 95
74 75
67 77
81 91
13 14
11 12
32 42
7 8
59 60
9 19
39 40
27 28
66 76
71 72
77 78
5 15
51 61
19 20
62 63
89 90
22 32
84 94
16 17
32 33
83 93
76 77
56 66
60 70
65 75
34 35
72 73
55 65
41 51
81 82
37 47
79 89
64 74
53 54
63 64
47 48
31 41
53 63
26 27
3 4
43 44
87 97
37 38
79 80

Answer=50

hari_2666 · August 6, 2020, 8:03pm

codeforces has good test cases almost all the time
and what is wrong when community is annoyed???
they expect good testcases for contests
thats not wrong
not able to produce strong test cases almost in every contest is codechef testers issue.

I mean if a company releases a defective product, its their issue.
it cannot just challenge everyone saying its tough

PS: i am not against u…i am just against codechef

spaanse · August 6, 2020, 8:07pm

I assume that case is a 10x10 grid
And then
WA ; result = 49 is given as a result.

What I haven’t mentioned is that my approach was non-deterministic.
It would still fit in the time limit to do 10x as many checks in which case it would receive AC, but I think that would be cheating on my part.

The test case I had in mind was a straight line. It was funny for me to make an algorithm that could pass a lot of the difficult test cases but fail for the simple ones

In my final submission I had an extra step before running the non-deterministic part to prevent the line from giving a WA; it seems your test case also defeats the better algorithm.

carre · August 6, 2020, 8:14pm

I hope my intuitive answer is correct

If that’s the case talk to you school professor to hire mi as a tester!

spaanse · August 6, 2020, 8:39pm

Personally I am not annoyed by weak test cases. I do however notice in this forum that there is a lot of complaining about weak test cases. The point I want to make is that while complaining and finding weak test cases afterwards is easy, thinking about them beforehand is hard.

Of course problem setters should try their best at making good test cases. After a recent contest I saw a setter explain why some of his test cases were wrong/weak, and I think he should be applauded for that. I believe we should be more open to work other people do, how difficult that is, and appreciate them from doing it.

Then about Codeforces vs Codechef: I like this platform better. I like algorithmic thinking a lot more than implementing. I believe this platform is a better place for that with nice problems, the long contest and especially the good editorials. From the few times I participated on Codeforces I started disliking it. It feels it is more about making a robust implementation in a short amount of time than developing algorithms. Especially with the extra focus on the hacking phase.

carre · August 6, 2020, 11:28pm

Do you consider weak test cases to be a frequent issue in codechef?

rahulmysuru7 · August 7, 2020, 1:11am

Yes,Weak Testcases did happen frequently.
June long, GUESSG and CONVAIR
July long, LCM Constraints.
May cookoff CHEFSHIP
@carre
I do appreciate the effort of the setter.
But,during long contest before 5th day, if weak test cases are pointed out on forum, please try to add those test cases for long challenges

carre · August 7, 2020, 2:12am

GUESSG? I don’t remember reading about many unintended approaches that got 100% points on that problem. It may be the case that I missed the post. I remember only 1 approach discussed. Obviously, I do not consider cases to be weak if only a few unintended approaches succeed in obtaining AC. I don’t comment on the others problems you mention because I don’t know, but I’m surprised to see this one on the list.

everule1 · August 7, 2020, 3:14am

Did you have some reasoning to the test case you constructed?

piyush33patel · August 7, 2020, 3:18am

Let me add few more.
COOK116B - CHCBOX
LTIME76B - CATFEED
See, I know it is impossible to predict all wrong submissions. No one can do this before hand. So why not let the community help. Why there’s no hacking phase. Let the participants help. Codechef spends so much time and effort on plagiarism checking (As they said in one of their recent blogs). It’s useless.

“our sole intention lies in making our users learn new concepts while competing”
-Codechef’s motto.

But if you learn a concept wrongly, it’s more dreadful than cheating.

Now think like this : there are three accepted submissions.

A cheater - who just blindly copied a code.
A person - who submitted the correct code.
A person - who submitted the incorrect code - thinking his code is correct.
Who is at the bigger loss?
Definitely the third person. The cheater doesn’t know the concept at all. He might learn in future but what about the third person!!

I really think codechef should implement such alternative to capture weak testcases specially in short contests(it would be idiot to implement it in long for obvious reasons). It will also reduce excessive work of testers.(hopefully!)

akshitm16 · August 7, 2020, 3:54am

I don’t think CONVAIR had weak tc.

carre · August 7, 2020, 4:02am

a grid of 10 x 10. Each cell connected with its right and its down neightboard (if exist).

The solution I imagined was picking even cols of even rows and odd cols of odd rows

rahulmysuru7 · August 7, 2020, 11:51pm

Continuous random numbers did gave ac for few sub-tasks with repeated submissions,which shouldn’t have been possible for GUESSG