Weak test cases for PHCUL (NOV Long Challenge) and Unfair problem for Python users

bhagatdivesh21 · November 13, 2019, 4:45pm

Problem PHCUL of NOV Long Challenge 2019 has weak test cases.

Test cases generated here :

https://www.codechef.com/viewsolution/27876282
The solution in the above link fails to pass the above test case as Time complexity of the code in the worst case is O(N^3). But it still gets AC due to weak test cases.

Also, this problem was very unfair to Python users as one of my friend @ankushkhanna, when implemented his optimised code in Python in Practice Section (originally written in C++ during contest), it received TLE. While it got AC with very less time in C++. Also many Python users had hard time getting AC for PHCUL during contest.
Python Code : CodeChef: Practical coding for everyone
C++ Code : CodeChef: Practical coding for everyone
P.S. - Above C++ code passes the test cases stated in the beginning as its worst case Time Complexity can be said as approx O(N^2).

Heartiest thanks to @ankushkhanna for implementing the generator for the test case shown to him by me via a digram, building his C++ code in Python in most optimised version possible and helping me to verify that test cases are weak.

I request Admin/Problem Setter/Tester to kindly look into this matter.

top_contributo · November 13, 2019, 4:52pm

4th problem of div2 (CAMC),which was replacement of a problem which was removed due to already found on internet , was also already available on geeksforgeeks

anon2040365 · November 13, 2019, 5:07pm

No, the test cases aren’t weak. That O(n^3) solution is the optimized one.
Before every loop there is an if condition, than only the internal loops will run.

Even i too implemented using O(n^3).
Check my case,
My 50 points solution -
https://www.codechef.com/viewsolution/27718873
My 100 points solution(just added if condition as the solution link you provided, it ran in 0.0 second) - CodeChef: Practical coding for everyone

vijju123 · November 13, 2019, 5:11pm

Did not read the content of your post fully, but please get this clearly-

For any problem, you have a variety of languages. The languages can be very different from each other (esp. Compiled vs Interpretted vs Hybrid) which causes some language to be innately slow. Some languages have an edge over others in some aspects (C++ being faster than python, python having lot of inbuilt function and libraries etc). So its very hard to level this palying field.

Hence, especially with respect to python, it is widely known that not every problem can be solved in python despite being highly optimized. Its just too slow. And if you level that up some insane amount of time multiplier, you run risk of other problems as well (High time taken in giving results, big queue, People complaining python has tooo much of an edge over others).

So yeah, most probably nothing will be done from admin side because I myself do not find your claim that “Python solution must compulsorily pass” to be very strong.

ankushkhanna · November 13, 2019, 5:15pm

Well, I would request you to please check your solution again the test case generated by the Python script in the link provided by @bhagatdivesh21. I just tested your code on my system against that test, and I do have a pretty decent system specification. And it’s been more than 3 minutes now. So, it is a TLE even with C++14. You surely can verify that.

bhagatdivesh21 · November 13, 2019, 5:16pm

Your answer would also fail in the above listed test case pasted in Ideone.

ankushkhanna · November 13, 2019, 5:19pm

I do agree with what you are saying @vijju123, but you can even verify a C++14 code failing for this test set class.
The 100 point solution by @anon2040365 will definitely give a TLE for this test case generated by the above mentioned generator Python script. And this should not be the case, that really does imply the test cases were weak for this problem.

vijju123 · November 13, 2019, 5:40pm

Weak test case is something the contest admin should see. Ideally someone should tag the contest admin at the contest invitation thread and bring it to his notice.

My post was strictly on “Python soln not passing” part of the post.

ankushkhanna · November 13, 2019, 5:56pm

And I just let @anon2040365 solution run on my system and complete this test case, and it took a whopping 13 minutes and 30 seconds to pass this test. Whereas the C++14 solution that I submitted during the contest took just 0.48s for the same test on my system. Here is the proof.

This is @anon2040365 C++14 solution against the test case generated by the above Python Script.

Also, it is not in a fixed floating point format, I’m not sure if that’s a problem with CodeChef judge. I strongly think that it is. If not please correct me.

This one is the performance of my C++14 solution that I submitted during the contest.

The point is that both of them received a full score (100 points). And both of them have very different time complexities, and 13 minutes is nowhere justified, for this being a valid test case in all respects. So, I think there is no better way to express that the test cases were way to weak for this problem during the contest. There might be many other solutions which might not pass the second sub task, but have passed due to weak test cases. I strongly feel that something should be done about it.

@admin @vijju123 @yash_chandnani @alei @vladprog @watcher

mohit_11615192 · November 13, 2019, 6:09pm

hey my solution is in python with O(n**3*T) time complexity . but it is accepted as i am breaking it as soon as i get the calculated distance > min(distance calculated so far) .
here is the link : CodeChef: Practical coding for everyone

bhagatdivesh21 · November 13, 2019, 6:18pm

Bro, the Main point is the weak test case for PHCUL. Your solution is similar to code whose link I have posted in the beginning, and your solution would also fail in the above-listed test case (test case link posted in Ideone).
@ankushkhanna I guess you should edit this post. I think many people are getting distracted from the main point.

(Unfair to Python users was just another subpoint)
But for now, everyone just focus on the weak test case for the problem.

ssjgz · November 13, 2019, 6:25pm

Yes, that’s the exact situation the testcase in the OP is designed to break - it makes yours TLE (it’s been running for over 10 minutes so far)

Edit:

Here’s a direct link to the generated testcase for people who don’t want to run the python testcase generator.
Edit2:

Re: the secondary point about unfairness for Python users - take a look at the variance in these (apparently quite recent) benchmarks:

In the “pidigits” benchmark, the performance gap is quite narrow - python3 is approx 2x slower than C++.

In the “reverse-complement” benchmark, the gap is much more pronounced - python is about 10x slower.

In the “fannkuch-redux” benchmark, the slowdown is approx 53x.

The “mandelbrot” benchmark? \huge 171\text{x}!!!

So there’s really no sensible way of making things fair using a fixed multiplier - python’s slowdown depends very heavily on the task being performed.

devil_within · November 13, 2019, 7:46pm

Niceee!

bit_21 · November 13, 2019, 8:50pm

the problem SIMGAM was removed because it was previously used in a Codeforces contest, not because it was present on the internet. And I mean, if you are able to figure out that I need some approach which can give me smallest range from k lists in optimal time, and you understand the code on the geeks and implement the stuff, you are learning. That’s the motive behind the Long Challenge. But yeah, they could have chosen a more appropriate problem. Also CAMC was much difficult than SIMGAM (at least for me).

bit_21 · November 13, 2019, 8:53pm

that’s why I implement the approach I have in C++ too if I get TLE using Python (unless I know it’s a stupid brute force). Well PYPY is fast, use that.

anon2040365 · November 14, 2019, 12:05am

Wow buddy, I had thought too how in the world my code got passed. But seeing 0.0 seconds I thought during the contest maybe it is one of the solutions.