pypy slower than cpython

I was trying to solve CHEFGP Problem - CodeChef using python, as usually pypy is faster, I submitted my solution(based on editorial) in pypy, but I got TLE, for substak1 whereas when I submitted same code through cpython 2, it passed under a second. This seems quite strange as I didn’t use any special features, just string manipulations and loops. Can anybody explain this strange behavior

pypy submission: CodeChef: Practical coding for everyone

cpython submission: CodeChef: Practical coding for everyone

The problem lies in your code. Python strings are not mutable as in C++, so an operation like s += t for string s and t in a loop creates a new string each time leading to quadratic complexity. So a TLE is the expected result. However, according to the documentation (note 6)

CPython implementation detail: If s and t are both strings, some Python implementations such as CPython can usually perform an in-place optimization for assignments of the form s = s + t or s += t. When applicable, this optimization makes quadratic run-time much less likely. This optimization is both version and implementation dependent. For performance sensitive code, it is preferable to use the str.join() method which assures consistent linear concatenation performance across versions and implementations.

Since the optimization is not guaranteed, I think it is better to avoid it. One alternative is to append the pieces one by one to a list and then use str.join() at the end as mentioned above.

1 Like

This is REAL evil. Especially for guys who dont read documentation, just learn syntax and start coding in a language :3

Thanks.I tried submitting solution using ‘’.join((s,t)) but it further slowed down the program. I also tried using list and in the end append all the pieces and it worked.
Why would str.join() work faster than s += t? isn’t it required in both cases to iterate on all characters?

using str.join() : CodeChef: Practical coding for everyone

using list : CodeChef: Practical coding for everyone

Sorry for the late reply. You’re missing the key fact: Python strings are immutable, which means they cannot be modified.

Imagine the strings are lists of characters, but they cannot be modified. So whenever you join two lists you must make a new list which has contents of both the lists. I hope you can see now how this is costly? But if you are allowed to modify the lists, it works as you have seen.