MEXUM subtask2 TLE, kindly suggest changes

mridulch · April 27, 2020, 9:24am

def power(a,b):
    if b==0:
        return 1
    res=power(a,int(b/2))
    if b%2==1:
        return res*res*a
    else:
        return res*res

t=int(input())
for i in range(0,t):
    n=int(input())
    a=list(map(int,input().split()))
    a.sort()
    d={}
    for i in range(0,n+2):
        if i==0:
            d[i]=1
        else:
            d[i]=0
    for i in range(0,n):
        if a[i]>n:
            a[i]=n+1
        d[a[i]]=d[a[i]]+1
    
    total=0
    val=1
    m=0
    #print(d)
    for i in range(1,n+2):
        val=val*(power(2,d[i-1])-1)
        m=m+d[i]
        total=total+i*(power(2,n-m))*val
        #print(i,"values",m,val,(2**(n-m))*val)
        
    print(total%998244353)

This was solution for MEXUM( MEXUM Problem - CodeChef )
solution link : CodeChef: Practical coding for everyone
I am getting a AC in subtask1 but TLE in subtask2.
Kindly Suggest Optimizations in the code.
( Thanks @anon71312354 for your help in correcting my modulo mistake )

anon71312354 · April 27, 2020, 9:56am

Modulo!?
I don’t know much about integer limits in Python, but the answer is going to be a huge one. I suggest you take modulo at every step.

Plus, the TL in bigger case is probably because of you exponent time complexity. Simply taking power is costly. Make your own modulo exponent function that works in logN.

const int MOD = 998244353;
typedef long long ll;
ll powN(ll a,ll p)
{
    if(p==0) return 1;
    ll z=powN(a,p/2);
    z=(z*z)%MOD;
    if(p%2) z=(z*a)%MOD;
    return z;
}

This is an example code for C++. I hope you can translate that to Python quite easily.

mridulch · April 27, 2020, 10:03am

thank you anand, the modulo mistake was really a stupid one, i will try the modulo exponent function and try the same

mridulch · April 27, 2020, 10:20am

anand i corrected the modulo mistake and got AC in the subtask1, and i also used binary exponentiation for the power function as suggested by you, still i am getting a TLE for subtask2.
Can anyone please suggest a even further optimization.

def power(a,b):
    if b==0:
        return 1
    res=power(a,int(b/2))
    if b%2==1:
        return res*res*a
    else:
        return res*res

t=int(input())
for i in range(0,t):
    n=int(input())
    a=list(map(int,input().split()))
    a.sort()
    d={}
    for i in range(0,n+2):
        if i==0:
            d[i]=1
        else:
            d[i]=0
    for i in range(0,n):
        if a[i]>n:
            a[i]=n+1
        d[a[i]]=d[a[i]]+1
    
    total=0
    val=1
    m=0
    #print(d)
    for i in range(1,n+2):
        val=val*(power(2,d[i-1])-1)
        m=m+d[i]
        total=total+i*(power(2,n-m))*val
        #print(i,"values",m,val,(2**(n-m))*val)
        
    print(total%998244353)

solution link: CodeChef: Practical coding for everyone

everule1 · April 27, 2020, 10:45am

I would recommend using the inbuilt pow
pow(a,b,c) gives a^b \mod c. Unlike C++, in python pow() is actually useful. Secondly take modulo at every step, not just the last step, because the numbers are becoming very big.

mridulch · April 27, 2020, 10:58am

i used the inbuilt pow function before this @everule1 but got TLE even then.
But i will try taking modulo at each step as you suggested.

everule1 · April 27, 2020, 11:02am

It won’t give you TLE if you pass c=998244353 and take mod at each step. Also you should probably declare some variable like mod=998244353.

mridulch · April 27, 2020, 11:14am

i did this and got an AC. Thank you @everule1
But i am not able to understand why this happened, because the complexity of the main code doesn’t change if i introduce a mod at each step end, that just helps to prevent overflow, how does the time taken reduce?

mod=998244353
t=int(input())
for i in range(0,t):
    n=int(input())
    a=list(map(int,input().split()))
    a.sort()
    d={}
    for i in range(0,n+2):
        if i==0:
            d[i]=1
        else:
            d[i]=0
    for i in range(0,n):
        if a[i]>n:
            a[i]=n+1
        d[a[i]]=d[a[i]]+1
    
    total=0
    val=1
    m=0
    #print(d)
    for i in range(1,n+2):
        val=val*(pow(2,d[i-1],mod)-1)
        m=m+d[i]
        total=(total+i*(pow(2,n-m,mod))*val)%mod
        #print(i,"values",m,val,(2**(n-m))*val)
        
    print(total%mod)

this is my new code

bravotango · April 27, 2020, 11:29am

In python numbers which cannot be handled by 64 bits are handled differently. Instead of being direrctly handled by the CPU, the calculations are performed in a much more tedious way. This makes addition,subtraction,multiplication,division and modulo on such huge numbers take a lot more time than numbers which fit in 64 bits.
You can read more about it here
https://www.reddit.com/r/Python/comments/39p57f/how_can_python_deal_with_bigger_numbers_than/

everule1 · April 27, 2020, 11:32am

There’s a very simple way to understand. What will take more time, multiplying 9 digit integers, or 10^5 digit integers?

mridulch · April 27, 2020, 12:26pm

Thank you @everule1 @bravotango, i understand the process now.