COOLCHEF - EDITORIAL

PROBLEM LINK:

Practice

Contest: Division 1

Contest: Division 2

Setter: Reshab Gupta

Tester: Radoslav Dimitrov

Editorialist: Teja Vardhan Reddy

DIFFICULTY:

Medium

PREREQUISITES:

Square root decompostion.

PROBLEM:

Given array A of n numbers. Answer q queries of the following type on it in an online fashion (i.e you get the next query only after previous query is answered).

Query: given l,r. number of different orderings of the subarray from l to r (two orderings are different if value at some position is different).

EXPLANATION

Lets us see what will the answer for an array B having k distinct elements with ith element occuring C_i times in B. So B has a total of \sum\limits_{i=1}^kC_i (lets say m) elements in it.
Now, total number of orderings will be \frac{m!}{\prod\limits_{i=1}^kC_i!}. We can see this as first we chose positions of B_1 , then B_2 among left and so on.

Now to answer a query , we need the denominator mainly because numerator will simply be (r-l+1)!.

So, denominator will be product of factorial of frequencies of elements.

Since we only focus on frequencies, we can map down all the numbers in the array to 1 to n.
From now on , we will only focus on calculating the denominator.

Assume if we need not answer queries in an online fashion, we could have directly used the MO algorithm to solve the problem in O((n+q)\sqrt{n}). (Subtask 2 can be solved using this). Resource for MO algorithm.

Also there is a online variant of MO algorithm which can be directly applied here to solve the problem in O((n + q) * n^{2/3}) complexity. (Subtask 3 can be solved using this) Resource 1 Resource 2.

Since the intended solution for full points is totally different, I have added links to blog posts of MO algorithm and online version of it regarding its resources.

Now we will see how to solve it for 100 pts.

Let us divide the array into blocks of size B each. there will be a total of n/B blocks.
Let us denote ans[i][j] as product of factorial of frequencies of elements considering blocks i,i+1,i+2,....,j.

For a fixed i, we can compute ans[i][k] \forall k (i \leq k \leq n/B) in one scan across the array.

We will iterate over all i and find all ans[i][j] accordingly.

Also we will maintain a vector for each element containing the counts of its occurences of that element till each block from start (something like prefix count till that block). This will be of size n/B for each distinct element. This will be helpful to calculate count of element x from block l to r which will be useful further.

With this armor, lets try to face the queries :frowning: .

Given an l,r. we can split [l,r] as a disjoint union of atmost n/B consecutive blocks and atmost B prefix elements and atmost B suffix elements.

Now we can use ans array to find product of factorial of frequencies of elements covered by the blocks. And now, we also want to update this answer because some elements are not covered by the blocks. But there are atmost 2*B elements which are outside the blocks and their frequencies need to be updated to get the required answer. So we find frequency of each of these elements in blocks contained in range [l,r] and update answer by multiplying it appropriately considering the count of frequency outside to the blocks. And finally we arrive at the correct value of denominator and then we can compute answer here upon.

TIME COMPLEXITY

For a fixed i, we can compute ans[i][k] \forall k(i \leq k \leq n/B) in one scan across the array which takes O(n) complexity. So total complexity is O(n*(n/B)).

To compute prefix count across blocks for each distinct number we will need O(n/B) time.

Across all the distinct elements, it will take O(n^2/B)

Now for each query, we update answer atmost 2*B times. Hence complexity is O(B) per query.

Hence total complexity is O(n^2/B + q*B).

From given constraints, in worst case, n=q. So lets take n=q to derive best B for worst case.

B=\sqrt{n} minimises (n^2/B + n*B) which can be obtained by differentiating the expression with B and equating to zero.

Hence, taking B as \sqrt{n}

\therefore total complexity is O(n\sqrt{n}).

SOLUTIONS:

Setter's Solution
/** source code by coolreshab **/
 
#include<bits/stdc++.h>
using namespace std;
 
#define MAX 300000
#define BLOCK_SIZE 550
#define inf 1000000000
#define mod 1000000007
#define ll long long
 
int arr[MAX+5],fact[MAX+5],ifact[MAX+5],freq[MAX+5];
int preAns[BLOCK_SIZE+5][BLOCK_SIZE+5];
int preFreq[BLOCK_SIZE+5][MAX+5];
int uniqueArray[MAX+5];
unordered_map<int,int>hashMap;
 
inline int mul(int x,int y,int MOD)
{
    ll z=1LL*x*y;
    if(z>=MOD)
        z%=MOD;
    return z;
}
inline int add(int x,int y,int MOD)
{
    ll z=x+y;
    if(z>=MOD)
        z%=MOD;
    return z;
}
inline int fast_exp(int x,int y)
{
    int res=1;
    while(y)
    {
        if(y&1)
            res=mul(res,x,mod);
        y>>=1;
        x=mul(x,x,mod);
    }
    return res;
}
void pre(int n)
{
    int i;
    fact[0]=1;
    ifact[0]=1;
    for(i=1;i<=n;++i)
    {
        fact[i]=mul(fact[i-1],i,mod);
        ifact[i]=mul(ifact[i-1],fast_exp(i,mod-2),mod);
    }
}
inline int getId(int pos,int blockSize)
{
    int id=pos/blockSize;
    if(pos%blockSize)
        id+=1;
    return id;
}
inline int getPos(int id,int blockSize,int n)
{
    return min(n-1,id*blockSize);
}
void preComputeAns(int n,int blockSize)
{
    int i,j,id,ans;
    for(i=0;i<n;++i)
    {
        freq[arr[i]]++;
        if(i%blockSize==0)
        {
            id=i/blockSize;
            for(j=0;j<n;++j)
                preFreq[id][j]=freq[j];
        }
    }
    id=getId(n-1,blockSize);
    for(i=0;i<n;++i)
        preFreq[id][i]=freq[i];
    for(i=0;i<n;++i)
        freq[i]=0;
    for(i=0;i<n;i+=blockSize)
    {
        id=i/blockSize;
        ans=1;
        for(j=i;j<n;++j)
        {
            ans=mul(ans,fact[freq[arr[j]]],mod);
            freq[arr[j]]++;
            ans=mul(ans,ifact[freq[arr[j]]],mod);
            if(j%blockSize==0)
                preAns[id][j/blockSize]=ans;
        }
        preAns[id][getId(n-1,blockSize)]=ans;
        for(j=0;j<n;++j)
            freq[j]=0;
    }
}
int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    #ifndef ONLINE_JUDGE
        freopen("/Users/reshab.g/Documents/in.txt","r",stdin);
    #endif
    //freopen("/Users/reshab.g/Documents/out.txt","w",stdout);
     /** source code by coolreshab **/
 
#include<bits/stdc++.h>
using namespace std;
 
#define MAX 300000
#define BLOCK_SIZE 550
#define inf 1000000000
#define mod 1000000007
#define ll long long
 
int arr[MAX+5],fact[MAX+5],ifact[MAX+5],freq[MAX+5];
int preAns[BLOCK_SIZE+5][BLOCK_SIZE+5];
int preFreq[BLOCK_SIZE+5][MAX+5];
int uniqueArray[MAX+5];
unordered_map<int,int>hashMap;
 
inline int mul(int x,int y,int MOD)
{
    ll z=1LL*x*y;
    if(z>=MOD)
        z%=MOD;
    return z;
}
inline int add(int x,int y,int MOD)
{
    ll z=x+y;
    if(z>=MOD)
        z%=MOD;
    return z;
}
inline int fast_exp(int x,int y)
{
    int res=1;
    while(y)
    {
        if(y&1)
            res=mul(res,x,mod);
        y>>=1;
        x=mul(x,x,mod);
    }
    return res;
}
void pre(int n)
{
    int i;
    fact[0]=1;
    ifact[0]=1;
    for(i=1;i<=n;++i)
    {
        fact[i]=mul(fact[i-1],i,mod);
        ifact[i]=mul(ifact[i-1],fast_exp(i,mod-2),mod);
    }
}
inline int getId(int pos,int blockSize)
{
    int id=pos/blockSize;
    if(pos%blockSize)
        id+=1;
    return id;
}
inline int getPos(int id,int blockSize,int n)
{
    return min(n-1,id*blockSize);
}
void preComputeAns(int n,int blockSize)
{
    int i,j,id,ans;
    for(i=0;i<n;++i)
    {
        freq[arr[i]]++;
        if(i%blockSize==0)
        {
            id=i/blockSize;
            for(j=0;j<n;++j)
                preFreq[id][j]=freq[j];
        }
    }
    id=getId(n-1,blockSize);
    for(i=0;i<n;++i)
        preFreq[id][i]=freq[i];
    for(i=0;i<n;++i)
        freq[i]=0;
    for(i=0;i<n;i+=blockSize)
    {
        id=i/blockSize;
        ans=1;
        for(j=i;j<n;++j)
        {
            ans=mul(ans,fact[freq[arr[j]]],mod);
            freq[arr[j]]++;
            ans=mul(ans,ifact[freq[arr[j]]],mod);
            if(j%blockSize==0)
                preAns[id][j/blockSize]=ans;
        }
        preAns[id][getId(n-1,blockSize)]=ans;
        for(j=0;j<n;++j)
            freq[j]=0;
    }
}
int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    #ifndef ONLINE_JUDGE
        freopen("/Users/reshab.g/Documents/in.txt","r",stdin);
    #endif
    //freopen("/Users/reshab.g/Documents/out.txt","w",stdout);
 
    //begin
    int n,q,i,l,r,L,R,l1,r1,L1,L2,R1,R2,z;
    int blockSize,ans,temp,last,len;
    cin>>n>>q;
    pre(n);
    assert(n>=1 and n<=MAX);
    assert(q>=1 and q<=MAX);
    for(i=0;i<n;++i)
    {
        cin>>arr[i];
        assert(arr[i]>=1 and arr[i]<=inf);
        hashMap[arr[i]];
    }
    i=0;
    for(auto&z:hashMap)
        z.second=i++;
    for(i=0;i<n;++i)
        arr[i]=hashMap[arr[i]];
    blockSize=sqrt(n)+5;
    preComputeAns(n,blockSize);
    last=0;
    while(q-->00)
    {
        cin>>L1>>L2>>R1>>R2;
        assert(L1>=0 and L1<n and L2>=0 and L2<n and R1>=0 and R1<n and R2>=0 and R2<n);
        l=add(mul(L1,last,n),L2,n);
        r=add(mul(R1,last,n),R2,n);
        if(l>r)
            swap(l,r);
        L=l/blockSize;
        R=getId(r,blockSize);
        l1=getPos(L,blockSize,n);
        r1=getPos(R,blockSize,n);
        ans=preAns[L][R];
        len=0;
        for(i=l1;i<l;++i)
        {
            freq[arr[i]]++;
            if(freq[arr[i]]==1)
                uniqueArray[len++]=arr[i];
        }
        for(i=r1;i>r;--i)
        {
            freq[arr[i]]++;
            if(freq[arr[i]]==1)
                uniqueArray[len++]=arr[i];
        }
        for(i=0;i<len;++i)
        {
            z=uniqueArray[i];
            temp=preFreq[R][z]-preFreq[L][z]+(arr[l1]==z);
            ans=mul(ans,fact[temp],mod);
            temp-=freq[z];
            ans=mul(ans,ifact[temp],mod);
            freq[z]=0;
        }
        ans=mul(ans,fact[r-l+1],mod);
        cout<<ans<<"\n";
        last=ans;
    }
    //end
 
    return 0;
} 
    //begin
    int n,q,i,l,r,L,R,l1,r1,L1,L2,R1,R2,z;
    int blockSize,ans,temp,last,len;
    cin>>n>>q;
    pre(n);
    assert(n>=1 and n<=MAX);
    assert(q>=1 and q<=MAX);
    for(i=0;i<n;++i)
    {
        cin>>arr[i];
        assert(arr[i]>=1 and arr[i]<=inf);
        hashMap[arr[i]];
    }
    i=0;
    for(auto&z:hashMap)
        z.second=i++;
    for(i=0;i<n;++i)
        arr[i]=hashMap[arr[i]];
    blockSize=sqrt(n)+5;
    preComputeAns(n,blockSize);
    last=0;
    while(q-->00)
    {
        cin>>L1>>L2>>R1>>R2;
        assert(L1>=0 and L1<n and L2>=0 and L2<n and R1>=0 and R1<n and R2>=0 and R2<n);
        l=add(mul(L1,last,n),L2,n);
        r=add(mul(R1,last,n),R2,n);
        if(l>r)
            swap(l,r);
        L=l/blockSize;
        R=getId(r,blockSize);
        l1=getPos(L,blockSize,n);
        r1=getPos(R,blockSize,n);
        ans=preAns[L][R];
        len=0;
        for(i=l1;i<l;++i)
        {
            freq[arr[i]]++;
            if(freq[arr[i]]==1)
                uniqueArray[len++]=arr[i];
        }
        for(i=r1;i>r;--i)
        {
            freq[arr[i]]++;
            if(freq[arr[i]]==1)
                uniqueArray[len++]=arr[i];
        }
        for(i=0;i<len;++i)
        {
            z=uniqueArray[i];
            temp=preFreq[R][z]-preFreq[L][z]+(arr[l1]==z);
            ans=mul(ans,fact[temp],mod);
            temp-=freq[z];
            ans=mul(ans,ifact[temp],mod);
            freq[z]=0;
        }
        ans=mul(ans,fact[r-l+1],mod);
        cout<<ans<<"\n";
        last=ans;
    }
    //end
 
    return 0;
} 
Tester's Solution
#include <bits/stdc++.h>
#define endl '\n'
 
//#pragma GCC optimize ("O3")
//#pragma GCC target ("sse4")
 
#define SZ(x) ((int)x.size())
#define ALL(V) V.begin(), V.end()
#define L_B lower_bound
#define U_B upper_bound
#define pb push_back
 
using namespace std;
template<class T, class T2> inline int chkmax(T &x, const T2 &y) { return x < y ? x = y, 1 : 0; }
template<class T, class T2> inline int chkmin(T &x, const T2 &y) { return x > y ? x = y, 1 : 0; }
const int MAXN = (int)3e5 + 42;
const int B = 200;
const int mod = (int)1e9 + 7;
 
// Solution with SQRT decomposition. We will split the array into "N / B" buckets and then for every range of buckets we will compute the answer of that interval. This has complexity of O(N * Q).
// Now every query can be answered in O(N / Q) time. With Q ~= 300, this solution should be fast.
 
int fact[MAXN];
int ifact[MAXN];
int i_v[MAXN];
 
int pw(int x, int p)
{
    int ret = 1;
    while(p)
    {
        if(p & 1) ret = ret * 1ll * x % mod;
        x = x * 1ll * x % mod;
        p >>= 1;
    }
 
    return ret;
}
 
void prepare()
{
    fact[0] = 1;
    for(int i = 1; i < MAXN; i++)
        fact[i] = fact[i - 1] * 1ll * i % mod;
 
    ifact[MAXN - 1] = pw(fact[MAXN - 1], mod - 2);
    for(int i = MAXN - 2; i >= 0; i--)
        ifact[i] = ifact[i + 1] * 1ll * (i + 1) % mod;
 
    i_v[0] = 1;
    for(int i = 1; i < MAXN; i++)
        i_v[i] = ifact[i] * 1ll * fact[i - 1] % mod;
}
 
int n, q;
int a[MAXN];
vector<int> sorted;
 
void read()
{
    cin >> n >> q;
    for(int i = 0; i < n; i++)
    {
        cin >> a[i];
        sorted.pb(a[i]);
    }
}
 
int cnt_b = 0;
int cnt[MAXN / B + 2][MAXN];
vector<int> v[MAXN / B + 2];
int answer[MAXN / B][MAXN / B];
 
int tmp[MAXN];
 
int Cnt(int l, int r, int x) { return cnt[r][x] - cnt[l - 1][x]; }
 
int query(int l, int r)
{
    int i = l / B, j = r / B;
    if(j - i <= 1)
    {
        int ret = 1;
        for(int p = l; p <= r; p++)
            ret = ret * 1ll * i_v[++tmp[a[p]]] % mod;
        
        for(int p = l; p <= r; p++)
            tmp[a[p]] = 0;
    
        ret = ret * 1ll * fact[r - l + 1] % mod;
        return ret;
    }
 
    i++, j--;
    int ret = answer[i][j];
    for(int p = l; p < i * B; p++)
        ret = ret * 1ll * i_v[(++tmp[a[p]]) + Cnt(i, j, a[p])] % mod;
    for(int p = (j + 1) * B; p <= r; p++)
        ret = ret * 1ll * i_v[(++tmp[a[p]]) + Cnt(i, j, a[p])] % mod;
    
    for(int p = l; p < i * B; p++) --tmp[a[p]];
    for(int p = (j + 1) * B; p <= r; p++) --tmp[a[p]];
    
    ret = ret * 1ll * fact[r - l + 1] % mod;
    return ret;
}
 
void solve()
{
    prepare();
 
    sort(ALL(sorted));
    sorted.erase(unique(ALL(sorted)), sorted.end());
    for(int i = 0; i < n; i++) a[i] = L_B(ALL(sorted), a[i]) - sorted.begin();
 
    for(int i = 0; i < n; i += B)
    {
        int r = min(n, i + B);
        for(int j = i; j < r; j++)
            v[cnt_b].pb(a[j]);
        cnt_b++;
    }
    
    for(int i = 0; i < cnt_b; i++)
    {
        if(i != 0) 
            for(int j = 0; j < SZ(sorted); j++)
                cnt[i][j] = cnt[i - 1][j];
 
        for(int j: v[i])
            ++cnt[i][j];
    }
 
    for(int i = 0; i < cnt_b; i++)
    {
        memset(tmp, 0, sizeof(tmp));
        for(int j = i; j < cnt_b; j++)
        {
            if(i != j) answer[i][j] = answer[i][j - 1];
            else answer[i][j] = 1;
 
            for(int v: ::v[j])
                answer[i][j] = answer[i][j] * 1ll * i_v[++tmp[v]] % mod;
        }
    }
 
    memset(tmp, 0, sizeof(tmp));
    
    int last_ans = 0;
    while(q--)
    {
        int L1, L2, R1, R2, L, R;
        cin >> L1 >> L2 >> R1 >> R2;
        L = (last_ans * 1ll * L1 + L2) % n;
        R = (last_ans * 1ll * R1 + R2) % n;
        
        if(L > R) swap(L, R);
 
        last_ans = query(L, R);
        cout << last_ans << endl;
    }
}
 
int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
 
    read();
    solve();
    return 0;
}
Editorialist's Solution

Feel free to Share your approach, If it differs. Suggestions are always welcomed. :slight_smile:

11 Likes

It is still unclear to me how you were able to combine the pre-computed answers of the blocks. Can anyone please explain in more detail?

3 Likes

Use scanf and printf.(This may help!)

I want to know which is really more fast … printf() or fast…io… Maybe @l_returns can help :slight_smile:

Fast io is fast ( please use ‘\n’ instead of endl )

1 Like

why there are two main functions in setter’s soln

2 Likes

Tried. That didn’t help. Thanks anyway!
Btw disabling synchronization with stdio and untying input output streams (cin,cout) delivers same (if not better) performance than using scanf , printf (mostly).

Can anyone please help me with this code:: CodeChef: Practical coding for everyone .
I have done exactly what was suggested in tutorial still i am unable to get 100. Please can anyone check the code.

1 Like

Your query same as @bufftowel
I helped him in solving that.

Check this thread.
Basically int tmp[k]={0} is causing TLE as it will be O(k) or more costly.

Thanks a lot. I have been stuck with this question since yesterday. You are a lifesaver​:heart::heart:.

3 Likes

Thanks for your kind words :slight_smile:

1 Like

I’m getting a TLE in this question. I have made all the necessary optimisations. Can anyone help me with it? Here is my code: CodeChef: Practical coding for everyone

Can someone suggest optimizations in this code? Link

This problem can also be solved with segment tree of HashMap .
while building the segment tree store only those values in HashMap whose
frequency is greater than 1 in the whole array.
If an element has frequency 1 then it would not afffect any query.
My AC solution
https://www.codechef.com/viewsolution/24758232

1 Like

Isn’t the time limit 5 secs?
Subtask 13 of your solution takes 5.06…

Although I too solved this using the same technique, this only worked because testcases were weak.
Legitimate approach is by using sqrt decomposition only.

Same thoughts. I had the same idea. But did not submit thinking that testcases will be good :slight_smile:
Got fooled:p

2 Likes

Can anyone help me with this TLE :frowning:https://www.codechef.com/viewsolution/24888703

Java has multiplier - 2x.

I dont get it how you work arround the division.
Somewhere you must devide the denominator, dont you?
But that is not possible since we deal with huge numbers, and MOD.
Can somebody explain?