GOLOMB - Editorial

rajarshi_basu · July 25, 2020, 5:26pm

PROBLEM LINK:

Contest

Setter: Shyam Bajaj
Tester: Yash Chandnani
Editorialist: Rajarshi Basu

DIFFICULTY:

Easy-Medium

PREREQUISITES:

Math, Greedy, Binary search

PROBLEM:

The Golomb sequence G_1, G_2, \ldots is a non-decreasing integer sequence such that for each positive integer n, G_n is the number of occurrences of n in this sequence. The first few elements of G are [1, 2, 2, 3, 3, 4, 4, 4, 5, \ldots]. Do you know the recurrence relation for the Golomb sequence? It is G_1 = 1 and G_{n+1} = 1+G_{n+1-G_{G_n}} for each n \ge 1.

Find the sum of squares of the L-th through R-th term of the Golomb sequence, i.e. S = \sum_{i=L}^R G_i^2. Since the sum can be quite large, compute it modulo 10^9+7.

L \leq R \leq 10^{10}

EXPLANATION:

Hint 1:

Instead of thinking about the indices, think about the numbers of the array and try to count them instead somehow? That is, maybe count how many time 1 appears, how many times 2 appears and so on, instead of trying to think about the value at the i^{th} position too much.

Hint 2:

There are lots of repeating values right? It seems intuitive that G_{10^{10}} would not be too high right?

Answer:

Turns out, G_{10^{10}} \approx 2*10^6

Hint 3:

Instead of counting in a range, define solve(X) = \sum\limits_{i=1}^R G_i^2, and then our answer is solve(R) - solve(L-1). Let’s try to implement solve(x).

Hint 4:

Try to count occurances, and then sum of each different value of G_i that we need to take.

Full Solution

First, notice that there will be certain values till K such that we will be taking all occurrences of values from 1,2,… K, and then we will be taking some of the occurrences of K+1.

Figure out the number of occurrences of each value till K. Then, take the sum of them. Let number of occurrences of X be Y. The SUM[X] = Y*X*X.

Now, we want to find SUM[1] + SUM[2] + … SUM[K]. So prefix sums makes sense.

In all these, note that we have to find the appropriate K such that the total number of occurrences of numbers till K is less than X when we are trying to calculate solve(X).

Once we find this appropriate K, we can take the sum of the remaining occurrences of K+1 as well, quite easily.

How to find K? Think! HINT: Think about taking prefix sums of the occurrences of each number and using binary search on that.

For more details, see Setter’s codes attached.

SOLUTIONS:

Setter's Code

/*
****************************
    Author : Shyam Bajaj
****************************
*/
 
#include<iostream>
#include<algorithm>
#define SIZE 2000000
#define M 1000000007
#define IOS ios_base::sync_with_stdio(false); cin.tie(NULL); cout.tie(NULL);
using namespace std;
typedef long long ll;
 
ll g[SIZE], prefixSum[SIZE], prefixSum2[SIZE];
 
ll solve(ll n)
{
	ll nthTerm = lower_bound(prefixSum, prefixSum+SIZE, n) - prefixSum;
	//cout<<lastTerm<<" ";
	
	ll ans = 0;
	if(nthTerm > 0)
		ans = prefixSum2[nthTerm-1];
	
	ans = (ans + nthTerm*nthTerm%M * (n-prefixSum[nthTerm-1])%M)%M;
	return ans;
}
 
int main()
{
	//IOS;
	
	ll q, l, r;
	g[1]=1, g[2]=2, g[3]=2;
	for(ll i=3, currPos=4; i<SIZE; i++)
	{
		for(ll j=0; j<g[i] && currPos<SIZE; j++)
			g[currPos++] = i;
	}
	
	prefixSum[0] = 0;
	for(ll i=1; i<SIZE; i++)
		prefixSum[i] = prefixSum[i-1] + g[i];
	//cout<<prefixSum[SIZE-1]<<" "; //....greater than 10^10
	
	prefixSum2[0] = 0;
	for(ll i=1; i<SIZE; i++)
		prefixSum2[i] = (prefixSum2[i-1] + i*i%M*g[i]%M)%M;
	cin>>q;
	
	while(q--)
	{
		cin>>l>>r;
		cout<< (solve(r) - solve(l-1) + M) % M << endl;
	}
	
	return 0;
}

Tester’s Code

#include <bits/stdc++.h>
using namespace std;
 
 
void __print(int x) {cerr << x;}
void __print(long x) {cerr << x;}
void __print(long long x) {cerr << x;}
void __print(unsigned x) {cerr << x;}
void __print(unsigned long x) {cerr << x;}
void __print(unsigned long long x) {cerr << x;}
void __print(float x) {cerr << x;}
void __print(double x) {cerr << x;}
void __print(long double x) {cerr << x;}
void __print(char x) {cerr << '\'' << x << '\'';}
void __print(const char *x) {cerr << '\"' << x << '\"';}
void __print(const string &x) {cerr << '\"' << x << '\"';}
void __print(bool x) {cerr << (x ? "true" : "false");}
 
template<typename T, typename V>
void __print(const pair<T, V> &x) {cerr << '{'; __print(x.first); cerr << ','; __print(x.second); cerr << '}';}
template<typename T>
void __print(const T &x) {int f = 0; cerr << '{'; for (auto &i: x) cerr << (f++ ? "," : ""), __print(i); cerr << "}";}
void _print() {cerr << "]\n";}
template <typename T, typename... V>
void _print(T t, V... v) {__print(t); if (sizeof...(v)) cerr << ", "; _print(v...);}
#ifndef ONLINE_JUDGE
#define debug(x...) cerr << "[" << #x << "] = ["; _print(x)
#else
#define debug(x...)
#endif
 
#define rep(i, n)    for(int i = 0; i < (n); ++i)
#define repA(i, a, n)  for(int i = a; i <= (n); ++i)
#define repD(i, a, n)  for(int i = a; i >= (n); --i)
#define trav(a, x) for(auto& a : x)
#define all(x) x.begin(), x.end()
#define sz(x) (int)(x).size()
#define fill(a)  memset(a, 0, sizeof (a))
#define fst first
#define snd second
#define mp make_pair
#define pb push_back
typedef long double ld;
typedef long long ll;
typedef pair<int, int> pii;
typedef vector<int> vi;
int g[2000009];
ll s[2000009];
ll ss[2000009];
const ll mod = 1e9+7;
void pre(){
	ll sum = 1;
	g[1] = 1;
	s[1]=1;
	ss[1]=1;
	repA(i,2,2000000){
		g[i] = 1+g[i-g[g[i-1]]];
		sum+=g[i];
		s[i] =sum;
		ss[i]=(ss[i-1]+1ll*g[i]*i%mod*i%mod)%mod;
	}
}
 
ll solve(ll r){
	int lo = 0,hi=2000000;
	while(lo<hi){
		int m = (lo+hi)/2+1;
		if(s[m]>r){
			hi = m-1;
		}
		else lo = m;
	}
	ll ans = ss[lo];
	r-=s[lo];
	ans+=r*(lo+1)%mod*(lo+1)%mod;
	return ans%mod;
}
 
int main() {
	cin.sync_with_stdio(0); cin.tie(0);
	cin.exceptions(cin.failbit);
	pre();
	int n;cin>>n;
	rep(i,n){
		ll l,r;cin>>l>>r;
		ll ans = solve(r)-solve(l-1)+mod;
		cout<<ans%mod<<'\n';
	}
	return 0;
}

pranay_garg · July 26, 2020, 8:50am

@rajarshi_basu shouldn’t this be SUM[X]=Y∗X∗X bcoz we are asked the sum of squares of the number…?

rajarshi_basu · July 26, 2020, 9:06am

oops, right. sorry

ameybhavsar · July 26, 2020, 9:08am

Can you give an example to understand better?

anon70677113 · July 26, 2020, 9:30am

Suppose we want to find out square sum upto i = 7 in this 1 2 2 3 3 4 4 4 5 5, then here our k will be 3, since we will be taking all the occurrences of the numbers upto 3 and then some occurrence of k + 1 i.e 4.

knightbaron · July 26, 2020, 9:45am

how are we managing count of values ??? from 1 to K?

spandan_35 · July 26, 2020, 10:06am

@knightbaron the frequencies indeed form a Golomb sequence that’s the observation.
Freq(1)=1,Freq(2)=2,Freq(3)=2,Freq(4)=3…

dhruv788 · July 26, 2020, 10:07am

What is the last prefixSum in this line??

ameybhavsar · July 26, 2020, 10:09am

It is similar to

ll nthTerm = lower_bound(prefixSum.begin(), prefixSum.end(), n) - prefixSum.begin();

It returns the index.

dhruv788 · July 26, 2020, 10:10am

Ok and Thanks @ameybhavsar nthterm is ll type i.e long long are iterator also allowed to be of type long long

g_himanshu0100 · July 26, 2020, 10:32am

please help,shows time limit exceeded:-

#include
#include
#include
#include
#include
using namespace std;
int main(){
long long t{},maxy{1000000007},maxs{-1},sum{1};
vector result{};
vector sums{1};
vector position{};
cin>>t;
for(long long i{};i<t;i++){
long long l{},r{};
cin>>l>>r;
position.push_back(l);
position.push_back®;}
for(long long i{1};i<position.size();i=i+2){
if(position[i]>maxs)
maxs = position[i];}

    vector <long long> g{0,1};
    for(long long j{2};j<=maxs;j++){
        long long h{};
        h = 1+g[j-g[g[j-1]]];
        g.push_back(h);
        sum = (sum + g[j]*g[j])%maxy;
        sums.push_back(sum);}
        
    for(long long i{0};i<position.size();i=i+2){
        if(position[i]!=1)
            sum = sums[position[i+1]-1]- sums[position[i]-2];
        else
            sum = sums[position[i+1]-1];
        result.push_back(sum);
    }

for(long long c:result)
cout<<c<<endl;

}

knightbaron · July 26, 2020, 10:46am

is there a pattern in frequencties of numbers ?? If yes then it will be a great help if you mention it clearly

imakshay_31 · July 26, 2020, 10:55am

1 occur 1 times
2 , 3 occur 2 times
4, 5 occur 3 times
6,7,8 occur 4 times
9 ,10 ,11 occur 5 times

arpit92_8 · July 26, 2020, 1:40pm

i cannot understand this statement “since the quantity is large compute it with modulo 10e9+7” what does this mean?

spaanse · July 26, 2020, 2:12pm

The modulo refers to Modular Arithmetic
That means that whenever the value becomes 10^9+7 or larger you keep removing multiples of 10^9+7 until the value is again in the range 0\ldots10^9+6

jay_1048576 · July 26, 2020, 2:16pm

It is because the final answer can exceed the maximum limit of long long. The 10¹⁰ th number in the sequence is greater than 10⁶. So, the maximum answer can be around 10¹⁰ * (10⁶) ² which is above long long’s maximum value. So, you are required to compute the answer mod 10⁹ + 7.

swapno7064 · July 26, 2020, 2:21pm

Does anyone got this problem ??? means what is the use of sum[x]=yxx for example g[4]=3;
if query is L=1,R=4 and ans-> (g[1]^2 +g[2]^2 +g[3]^2 +g[4]^2) ??
please explain with example??..thanks in advance…

ads191712 · July 26, 2020, 3:34pm

I can’t understand the main idea here. Please help me. Thanx in advance

spaanse · July 26, 2020, 4:13pm

For the idea of this solution see my unofficial editorial and especially a comment I made lower down

g_himanshu0100 · July 26, 2020, 7:30pm

please help shows time limit exceeded:-

#include
#include
#include
#include
#include
using namespace std;
int main(){
long long t{},maxy{1000000007},maxs{-1},sum{1};
vector result{};
vector sums{1};
vector position{};
cin>>t;
for(long long i{};i<t;i++){
long long l{},r{};
cin>>l>>r;
position.push_back(l);
position.push_back®;}
for(long long i{1};i<position.size();i=i+2){
if(position[i]>maxs)
maxs = position[i];}

    vector <long long> g{0,1};
    for(long long j{2};j<=maxs;j++){
        long long h{};
        h = 1+g[j-g[g[j-1]]];
        g.push_back(h);
        sum = (sum + g[j]*g[j])%maxy;
        sums.push_back(sum);}
        
    for(long long i{0};i<position.size();i=i+2){
        if(position[i]!=1)
            sum = sums[position[i+1]-1]- sums[position[i]-2];
        else
            sum = sums[position[i+1]-1];
        result.push_back(sum);}

for(long long c:result)
cout<<c<<endl;

}