# GOLOMB - Editorial

Contest

Setter: Shyam Bajaj
Tester: Yash Chandnani
Editorialist: Rajarshi Basu

Easy-Medium

# PREREQUISITES:

Math, Greedy, Binary search

# PROBLEM:

The Golomb sequence G_1, G_2, \ldots is a non-decreasing integer sequence such that for each positive integer n, G_n is the number of occurrences of n in this sequence. The first few elements of G are [1, 2, 2, 3, 3, 4, 4, 4, 5, \ldots]. Do you know the recurrence relation for the Golomb sequence? It is G_1 = 1 and G_{n+1} = 1+G_{n+1-G_{G_n}} for each n \ge 1.

Find the sum of squares of the L-th through R-th term of the Golomb sequence, i.e. S = \sum_{i=L}^R G_i^2. Since the sum can be quite large, compute it modulo 10^9+7.

L \leq R \leq 10^{10}

# EXPLANATION:

Hint 1:

Instead of thinking about the indices, think about the numbers of the array and try to count them instead somehow? That is, maybe count how many time 1 appears, how many times 2 appears and so on, instead of trying to think about the value at the i^{th} position too much.

Hint 2:

There are lots of repeating values right? It seems intuitive that G_{10^{10}} would not be too high right?

Turns out, G_{10^{10}} \approx 2*10^6

Hint 3:

Instead of counting in a range, define solve(X) = \sum\limits_{i=1}^R G_i^2, and then our answer is solve(R) - solve(L-1). Let’s try to implement solve(x).

Hint 4:

Try to count occurances, and then sum of each different value of G_i that we need to take.

Full Solution
• First, notice that there will be certain values till K such that we will be taking all occurrences of values from 1,2,… K, and then we will be taking some of the occurrences of K+1.
• Figure out the number of occurrences of each value till K. Then, take the sum of them. Let number of occurrences of X be Y. The SUM[X] = Y*X*X.
• Now, we want to find SUM[1] + SUM[2] + … SUM[K]. So prefix sums makes sense.
• In all these, note that we have to find the appropriate K such that the total number of occurrences of numbers till K is less than X when we are trying to calculate solve(X).
• Once we find this appropriate K, we can take the sum of the remaining occurrences of K+1 as well, quite easily.
• How to find K? Think! HINT: Think about taking prefix sums of the occurrences of each number and using binary search on that.

For more details, see Setter’s codes attached.

# SOLUTIONS:

Setter's Code
/*
****************************
Author : Shyam Bajaj
****************************
*/

#include<iostream>
#include<algorithm>
#define SIZE 2000000
#define M 1000000007
#define IOS ios_base::sync_with_stdio(false); cin.tie(NULL); cout.tie(NULL);
using namespace std;
typedef long long ll;

ll g[SIZE], prefixSum[SIZE], prefixSum2[SIZE];

ll solve(ll n)
{
ll nthTerm = lower_bound(prefixSum, prefixSum+SIZE, n) - prefixSum;
//cout<<lastTerm<<" ";

ll ans = 0;
if(nthTerm > 0)
ans = prefixSum2[nthTerm-1];

ans = (ans + nthTerm*nthTerm%M * (n-prefixSum[nthTerm-1])%M)%M;
return ans;
}

int main()
{
//IOS;

ll q, l, r;
g[1]=1, g[2]=2, g[3]=2;
for(ll i=3, currPos=4; i<SIZE; i++)
{
for(ll j=0; j<g[i] && currPos<SIZE; j++)
g[currPos++] = i;
}

prefixSum[0] = 0;
for(ll i=1; i<SIZE; i++)
prefixSum[i] = prefixSum[i-1] + g[i];
//cout<<prefixSum[SIZE-1]<<" "; //....greater than 10^10

prefixSum2[0] = 0;
for(ll i=1; i<SIZE; i++)
prefixSum2[i] = (prefixSum2[i-1] + i*i%M*g[i]%M)%M;
cin>>q;

while(q--)
{
cin>>l>>r;
cout<< (solve(r) - solve(l-1) + M) % M << endl;
}

return 0;
}

Tester’s Code
#include <bits/stdc++.h>
using namespace std;

void __print(int x) {cerr << x;}
void __print(long x) {cerr << x;}
void __print(long long x) {cerr << x;}
void __print(unsigned x) {cerr << x;}
void __print(unsigned long x) {cerr << x;}
void __print(unsigned long long x) {cerr << x;}
void __print(float x) {cerr << x;}
void __print(double x) {cerr << x;}
void __print(long double x) {cerr << x;}
void __print(char x) {cerr << '\'' << x << '\'';}
void __print(const char *x) {cerr << '\"' << x << '\"';}
void __print(const string &x) {cerr << '\"' << x << '\"';}
void __print(bool x) {cerr << (x ? "true" : "false");}

template<typename T, typename V>
void __print(const pair<T, V> &x) {cerr << '{'; __print(x.first); cerr << ','; __print(x.second); cerr << '}';}
template<typename T>
void __print(const T &x) {int f = 0; cerr << '{'; for (auto &i: x) cerr << (f++ ? "," : ""), __print(i); cerr << "}";}
void _print() {cerr << "]\n";}
template <typename T, typename... V>
void _print(T t, V... v) {__print(t); if (sizeof...(v)) cerr << ", "; _print(v...);}
#ifndef ONLINE_JUDGE
#define debug(x...) cerr << "[" << #x << "] = ["; _print(x)
#else
#define debug(x...)
#endif

#define rep(i, n)    for(int i = 0; i < (n); ++i)
#define repA(i, a, n)  for(int i = a; i <= (n); ++i)
#define repD(i, a, n)  for(int i = a; i >= (n); --i)
#define trav(a, x) for(auto& a : x)
#define all(x) x.begin(), x.end()
#define sz(x) (int)(x).size()
#define fill(a)  memset(a, 0, sizeof (a))
#define fst first
#define snd second
#define mp make_pair
#define pb push_back
typedef long double ld;
typedef long long ll;
typedef pair<int, int> pii;
typedef vector<int> vi;
int g[2000009];
ll s[2000009];
ll ss[2000009];
const ll mod = 1e9+7;
void pre(){
ll sum = 1;
g[1] = 1;
s[1]=1;
ss[1]=1;
repA(i,2,2000000){
g[i] = 1+g[i-g[g[i-1]]];
sum+=g[i];
s[i] =sum;
ss[i]=(ss[i-1]+1ll*g[i]*i%mod*i%mod)%mod;
}
}

ll solve(ll r){
int lo = 0,hi=2000000;
while(lo<hi){
int m = (lo+hi)/2+1;
if(s[m]>r){
hi = m-1;
}
else lo = m;
}
ll ans = ss[lo];
r-=s[lo];
ans+=r*(lo+1)%mod*(lo+1)%mod;
return ans%mod;
}

int main() {
cin.sync_with_stdio(0); cin.tie(0);
cin.exceptions(cin.failbit);
pre();
int n;cin>>n;
rep(i,n){
ll l,r;cin>>l>>r;
ll ans = solve(r)-solve(l-1)+mod;
cout<<ans%mod<<'\n';
}
return 0;
}


2 Likes

@rajarshi_basu shouldn’t this be SUM[X]=Y∗X∗X bcoz we are asked the sum of squares of the number…?

oops, right. sorry

Can you give an example to understand better?

Suppose we want to find out square sum upto i = 7 in this 1 2 2 3 3 4 4 4 5 5, then here our k will be 3, since we will be taking all the occurrences of the numbers upto 3 and then some occurrence of k + 1 i.e 4.

1 Like

how are we managing count of values ??? from 1 to K?

1 Like

@knightbaron the frequencies indeed form a Golomb sequence that’s the observation.
Freq(1)=1,Freq(2)=2,Freq(3)=2,Freq(4)=3…

1 Like

What is the last prefixSum in this line??

It is similar to

ll nthTerm = lower_bound(prefixSum.begin(), prefixSum.end(), n) - prefixSum.begin();


It returns the index.

1 Like

Ok and Thanks @ameybhavsar nthterm is ll type i.e long long are iterator also allowed to be of type long long

#include
#include
#include
#include
#include
using namespace std;
int main(){
long long t{},maxy{1000000007},maxs{-1},sum{1};
vector result{};
vector sums{1};
vector position{};
cin>>t;
for(long long i{};i<t;i++){
long long l{},r{};
cin>>l>>r;
position.push_back(l);
position.push_back®;}
for(long long i{1};i<position.size();i=i+2){
if(position[i]>maxs)
maxs = position[i];}

    vector <long long> g{0,1};
for(long long j{2};j<=maxs;j++){
long long h{};
h = 1+g[j-g[g[j-1]]];
g.push_back(h);
sum = (sum + g[j]*g[j])%maxy;
sums.push_back(sum);}

for(long long i{0};i<position.size();i=i+2){
if(position[i]!=1)
sum = sums[position[i+1]-1]- sums[position[i]-2];
else
sum = sums[position[i+1]-1];
result.push_back(sum);
}


for(long long c:result)
cout<<c<<endl;

}

is there a pattern in frequencties of numbers ?? If yes then it will be a great help if you mention it clearly

1 occur 1 times
2 , 3 occur 2 times
4, 5 occur 3 times
6,7,8 occur 4 times
9 ,10 ,11 occur 5 times

1 Like

i cannot understand this statement “since the quantity is large compute it with modulo 10e9+7” what does this mean?

The modulo refers to Modular Arithmetic
That means that whenever the value becomes 10^9+7 or larger you keep removing multiples of 10^9+7 until the value is again in the range 0\ldots10^9+6

It is because the final answer can exceed the maximum limit of long long. The 1010 th number in the sequence is greater than 106. So, the maximum answer can be around 1010 * (106) 2 which is above long long’s maximum value. So, you are required to compute the answer mod 109 + 7.

Does anyone got this problem ??? means what is the use of sum[x]=yxx for example g[4]=3;
if query is L=1,R=4 and ans-> (g[1]^2 +g[2]^2 +g[3]^2 +g[4]^2) ??

1 Like

For the idea of this solution see my unofficial editorial and especially a comment I made lower down

1 Like

#include
#include
#include
#include
#include
using namespace std;
int main(){
long long t{},maxy{1000000007},maxs{-1},sum{1};
vector result{};
vector sums{1};
vector position{};
cin>>t;
for(long long i{};i<t;i++){
long long l{},r{};
cin>>l>>r;
position.push_back(l);
position.push_back®;}
for(long long i{1};i<position.size();i=i+2){
if(position[i]>maxs)
maxs = position[i];}

    vector <long long> g{0,1};
for(long long j{2};j<=maxs;j++){
long long h{};
h = 1+g[j-g[g[j-1]]];
g.push_back(h);
sum = (sum + g[j]*g[j])%maxy;
sums.push_back(sum);}

for(long long i{0};i<position.size();i=i+2){
if(position[i]!=1)
sum = sums[position[i+1]-1]- sums[position[i]-2];
else
sum = sums[position[i+1]-1];
result.push_back(sum);}


for(long long c:result)
cout<<c<<endl;

}