DRAG - Editorial


Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4

Author: Dung
Tester: Harris Leung
Editorialist: Nishank Suresh




Segmented Sieve of Eratosthenes, Mo’s algorithm


This task has a lot of parts which initially seem completely disjoint. The first step towards solving it is to look at each one individually, and finally build a working solution.

Maximizing beauty and computing W(B)

First, let’s look at how to maximize the beauty of a single array B = [B_1, B_2, \ldots, B_N].

We need to choose N points, such that (x_i, y_i) lies on the line joining (B_i, 0) and (0, B_i). In other words, x_i + y_i = B_i.

Note that picking points for each B_i is independent, so we can just aim to maximize the beauty of each one separately: that will also maximize their sum.

If we then know the number of ways of maximizing the beauty of each element, W(B) is the product of all these ways.

This gives us a couple of subproblems:

  • How to choose (x_i, y_i) such that x_i + y_i = B_i and the number of integer points on the segment joining (0, 0) and (x_i, y_i) is maximum?
  • How many choices of (x_i, y_i) result in this maximum?

The number of integer points on the segment joining (0, 0) and (x_i, y_i) is simply \gcd(x_i, y_i) + 1. This is a well-known result, and you can find a proof here for example.

So, we would like to maximize \gcd(x_i, y_i) subject to x_i + y_i = B_i. Note that

\gcd(x_i, y_i) = \gcd(x_i+y_i, y_i) = \gcd(B_i, y_i)

So, the answer is a factor of B_i. We’d like this to be as large as possible, so ideally it’d just be B_i. However, that would necessitate either x_i = 0 or y_i = 0, which is not allowed.

Thus, we look at the next highest factor. This is B_i / p, where p is the smallest prime factor of B_i. Indeed, by choosing x_i = B_i / p and y_i = B_i - B_i/p, we obtain a gcd of B/p_i, as required.

How many choices of (x_i, y_i) give us this maximum?
As it turns out, exactly p - 1 choices exist: they correspond to the p-1 points

\left\{\left (\frac{kB_i}{p}, B_i - \frac{kB_i}{p}\right ) : 1 \leq k \leq p-1\right \}


W(B) = \prod_{i=1}^N (p_i - 1)

where p_i is the smallest prime factor of B_i.

Looking at all subsequences

Given an array C, the statement defines S to be the sum of W(B) across all non-empty subsequences B of C. We can use the above formula to compute the answer for a single subsequence. How do we extend this to every subsequence?


For now, let’s also allow the empty subsequence.
Let’s look at the i-th index. There are two choices: it is either chosen in B, or it is not.

  • If it is chosen, it contributes (p_i - 1) to the product W(B)
  • If it is not chosen, it contributes nothing (i.e, 1) to the product W(B)

The two scenarios are independent, and so each index contributes (p_i - 1 + 1) to the sum of W(B) across all B.
Hence, we get

S = \sum_{B \subseteq C} W(B) = \prod_{i=1}^N p_i

This includes the empty subsequence, which contributes a value of 1 in our setup. We don’t want this, so let’s subtract 1 to get the value we actually want, i.e,

S+1 = \prod_{i=1}^N p_i

Answering a single query

We now know how to compute S = \sum W(B) for a given sequence.

Each query, however, asks us to compute the number of distinct values of S across all subsequences of the given range.
Note that this is the same as the number of distinct values of S+1, and that has a nice form (being a product of primes) so let’s focus on that instead.

If we fix a subsequence, S+1 is the product of p_i in the chosen subsequence. In particular, we obtain the prime factorization of S+1.
The number of distinct values of S+1 is thus nothing but the number of distinct prime factorizations we can obtain. So, this is what we will count.

Note that the prime factorization is defined by the number of times each prime is present in it.
So, consider a single prime p. If p occurs k times in this range, there are k+1 choices for p in the prime factorization: it can occur 0, 1, 2, \ldots, k times.

So, if we obtain the frequency (say f_p) of each prime p in the range [L, R], the answer to this query is obtained by simply multiplying the values of (f_p + 1) across all p.

Note that, once again, we need to subtract 1 from the product we obtain, to account for the fact that it is not allowed to choose the empty subsequence.

This gives us a way to answer a single query in \mathcal{O}(N). However, it isn’t quite fast enough yet.

Solving multiple queries

To solve a single query faster than \mathcal{O}(N), we essentially need the following information quickly:

  • Given a range [L, R], find the frequency of each prime in this range.

Querying for a single range from scratch (for example, with a segment tree) is quite hard (and most likely impossible), since all the primes in the range might be distinct which blows up the complexity.

Instead, notice that if you have the frequencies for primes in range [L, R], then obtaining the frequencies for the range [L, R+1] is quite easy: only the frequency of p_{R+1} changes, and its contribution to the product can be updated with just a couple of multiplications.
Similarly, extending one step to the left is easy, as is shrinking one step to the left/right.

This allows us to solve the problem using Mo’s algorithm in \mathcal{O}((N+Q)\sqrt{N}), by simply maintaining the frequencies of all primes, and the current product of (frequency+1).

Computing p_i

Finally, note that all the previous discussion assumed that we already knew the values of p_i, but we still do need to compute these values quickly.

If the A_i values were small (for example, \leq 10^7), finding the smallest prime factor of each one is a classical problem, and can be done in something like 10^7 \log\log 10^7 using a sieve of Eratosthenes.

However, here the A_i values are up to 10^{12}, so this won’t work.

Instead, note that the constraints guarantee that the range of elements is still small, \leq 10^7.

So, we can instead use a segmented sieve to achieve the same result in around the same complexity.

This allows us to compute all the p_i values, following which Mo’s algorithm can be applied as discussed before.


\mathcal{O}(M\log\log M) precomputation followed by \mathcal{O}((N+Q)\sqrt{N}) per test case, where M = 10^7.


Setter's code (C++)
#include <bits/stdc++.h>
using namespace std;

#define int long long 
#define fo(i, a, b) for(int i = a; i <= b; i++)
#define _fo(i, a, b) for(int i = a; i >= b; i--)
#define foa(i, a) for (auto &i : a)
#define sz(a) ((int) a.size())
#define all(a) begin(a), end(a)
#define fi first
#define se second
#define pb(x) push_back(x)
#define mk(x, y) make_pair(x, y)
typedef long long ll;
typedef pair<ll, ll> pll;
typedef vector<ll> vl;
typedef pair<int, int> pii;
typedef vector<int> vi;
const int LOG = 22; 
const int MAX = 1e5+5;
const int MOD = 1e9+7;
const int SQRT = 316;
const ll INF = 1e12;
const ll lon = 1e18;

ll n, q;
ll a[100005];

vector<ll> smp;
ll sieve[10000005];
map<ll, ll> compress;

ll ways = 1;
ll cnt[100005], inv[100005];
vector<pair<pll, ll>> offline[505];
ll ans[100005];

ll add(ll a, ll b) { return (a+b) % MOD; }
ll mul(ll a, ll b) { return (a*b) % MOD; }
ll sub(ll a, ll b) { return (a+MOD-b) % MOD; }

ll pw(ll val, ll temp) {
	ll curr = 1;
	while(temp > 0) {
		if(temp & 1) curr = mul(curr, val);
		val = mul(val, val);
		temp >>= 1;
	return curr;

void precompute() {
	ll low = a[1];
	fo(i, 1, n) low = min(low, a[i]);
	ll lim = 1e6;
	for(ll p = 2; p <= lim; p++) {
		if(sieve[p] == 0) {
			for(ll i = p*p; i <= lim; i += p) sieve[i] = 1;
	fill(sieve+1, sieve+lim+1, 0);
	lim = 1e7;
	foa(p, smp) {
		ll temp = p*(low/p+1);
		for(ll i = temp; i <= low+lim; i += p) {
			if(sieve[i-low] == 0) sieve[i-low] = p;	
	fo(i, 1, n) {
		ll temp = sieve[a[i]-low];
		if(temp == 0) temp = a[i];
		a[i] = temp;
		compress[temp] = 0;
	ll cnt = 0;
	foa(elem, compress) {
		elem.se = cnt;
	fo(i, 1, n) a[i] = compress[a[i]];	
	fo(i, 1, 100001) inv[i] = pw(i, MOD-2);

void update(ll p, ll delta) {
	ways = mul(ways, inv[cnt[p]+1]);
	cnt[p] += delta;
	ways = mul(ways, cnt[p]+1);

ll get(ll l, ll r) {
	ll res = 1;
	fo(i, l, r) update(a[i], 1);
	res = sub(ways, 1);
	fo(i, l, r) update(a[i], -1);
	return res;

void solve() {
	fo(i, 1, 500) sort(all(offline[i]));
	for(ll i = 1; i <= n; i += SQRT) {
		ll block = (i-1) / SQRT, ptr = i+1;
		while(!offline[block].empty()) {
			if(offline[block].back().fi.fi != ptr) {
				update(a[ptr], 1);
			ll l = offline[block].back().fi.fi, r = offline[block].back().fi.se, id = offline[block].back().se;
			fo(j, i+1, r) update(a[j], 1);
			ans[id] = sub(ways, 1);
			fo(j, i+1, r) update(a[j], -1);
		fo(j, ptr, i) update(a[j], -1);

signed main() {
    ios_base::sync_with_stdio(0); cin.tie(0); cout.tie(0);
    cin >> n >> q;
    fo(i, 1, n) cin >> a[i];
    fo(i, 1, q) {
    	ll l, r;
    	cin >> l >> r;
    	if(r-l+1 > SQRT) {
    		ll block = (r-1) / SQRT;
    		offline[block].push_back(mk(mk(l, r), i));
    	else ans[i] = get(l, r);
    fo(i, 1, q) cout << ans[i] << "\n";
Tester's code (C++)
using namespace std;
typedef long long ll;
#define fi first
#define se second
const ll mod=1e9+7;
// -------------------- Input Checker Start --------------------
long long readInt(long long l, long long r, char endd)
    long long x = 0;
    int cnt = 0, fi = -1;
    bool is_neg = false;
        char g = getchar();
        if(g == '-')
            assert(fi == -1);
            is_neg = true;
        if('0' <= g && g <= '9')
            x *= 10;
            x += g - '0';
            if(cnt == 0)
                fi = g - '0';
            assert(fi != 0 || cnt == 1);
            assert(fi != 0 || is_neg == false);
            assert(!(cnt > 19 || (cnt == 19 && fi > 1)));
        else if(g == endd)
                x = -x;
            if(!(l <= x && x <= r))
                cerr << l << ' ' << r << ' ' << x << '\n';
            return x;
string readString(int l, int r, char endd)
    string ret = "";
    int cnt = 0;
        char g = getchar();
        assert(g != -1);
        if(g == endd)
        ret += g;
    assert(l <= cnt && cnt <= r);
    return ret;
long long readIntSp(long long l, long long r) { return readInt(l, r, ' '); }
long long readIntLn(long long l, long long r) { return readInt(l, r, '\n'); }
string readStringLn(int l, int r) { return readString(l, r, '\n'); }
string readStringSp(int l, int r) { return readString(l, r, ' '); }
void readEOF() { assert(getchar() == EOF); }
vector<int> readVectorInt(int n, long long l, long long r)
    vector<int> a(n);
    for(int i = 0; i < n - 1; i++)
        a[i] = readIntSp(l, r);
    a[n - 1] = readIntLn(l, r);
    return a;
// -------------------- Input Checker End --------------------
const int N=1e5+5;
int n,q;
ll sp[10000001];
bool die[1000001];
ll mn,mx;
ll a[N];
ll inv[N];
int um[N];
ll pw(ll x,ll y){
	if(y==0) return 1;
	if(y%2) return x*pw(x,y-1)%mod;
	ll res=pw(x,y/2);
	return res*res%mod;
ll ans[N];
const int bs=300;
bool cmp(pair<pair<int,int>,int> x,pair<pair<int,int>,int> y){
	if(x.fi.fi/bs!=y.fi.fi/bs) return x.fi.fi/bs<y.fi.fi/bs;
	else return x.fi.se<y.fi.se;
ll res=1;
void add(int id){
void del(int id){
int main(){
	n=readInt(1,1e5,' ');q=readInt(1,1e5,'\n');
	for(int i=1; i<=n ;i++){
		a[i]=readInt(1,1e12,(i==n)?'\n':' ');
	assert(mx-mn>=1 && mx-mn<1e7);
	for(ll i=2; i*i<=mx ;i++){
		if(die[i]) continue;
		for(ll j=min((ll)1e9,i*i); j*j<=mx ;j+=i) die[j]=true;
		ll st=(mn+i-1)/i*i-mn;
		for(int j=st; j<=mx-mn ;j+=i){
			if(sp[j]==0) sp[j]=i;
	int ptr=0;
	for(int i=1; i<=n ;i++){
		if(sp[a[i]-mn]!=0) a[i]=sp[a[i]-mn];
		if(mp[a[i]]==0) mp[a[i]]=++ptr;
		//cout << a[i] << ' ' << inv[i] << endl;
	for(int i=1; i<=q ;i++){
		int l,r;
		l=readInt(1,n,' ');
	int l=1,r=0;
	for(int i=1; i<=q ;i++){
		int tl=qry[i].fi.fi,tr=qry[i].fi.se;
		while(r<tr) add(++r);
		while(l>tl) add(--l);
		while(l<tl) del(l++);
		while(r>tr) del(r--);
	for(int i=1; i<=q ;i++){
		cout << ans[i] << '\n';