# PROBLEM LINK:

**Setter:** Kasra Mazaheri

**Tester:** Arshia Soltani

**Editorialist:** Kasra Mazaheri

# DIFFICULTY:

Medium

# PREREQUISITES:

Dynamic Programming

# PROBLEM:

You are given an array A of length N and an integer K. We call a subset of elements of A good, if their product is divisible by K. We define the cost of such subset to be the sum of all it’s elements. You are given Q queries of form (L_i,R_i), for each of which you should output the minimum cost of a good subset of the array A, if we deleted all the indices in the interval [L_i,R_i]. In case there’s no good subset for a given query, output -1.

# QUICK EXPLANATION

- Let K = \prod\limits_{i = 1}^m {p_i^{\alpha_i}} be the factorization of K where p_i are distinct prime numbers and 0 < \alpha_i. Then a number T is a multiple of K, if and only if for each p_i it has at least \alpha_i of them. So we’re only interested in these prime numbers and their relative count in the interval [0, \alpha_i].
- Define a state to represent how many of each p_i we have. One can see that each important state corresponds to a divisor of K since we should only keep track of the primes that K has (p_i) and at most the amount that K has of each in it’s factorization (\alpha_i).
- If our current subset’s state corresponds to d, a divisor of K, then after adding an element a to it, it would correspond to gcd(d \times a, K).
- We can define dp_{i,d} to be the minimum cost of a subset of the first i elements that corresponds to d. Then dp_{i,d} = min(dp_{i-1,d}, dp_{i-1,j} + A_i) for all such j that gcd(j \times A_i, K) = d. Similarly define pd_{i,d} to be the minimum cost of a subset of the elements in the interval [i, N] that corresponds to d. Then we can retrieve the answer for a query (L_i,R_i) using the data we calculated in dp_{L_i-1} and pd_{R_i+1}.

# EXPLANATION

For now we only focus on calculating the minimum cost of a good subset of the whole array.

It can be easily proven that a number T is divisible by K = \prod\limits_{i = 1}^m {p_i^{\alpha_i}} if and only if for all 1 \le i \le m, T has at least \alpha_i instances of p_i. Thus only these primes matter as other prime numbers won’t play a part in T being divisible by K. So we can ignore any prime number other than p_i in each of the A_i. Also if at some point we had more than \alpha_i of p_i, we can assume that we only have \alpha_i of it since adding more p_i is useless.

Define the state of a subset to be an array S of length m, meaning that we have picked S_i of p_i where 0 \le S_i \le \alpha_i for all 1 \le i \le m. One can easily see from the above observations that d = \prod\limits_{i = 1}^m {p_i^{S_i}} is always a divisor of K. So we can map each state to a divisor of K and vice versa. Now we can see that adding an element a to a subset with state of d, changes it’s state to gcd(d \times a, K). This is because of the fact that gcd gets rid of all the extra primes, those other than p_i and those p_i that we have more than \alpha_i of them. We can now solve this problem using a dynamic programming solution :

Define dp_{i,d} to be the minimum cost of a subset of the first i elements that their state equals d. Then we can write dp_{i,d} = min(dp_{i-1,d}, dp_{i-1,j} + A_i) for all such j that gcd(j \times A_i, K) = d. Then the minimum cost of a good subset of the whole array is dp_{N,K}. To do this efficiently, we can loop over j and calculate the value of d from it. This way we only loop over each divisor of K once for every element of A we add, so the time complexity of this part will be O(N \times D(K)) where D(K) is the number of divisors of K.

Now back to the original problem. In order to answer the queries, we need another helping function. Define pd_{i,d} to be the minimum cost of a subset of the elements in the interval [i, N] such that their state equals to d. pd can be updated similarly as dp. Now the answer to a query (L_i,R_i) is min(dp_{L_i-1,d_1} + pd_{R_i+1,d_2}) for all such d_1 and d_2 that d_1 \times d_2 is multiple of K. However, this would not fit in the time limit. We can improve this solution by modifying the definition of dp_{i,d} to be the minimum cost of a subset of the first i elements such that their state is a \textbf{multiple} of d. Now the answer to a query (L_i,R_i) is min(dp_{L_i-1,d} + pd_{R_i+1,\frac{K}{d}}) for all d. This improved version is now fast enough to fit in the limit. The only remaining problem is how to calculate the new value of dp_{i,d}. We can compute the array dp in the aforementioned way and then do dp_{i,d} = min(dp_{i,d}, dp_{i,gcd(d \times p_j, K)}) for 1 \le j \le m. It’s worth noting that we should do this in the decreasing order of d. This extra step costs us O(N \times D(K) \times P(K)) where P(K) is the number of distinct primes in factorization of K. And so we arrive at the final solution.

For implementation purposes, it’s recommended to precompute the value of gcd(d_1 \times d_2, K) for every d_1 and d_2 in order to avoid the extra O(log(K)) factor caused by gcd.

# TIME COMPLEXITY

The time complexity is O(D(K) \times (N \times P(K) + Q)).

# SOLUTIONS:

## Setter's Solution

```
// In The Name Of The Queen
#include<bits/stdc++.h>
#pragma GCC optimzie("O3");
using namespace std;
const int N = 100005, K = 779;
int n, q, k, d, A[N], Id[K][K];
uint32_t dp[N][K], pd[N][K];
vector < int > D, P;
inline void SMin(uint32_t &a, uint32_t b) {a = min(a, b);}
int main()
{
scanf("%d%d%d", &n, &q, &k);
for (int i = 1; i <= n; i ++)
scanf("%d", &A[i]);
for (int i = 1; i * i <= k; i ++)
if (k % i == 0)
{
D.push_back(i);
if (i < k / i)
D.push_back(k / i);
}
sort(D.begin(), D.end());
d = (int)D.size();
for (int i = 0; i < d; i ++)
for (int j = 0; j < d; j ++)
Id[i][j] = (int)(lower_bound(D.begin(), D.end(), __gcd(1LL * D[i] * D[j], 1LL * k)) - D.begin());
for (int i = 1; i < d; i ++)
{
bool Fail = 0;
for (int j = 1; j < i; j ++)
if (D[i] % D[j] == 0)
Fail = 1;
if (!Fail)
P.push_back(i);
}
for (int j = 1; j < d; j ++)
dp[0][j] = (uint32_t)(3e9);
for (int i = 1; i <= n; i ++)
{
int id = (int)(lower_bound(D.begin(), D.end(), __gcd(A[i], k)) - D.begin());
for (int j = 0; j < d; j ++)
dp[i][j] = dp[i - 1][j];
for (int j = 0; j < d; j ++)
SMin(dp[i][Id[j][id]], dp[i - 1][j] + A[i]);
}
for (int j = 1; j < d; j ++)
pd[n + 1][j] = (uint32_t)(3e9);
for (int i = n; i; i --)
{
int id = (int)(lower_bound(D.begin(), D.end(), __gcd(A[i], k)) - D.begin());
for (int j = 0; j < d; j ++)
pd[i][j] = pd[i + 1][j];
for (int j = 0; j < d; j ++)
SMin(pd[i][Id[j][id]], pd[i + 1][j] + A[i]);
}
for (int i = 1; i <= n; i ++)
for (int j = d - 1; ~ j; j --)
for (int p : P)
SMin(dp[i][j], dp[i][Id[j][p]]),
SMin(pd[i][j], pd[i][Id[j][p]]);
for (; q; q --)
{
int l, r;
long long Mn = dp[0][1];
scanf("%d%d", &l, &r);
for (int j = 0; j < d; j ++)
Mn = min(Mn, (long long)dp[l - 1][j] + pd[r + 1][d - j - 1]);
if (Mn >= (long long)(3e9))
Mn = -1;
printf("%lld\n", Mn);
}
return 0;
}
```

## Tester's Solution

```
#include <algorithm>
#include <bitset>
#include <complex>
#include <deque>
#include <exception>
#include <fstream>
#include <functional>
#include <iomanip>
#include <ios>
#include <iosfwd>
#include <iostream>
#include <istream>
#include <iterator>
#include <limits>
#include <list>
#include <locale>
#include <map>
#include <memory>
#include <new>
#include <numeric>
#include <ostream>
#include <queue>
#include <set>
#include <sstream>
#include <stack>
#include <stdexcept>
#include <streambuf>
#include <string>
#include <typeinfo>
#include <utility>
#include <valarray>
#include <vector>
#if __cplusplus >= 201103L
#include <array>
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <forward_list>
#include <future>
#include <initializer_list>
#include <mutex>
#include <random>
#include <ratio>
#include <regex>
#include <scoped_allocator>
#include <system_error>
#include <thread>
#include <tuple>
#include <typeindex>
#include <type_traits>
#include <unordered_map>
#include <unordered_set>
#endif
int gcd(int a, int b) {return b == 0 ? a : gcd(b, a % b);}
#define ll long long
#define pb push_back
#define ld long double
#define mp make_pair
#define F first
#define S second
#define pii pair<ll,ll>
using namespace :: std;
const ll maxn=1e5+500;
const ll maxt=880;
const ll maxQ=1000;
const ll inf=1e10;
ll a[maxn];
bool useL[maxn];
bool useR[maxn];
ll Rv[maxn];
vector<ll> magh;
ll b[maxt][maxt];
ll dp[maxt];
ll ans[maxQ][maxQ];
ll t;
bool add(ll ind){
ll x=a[ind];
ll rv=Rv[ind];
bool u=0;
for(ll j=maxt-1;j>=0;j--){
if(dp[b[x][j]]>dp[j]+rv){
u=1;
dp[b[j][x]]=dp[j]+rv;
}
}
return u;
}
int main(){
ios_base::sync_with_stdio(0);cin.tie(0);cout.tie(0);
ll n,q,k;
cin>>n>>q>>k;
for(ll i=1;i*i<=k;i++){
if(k%i==0){
magh.pb(i);
if(k/i!=i){
magh.pb(k/i);
}
}
}
sort(magh.begin(),magh.end());
for(ll i=0;i<n;i++){
cin>>a[i];
Rv[i]=a[i];
a[i]=gcd(a[i],k);
a[i]=lower_bound(magh.begin(),magh.end(),a[i])-magh.begin();
}
t=magh.size();
for(ll i=0;i<t;i++){
for(ll j=0;j<t;j++){
b[i][j]=lower_bound(magh.begin(),magh.end(),gcd(k,magh[i]*magh[j]))-magh.begin();
}
}
memset(dp,30,sizeof dp);
dp[0]=0;
for(ll i=0;i<n;i++){
useL[i]=add(i);
}
memset(dp,30,sizeof dp);
dp[0]=0;
for(ll i=n-1;i>=0;i--){
useR[i]=add(i);
}
vector<ll> vecL,vecR;
for(ll i=0;i<n;i++){
if(useL[i])vecL.pb(i);
if(useR[i])vecR.pb(i);
}
memset(ans,31,sizeof ans);
for(ll i=-1;i<(ll)vecL.size();i++){
memset(dp,30,sizeof dp);
dp[0]=0;
for(ll j=0;j<=i;j++){
add(vecL[j]);
}
for(ll j=(ll)vecR.size();j>=-1;j--){
// ans[i+1][j+1];
if(j>=0 && j<(ll)vecR.size()){
add(vecR[j]);
}
ans[i+1][j+1]=dp[t-1];
}
}
for(ll i=0;i<maxQ;i++){
for(ll j=0;j<maxQ;j++){
if(ans[i][j]>inf)ans[i][j]=-1;
}
}
for(ll i=0;i<q;i++){
ll l,r;
cin>>l>>r;
l--;
r--;
l--;
r++;
l=upper_bound(vecL.begin(),vecL.end(),l)-vecL.begin()-1;
r=lower_bound(vecR.begin(),vecR.end(),r)-vecR.begin();
cout<<ans[l+1][r+1]<<endl;
}
}
```

Feel free to share your approach. As usual, suggestions are warmly welcomed.