PROBLEM LINK:
Practice
Contest: Division 1
Contest: Division 2
Contest: Division 3
Contest: Division 4
Author: kingmessi
Tester: watoac2001
Editorialist: iceknight1093
DIFFICULTY:
TBD
PREREQUISITES:
Sieve of Eratosthenes, observation
PROBLEM:
You have an array A. You can append at most K occurrences of some integer between 1 and M to it.
Find the maximum possible value of g(A) for this array, where:
- g(A) equals the sum of A minus the maximum value of f(A_i\cdot A_j) across all 1 \leq i \lt j \leq |A|.
- f(x) denotes the number of divisors of x which have only a single distinct prime factor.
EXPLANATION:
Thereâs a lot to unpack here, so letâs go bit by bit.
Computing f(x)
f(x) is the number of divisors of x that have only a single distinct prime dividing them; in other words, the number of divisors of x that are prime powers (except 1).
In fact, this is also just the number of primes in the prime factorization of x (not distinct primes - we do consider repeats).
Why?
Looking at the prime factorization gives this observation for free.
Let
Then, the number of non-1 prime powers dividing x is exactly a_1 + a_2 + \ldots + a_k, since there are a_1 distinct powers of p_1, a_2 distinct powers of p_2, and so on.
But this is also just the number of primes in the prime factorization of x, as earlier claimed.
This interpretation also gives us a rather nice property: f(x\cdot y) = f(x) + f(y), since the primes in xy are just all the primes in x and y together.
Further, since weâre dealing with only values \leq 2\cdot 10^6, we can easily precompute f(x) for all x from 1 to 2\cdot 10^6 using a sieve.
Computing g(A)
As we noted above, f(A_i\cdot A_j) = f(A_i) + f(A_j).
So, the maximum value of f(A_i\cdot A_j) simply equals the sum of the two largest f(A_i) values in the array.
That is, g(A) equals \sum A_i - M_1 - M_2, where M_1 is the maximum f(A_i) and M_2 equals the second maximum f(A_i) (M_1 can equal M_2 as long as they come from different indices).
Inserting elements
Since f(x) is the number of primes dividing x including multiplicity, and weâre dealing with individual elements that are \leq M, weâll always have f(x) \leq \log_2 M.
For M \leq 2\cdot 10^6 this is about 21, for reference, i.e pretty small.
Now, observe that since we want to maximize g(A), itâs never optimal (or rather, never only optimal) to insert âsmallâ elements to A.
That is, if x \leq M - \log_2 M, we could insert x+\log_2 M instead, and it wonât reduce g(A).
This means we only need to consider the elements M, M-1, \ldots, M - \log_2M as candidates for insertion.
In fact, we can say even more: there exists an optimal solution such that we insert a single value K times!
Proof
Suppose we insert two distinct elements.
Let x_1 \lt x_2 be two distinct elements we inserted, with the largest f(x) values among all of them.
Recall that M_1 and M_2 are the two maximum f-values of the original array.
If f(x_1) \geq f(x_2), itâs trivially better to replace every x_1 with x_2: the sum increases and the maximums donât increase (and might even decrease).
So, we work with f(x_1) \leq f(x_2) from now on.
Note that if f(x_2) \leq M_1, itâs always not worse to replace every occurrence of x_1 with x_2, since the sum increases and the maximums donât change.
This leaves us with the case f(x_2) \gt M_1.
If there are already \geq 2 occurrences of x_2, we can again turn every occurrence of x_1 into x_2, which improves the sum but doesnât change the maximums.
Now, let there be only a single occurrence of x_2.
Note that:
- Two occurrences of x_2 will have a contribution of 2x_2 - 2f(x_2) to the score.
- One x_1 and one x_2 will have a contribution of x_1 + x_2 - f(x_2) - \max(M_1, f(x_1)) to the score.
We rewrite this as 2x_2 - (x_2 - x_1) - f(x_2) - \max(M_1, f(x_1)). - Two occurrences of x_1 will contribute 2x_1 - \max(M_1, f(x_1)) - \max(M_2, f(x_1)) to the score.
Again, we rewrite this as 2x_2 - 2(x_2 - x_1) - \max(M_1, f(x_1)) - \max(M_2, f(x_1)).
Our claim is that the second expression can never be strictly greater than the other two.
This can be proved via contradiction.
Suppose it were greater than both. Then, we have
and
Putting both conclusions together tells us that \max(M_1, f(x_2)) \lt \max(M_2, f(x_2)), which is clearly impossible since M_1\geq M_2.
So, itâs either better to have two occurrences of x_2, or two occurrences of x_1.
Either way, we can repeatedly convert one of them to the other till itâs no longer possible to do so.
As a result, weâve reduced the number of distinct elements inserted by 1, while not reducing the score of the array.
Repeat this process till only a single distinct element remains.
This gives us a pretty simple-to-implement algorithm: for each x between M and M - \log_2 M, compute the score when itâs inserted K times, and take the best answer among them all.
Computing the score for a fixed x can be done in constant time, since:
- The sum of the array is just the sum of the original A, plus K times x.
- If M_1 and M_2 are the original two maximums, the new maximums will be \max(M_1, f(x)) and \max(M_2, f(x)).
This can be seen by analyzing three cases: when f(x) \leq M_2, when itâs between M_2 and M_1, and when itâs \gt M_1.
If K = 1 there would be an edgecase with it here, but the constraints guarantee K\geq 2.
TIME COMPLEXITY:
\mathcal{O}(N + \log M) per testcase, after \mathcal{O}(M\log M) precomputation with M = 2\cdot 10^6.
CODE:
Author's code (C++)
// à„
#include <bits/stdc++.h>
using namespace std;
#define pb push_back
#define ll long long
#define ld long double
#define vi vector<int>
#define ff first
#define ss second
#define pll pair<ll,ll>
#define vll vector<ll>
const ll f = 998244353;
const ll p = 31;
const ll N = 2e6 + 1;
ll fx(vector<ll> &v) {
ll ans=0;
for(ll i = 0; i < 4; i++) {
for(ll j = i+1; j < 4; j++) {
ans = max( ans , v[i] + v[j]);
}
}
return ans;
}
signed main(){
ios_base::sync_with_stdio(0);
cin.tie(0);
cout.tie(0);
vector<ll> isprime(N, 1), lp(N);
for(ll i=2;i<N;i++) {
if(isprime[i]==0)
continue;
lp[i] = i;
for(ll j=2*i;j<N;j+=i) {
isprime[j] = 0;
if(lp[j]==0)
lp[j] = i;
}
}
vector<ll> sums(N);
for(ll i = 1; i < N; i++) {
ll sum = 0, curr = i;
while(curr != 1) {
ll t = lp[curr];
while(curr % t == 0) {
curr /= t;
sum += 1;
}
}
sums[i] = sum;
}
ll t=1;
cin>>t;
while(t--) {
ll n, m, k;
cin >> n >> m >> k;
vector<ll> a(n + 1),vv(n+1);
ll ma = 0, sum = 0;
for(ll i = 1; i <= n; i++) {
cin >> a[i];
sum += a[i];
vv[i] = sums[a[i]];
}
sort(vv.begin(), vv.end());
ll ans = sum - vv[n] - vv[n-1] ;
ll m_take = 0;
vector<ll> ma_val(31,-1);
for(ll i = m; i >= max(1LL , m-40) ; i--) {
if(ma_val[sums[i]] == -1)
ma_val[sums[i]] = i;
}
for(ll i = 0; i < 31; i++) {
ll x = ma_val[i];
if (x == -1) {
continue;
}
if(k==1) {
ans = max(ans , sum + x - max(i, vv[n-1])) - vv[n];
continue;
}
m_take = max ( m_take , x);
vector<ll> v2 = {vv[n], vv[n-1], i, i};
ll sum1 = sum + x + m_take * (k-1);
ans = max( ans , sum1 - fx(v2));
for(ll j = i+1; j < 31; j++) {
ll y = ma_val[j];
if (y == -1) {
continue;
}
vector<ll> v1 = {vv[n], vv[n-1], i, j};
ll sum1 = sum + x + y + m_take * (k-2);
ans = max( ans , sum1 - fx(v1));
}
}
cout << ans <<"\n";
}
}
Tester's code (C++)
//****************************Template Begins****************************//
// Header Files
#include <bits/stdc++.h>
#include<iostream>
#include<algorithm>
#include<vector>
#include<utility>
#include<set>
#include<unordered_set>
#include<list>
#include<iterator>
#include<deque>
#include<queue>
#include<stack>
#include<set>
#include<bitset>
#include<map>
#include<unordered_map>
#include<stdio.h>
#include<complex>
#include<math.h>
#include<chrono>
#include<cstring>
#include<string>
// Header Files End
using namespace std;
#define fio ios_base::sync_with_stdio(false);cin.tie(NULL);cout.tie(NULL)
#define ll long long
#define umap unordered_map
#define uset unordered_set
#define lb lower_bound
#define ub upper_bound
#define fo(i,a,b) for(i=a;i<b;i++)
#define all(v) (v).begin(),(v).end()
#define all1(v) (v).begin()+1,(v).end()
#define allr(v) (v).rbegin(),(v).rend()
#define allr1(v) (v).rbegin()+1,(v).rend()
#define sort0(v) sort(all(v))
typedef pair<int, int> pii;
typedef vector<int> vi;
typedef vector<ll> vll;
typedef pair<ll, ll> pll;
#define max3(a,b,c) max(max((a),(b)),(c))
#define max4(a,b,c,d) max(max((a),(b)),max((c),(d)))
#define min3(a,b,c) min(min((a),(b)),(c))
#define min4(a,b,c,d) min(min((a),(b)),min((c),(d)))
#define pb push_back
#define ppb pop_back
#define mp make_pair
#define inf 9999999999999
#define endl '\n'
#include "ext/pb_ds/assoc_container.hpp"
#include "ext/pb_ds/tree_policy.hpp"
using namespace __gnu_pbds;
template<class T>
using ordered_set = tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update> ;
template<class key, class value, class cmp = std::less<key>>
// find_by_order(k) returns iterator to kth element starting from 0;
// order_of_key(k) returns count of elements strictly smaller than k;
using ordered_map = tree<key, value, cmp, rb_tree_tag, tree_order_statistics_node_update>;
const ll mod1 = 998244353;
const ll mod = 1e9 + 7;
const ll MOD = 1e18 + 1e16;
ll mod_mul(ll a, ll b) {a = a % mod; b = b % mod; return (((a * b) % mod) + mod) % mod;}
ll inv(ll i) {if (i == 1) return 1; return (mod - ((mod / i) * inv(mod % i)) % mod) % mod;}
ll gcd(ll a, ll b) { if (b == 0) return a; return gcd(b, a % b);}
ll pwr(ll a, ll b) {a %= mod; ll res = 1; while (b > 0) {if (b & 1) res = res * a % mod; a = a * a % mod; b >>= 1;} return res;}
//****************************Template Ends*******************************//
mt19937 rng(chrono::steady_clock::now().time_since_epoch().count());
const auto start_time = chrono::high_resolution_clock::now();
void output_run_time() {
// will work for ac,cc&&cf.
#ifndef ONLINE_JUDGE
auto end_time = chrono::high_resolution_clock::now();
chrono::duration<double> diff = end_time - start_time;
cout << "\n\n\nTime Taken : " << diff.count();
#endif
}
vll val(2000010, 1), dp(2000010, 0);
vector<vll> max1(30);
void seive()
{
for (ll i = 2; i < 2000010; i++)
{
if (val[i] == 1)
{
val[i] = i;
for (ll j = i * i; j < 2000010; j += i)
val[j] = i;
}
}
}
void pre()
{
for (ll i = 2; i < 2000010; i++)
{
ll x = i;
ll y = val[x];
while (y != 1)
{
while (x % y == 0)
{
x /= y;
dp[i]++;
}
y = val[x];
}
max1[dp[i]].pb(i);
}
}
int main() {
fio;
ll t;
cin >> t;
seive();
pre();
ll i, j;
while (t--)
{
ll n, m, k;
cin >> n >> m >> k;
ll a[n], i, mx1 = 0, mx2 = 0, sum = 0, j;
fo(i, 0, n)
{
cin >> a[i];
sum += a[i];
if (dp[a[i]] > mx1)
{
mx2 = mx1;
mx1 = dp[a[i]];
}
else if (dp[a[i]] > mx2)
{
mx2 = dp[a[i]];
}
}
vll gg(30, -1);
gg[0] = 1;
fo(i, 1, 30)
{
ll idx = ub(all(max1[i]), m) - max1[i].begin();
if (idx)
gg[i] = max1[i][idx - 1];
}
ll ans = sum - mx1 - mx2, pp1 = mx1 + mx2;
fo(i, 0, 30)
{
if (gg[i] == -1)
continue;
ll pp2 = mx1 + i;
ll pp3 = 2 * i;
ans = max(ans, sum + gg[i] * k - max({pp1, pp2, pp3}));
}
cout << ans << endl;
}
// output_run_time();
return 0;
}
//remove #define endl '\n' for lleractive problems
Editorialist's code (Python)
M = 3 * 10**6
val = [0]*M
for n in range(2, M):
if val[n] > 0: continue
for x in range(n, M, n):
val[x] = 1 + val[x // n]
for _ in range(int(input())):
n, m, k = map(int, input().split())
a = list(map(int, input().split()))
suma = sum(a)
max1, max2 = -1, -1
for i in range(n):
a[i] = val[a[i]]
if a[i] >= max1:
max2 = max1
max1 = a[i]
else: max2 = max(max2, a[i])
ans = 0
for i in range(25):
if m-i <= 0: break
cursm = k*(m-i)
curval = val[m-i]
ans = max(ans, suma + cursm - max(max1, curval) - max(max2, curval))
print(ans)