PROBLEM LINK:
Practice
Div-2 Contest
Div-1 Contest
Author: Viktar Sharepa
Tester: Danya Smelskiy
Editorialist: Viktar Sharepa
DIFFICULTY:
Medium
PREREQUISITES:
Aho–Corasick, trie, prefix-sum.
PROBLEM:
We are given an array of strings S, the beauty of each string t in S is given by a positive integer b_t.
The pleasure of a string C is defined by {\sum\limits_{t \in S}\text{(number of occurrences of }t\text{ in C)}\cdot b_t}.
Given two strings A, B, find the string with maximum pleasure that can be constructed by taking one substring of A and adding one substring of B to its end.
QUICK EXPLANATION:
- The string with maximum pleasure can be constructed by taking a non-empty prefix of A and a non-empty suffix of B.
- Let’s calculate the answer for all the {|A| \cdot |B|} possibilities. Suppose we choose {A[0..i]} and {B[j..|B|-1]}. There are three types of string occurrences:
- Occurrences that are completely in {A[0..i]}. We can precompute these sums for each i using the Aho-Corasick algorithm.
- Occurrences that start in {A[0..i]} and end in {B[j..|B|-1]}. All such occurrences end in {B[j..j+24]} so we can take 25 steps from trie node corresponding to {A[0..i]} in the Aho-Corasick automaton. We can precompute such nodes along with the sums of the first case.
- Occurrences that are completely in {B[j..|B|-1]}. Note that we have already counted the ones that are completely in {B[j..j+24]}. All that is left to count are occurrences that end in {B[j+25..|B|-1]}. We can do it similarly to the first case
EXPLANATION:
Given a string C = {A[k..i]} + {B[j..l]} we can construct another string D = {A[0..i]} + {B[j..|B|-1]}. Note that any substring of C is also a substring of D, since all the beauties are positive, the pleasure of D is not smaller than the pleasure of C. That means that the string with maximum pleasure always can be constructed by taking a non-empty prefix of A and a non-empty suffix of B.
There are {|A| \cdot |B|} ways of taking a prefix of A and a suffix of B. To quickly calculate the pleasure of each resulting string, we’ll use the following data structures:
- Let’s build an automaton on all the strings from S using the Aho–Corasick algorithm. For each node u let
sum[u]
be the sum of beauties of all strings which nodes are ancestors of u in the suffix link tree. - Let
prefASum[i]
be the sum of beauties of all occurrences of strings from S in the prefix {A[0..i]} andprefANode[i]
be the automaton node corresponding to the prefix {A[0..i]}. The arraysprefASum
andprefANode
can be calculated iterating over A character by character and moving on the automaton.
C++ code
vector<long long> prefASum(a.size());
vector<int> prefANode(a.size());
for (int i = 0; i < a.size(); i++) {
ahoCorasick.move(a[i]);
prefASum[i] = ahoCorasick.getCurrentNodeSum();
if (i != 0) {
prefASum[i] += prefASum[i - 1];
}
prefANode[i] = ahoCorasick.currentNode;
}
- Similarly to the previous step let
suffBSum[i]
be the sum of beauties of all occurrences of strings from S in B that end in the suffix {B[i..|B|-1]}.
C++ code
vector<long long> suffBSum(b.size());
for (int i = 0; i < b.size(); i++) {
ahoCorasick.move(b[i]);
suffBSum[i] = ahoCorasick.getCurrentNodeSum();
}
for (int i = (int) b.size() - 2; i >= 0; i--) {
suffBSum[i] += suffBSum[i + 1];
}
Consider the string C = {A[0..i]} + {B[j..|B|-1]}. To calculate the pleasure of C, let’s analyze the three types of string occurrences in it:
- Occurrences that are completely in {A[0..i]}.
- Occurrences that start in {A[0..i]} and end in {B[j..|B|-1]}.
- Occurrences that are completely in {B[j..|B|-1]}.
- The sum of occurrences of the first type is equal to
prefASum[i]
. - Let’s start at
prefANode[i]
and move in the automaton character by character of the substring {B[j..j+24]}, keep adding thesum[node]
of each node that we visit. In this way we get the sum of beauties of all occurrences that lie in the concatenation of {A[0..i]} and end at {B[j..j+24]}, remember that the strings in S have a maximum lenght of 26. This will allow us to calculate the sum of all the occurrences of the second type and some occurrences of the third type. - All is left to add to current sum are occurrences of the third type that end somewhere in {B[j+25...|B|-1]}. Their sum is equal to
suffBSum[j + 25]
.
Time complexity per test case: build an automaton in {\mathcal{O}(\sum\limits_{t \in S}|t|)} + calculate prefASum
, prefANode
, suffBSum
in {\mathcal{O}(|A|+|B|)} + calculate the pleasure for each prefix and suffix pair in {\mathcal{O}(|A| \cdot |B| \cdot \max\limits_{t \in S}|t|)} = {\mathcal{O}(|A| \cdot |B| \cdot \max\limits_{t \in S}|t| + \sum\limits_{t \in S}|t|)}.
SOLUTIONS:
Setter's Solution
#include <bits/stdc++.h>
using namespace std;
template<int ALPHABET_SIZE, unsigned char MINIMAL_CHAR>
struct AhoCorasick {
static constexpr int NON_EXISTENT_NODE_ID = -1;
static constexpr int FAKE_NODE_ID = 0;
static constexpr int ROOT_ID = 1;
int currentNode;
vector<array<int, ALPHABET_SIZE>> edges;
vector<int> suffixLink;
vector<long long> sum;
explicit AhoCorasick(const vector<pair<string, int>> &a) : currentNode(ROOT_ID), edges(2),
suffixLink(2, FAKE_NODE_ID), sum(2, 0) {
edges[FAKE_NODE_ID].fill(ROOT_ID);
edges[ROOT_ID].fill(NON_EXISTENT_NODE_ID);
for (const auto &p : a) {
int node = ROOT_ID;
for (unsigned char c : p.first) {
c -= MINIMAL_CHAR;
if (edges[node][c] == -1) {
edges[node][c] = edges.size();
edges.emplace_back();
edges.back().fill(NON_EXISTENT_NODE_ID);
suffixLink.push_back(NON_EXISTENT_NODE_ID);
sum.push_back(0);
}
node = edges[node][c];
}
sum[node] += p.second;
}
queue<int> q;
q.push(ROOT_ID);
while (!q.empty()) {
int node = q.front();
if (suffixLink[node] != NON_EXISTENT_NODE_ID) {
sum[node] += sum[suffixLink[node]];
}
q.pop();
for (int i = 0; i < ALPHABET_SIZE; i++) {
int child = edges[node][i];
if (child == NON_EXISTENT_NODE_ID) {
edges[node][i] = edges[suffixLink[node]][i];
} else {
suffixLink[child] = edges[suffixLink[node]][i];
q.push(child);
}
}
}
}
void setNode(int node) {
currentNode = node;
}
void resetNode() {
setNode(ROOT_ID);
}
long long getCurrentNodeSum() {
return sum[currentNode];
}
void move(unsigned char c) {
c -= MINIMAL_CHAR;
currentNode = edges[currentNode][c];
}
};
void solve() {
string a, b;
cin >> a >> b;
int n;
cin >> n;
vector<pair<string, int>> s(n);
for (int i = 0; i < n; i++) {
cin >> s[i].first >> s[i].second;
}
AhoCorasick<26, 'a'> ahoCorasick(s);
vector<long long> prefASum(a.size());
vector<int> prefANode(a.size());
for (int i = 0; i < a.size(); i++) {
ahoCorasick.move(a[i]);
prefASum[i] = ahoCorasick.getCurrentNodeSum();
if (i != 0) {
prefASum[i] += prefASum[i - 1];
}
prefANode[i] = ahoCorasick.currentNode;
}
ahoCorasick.resetNode();
vector<long long> suffBSum(b.size());
for (int i = 0; i < b.size(); i++) {
ahoCorasick.move(b[i]);
suffBSum[i] = ahoCorasick.getCurrentNodeSum();
}
for (int i = (int) b.size() - 2; i >= 0; i--) {
suffBSum[i] += suffBSum[i + 1];
}
long long ans = 0;
for (int i = 0; i < a.size(); i++) {
for (int j = 0; j < b.size(); j++) {
long long cur = prefASum[i];
ahoCorasick.setNode(prefANode[i]);
for (int k = j; k <= j + 24 && k < b.size(); k++) {
ahoCorasick.move(b[k]);
cur += ahoCorasick.getCurrentNodeSum();
}
if (j + 25 < b.size()) {
cur += suffBSum[j + 25];
}
ans = max(ans, cur);
}
}
cout << ans << '\n';
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
cout.tie(nullptr);
#ifdef HOME
freopen("in.txt", "r", stdin);
#endif
int tests;
cin >> tests;
while (tests--) {
solve();
}
}
Tester's Solution
#include <iostream>
#include <string>
#include <cassert>
#include <cstdio>
#include <cstring>
#include <vector>
using namespace std;
int Go(int node_id, int next_char);
int get_link(int node_id);
const int N = 10'000 + 5;
const int NODES_CNT = N * 26 + 5;
string s[N];
int price[N];
int nodes = 0;
int nxt[NODES_CNT][26];
int go[NODES_CNT][26];
int link[NODES_CNT];
int node_value[NODES_CNT];
int anc_c[NODES_CNT];
int anc[NODES_CNT];
bool processed[NODES_CNT];
int node_price[NODES_CNT];
long long dp[1005];
void clear_node(int id) {
memset(nxt[id], 0, sizeof(nxt[id]));
memset(go[id], 255, sizeof(go[id]));
link[id] = -1;
node_value[id] = 0;
}
void init() {
nodes = 0;
clear_node(0);
anc[0] = -1;
}
void add_new_string(const string& w, int cost) {
int last = 0;
for (int i = 0; i < (int)w.size(); ++i) {
int c = w[i] - 'a';
if (!nxt[last][c]) {
nxt[last][c] = ++nodes;
anc_c[nodes] = c;
anc[nodes] = last;
clear_node(nodes);
}
last = nxt[last][c];
}
node_value[last] += cost;
return;
}
int Go(int id, int c) {
if (go[id][c] != -1)
return go[id][c];
if (nxt[id][c])
return go[id][c] = nxt[id][c];
if (id == 0)
return go[id][c] = 0;
return go[id][c] = Go(get_link(id), c);
}
int get_link(int id) {
if (link[id] != -1)
return link[id];
if (id == 0 || anc[id] == 0)
return link[id] = 0;
return link[id] = Go(get_link(anc[id]), anc_c[id]);
}
int process_node(int id) {
if (processed[id])
return node_price[id];
processed[id] = true;
node_price[id] = node_value[id];
node_price[id] += process_node(get_link(id));
return node_price[id];
}
void build() {
for (int i = 0; i <= nodes; ++i) {
processed[i] = false;
node_price[i] = 0;
}
for (int i = 0; i <= nodes; ++i) {
process_node(i);
}
}
void calc_dp(const string& b) {
int m = (int)b.size();
int last = 0;
for (int i = 1; i <= m; ++i) {
dp[i] = 0;
int c = b[i - 1] - 'a';
last = Go(last, c);
dp[i] = node_price[last];
}
dp[m + 1] = 0;
for (int i = m; i > 0; --i) {
dp[i] += dp[i + 1];
}
return;
}
void solve(int test_id) {
init();
string a, b;
cin >> a >> b;
int n, m;
cin >> n;
m = (int)b.size();
for (int i = 1; i <= n; ++i) {
cin >> s[i] >> price[i];
assert(s[i].size() <= 26);
add_new_string(s[i], price[i]);
}
build();
calc_dp(b);
int last = 0;
long long beauty = 0;
long long result = 0;
for (int i = 1; i <= (int)a.size(); ++i) {
int c = a[i - 1] - 'a';
last = Go(last, c);
beauty += node_price[last];
for (int j = 1; j <= m; ++j) {
int last_c = last;
long long beauty_c = beauty;
for (int k = 0; k < 26; ++k) {
int r = j + k;
if (r > m) break;
int cc = b[r - 1] - 'a';
last_c = Go(last_c, cc);
beauty_c += node_price[last_c];
}
if (j + 26 <= m)
beauty_c += dp[j + 26];
if (beauty_c > result) {
result = beauty_c;
}
}
}
cout << result << '\n';
return;
}
int main(int argc, const char * argv[]) {
ios_base::sync_with_stdio(false); cin.tie(0); cout.tie(0);
int tests;
cin >> tests;
assert(1 <= tests && tests <= 10);
for (int i = 0; i < tests; ++i) {
solve(i);
}
return 0;
}