I'm trying to use an almost direct rewrite of Sergey Makagonov's code for Ukkonen's suffix tree construction algorithm, followed by a DFS to build the suffix array.
However, for N \sim 10^6 my program uses \sim 1.4 GB of memory, and on CF problems it gets Memory Limit Exceeded.
Can someone suggest some changes so that I can bring the memory usage down?
Thanks.
This is the code I’m using:
#include <bits/stdc++.h>
using namespace std;

#define MAXN 1000000 //change as needed
#define alphabet 256 //(use 26 for lowercase latin etc)

//suffix tree implementation mostly based on Sergey Makagonov's work
const int upper_limit = 1 << 25;

int root; //index of root node
int last_added, pos, needSL;
int rem, active_node, active_edge, active_length;

struct node{
    //corresponds also to the edge ending at this node
    int start;
    int end;
    int suffix_link;
    int next[alphabet];
    int edge_length(){
        return min(end, pos + 1) - start; //distance from start
    }
};

node suffix_tree[2 * MAXN + 5]; //at most ~2(n+1) nodes; a little slack since indices start at 1
char text[MAXN + 1];            //+1 for the terminal '$'

int new_node(int start, int end = upper_limit){
    node n;
    n.start = start;
    n.end = end;
    n.suffix_link = 0;
    for(int i = 0; i < alphabet; i++)
        n.next[i] = 0;
    suffix_tree[++last_added] = n;
    return last_added;
}

char active_edg(){
    return text[active_edge];
}

void add_suffix_link(int nod){
    if(needSL > 0) suffix_tree[needSL].suffix_link = nod;
    needSL = nod;
}

bool walk_down(int nod){
    if(active_length >= suffix_tree[nod].edge_length()){
        active_edge += suffix_tree[nod].edge_length();
        active_length -= suffix_tree[nod].edge_length();
        active_node = nod;
        return true;
    }
    return false;
}

void suffix_tree_init(){
    needSL = 0, last_added = 0, pos = -1, rem = 0, active_node = 0;
    active_edge = 0, active_length = 0;
    root = active_node = new_node(-1, -1);
}

void suffix_tree_extend(char c){
    text[++pos] = c;
    needSL = 0;
    rem++;
    while(rem > 0){
        if(active_length == 0) active_edge = pos;
        if(suffix_tree[active_node].next[active_edg()] == 0){
            int leaf = new_node(pos);
            suffix_tree[active_node].next[active_edg()] = leaf;
            add_suffix_link(active_node);
        }
        else{
            int nxt = suffix_tree[active_node].next[active_edg()];
            if(walk_down(nxt)) continue;
            if(text[suffix_tree[nxt].start + active_length] == c){
                active_length++;
                add_suffix_link(active_node);
                break;
            }
            int split = new_node(suffix_tree[nxt].start, suffix_tree[nxt].start + active_length);
            suffix_tree[active_node].next[active_edg()] = split;
            int leaf = new_node(pos);
            suffix_tree[split].next[c] = leaf;
            suffix_tree[nxt].start += active_length;
            suffix_tree[split].next[text[suffix_tree[nxt].start]] = nxt;
            add_suffix_link(split);
        }
        rem--;
        if(active_node == root && active_length > 0){
            active_length--;
            active_edge = pos - rem + 1;
        }
        else{
            active_node = suffix_tree[active_node].suffix_link > 0 ? suffix_tree[active_node].suffix_link : root;
        }
    }
}

int suffix_array[MAXN + 1];
int temp, temp2, ins = 0;

//DFS in lexicographic order ('$' < 'a'..'z'); leaves are reached in sorted suffix
//order, so each leaf contributes suffix_array[ins++] = start - depth.
//The global temp is a "did this call visit any child?" flag: it stays 0 only at leaves.
void dfs(int cur, int curlen){
    //cout << suffix_tree[cur].start << ' ' << suffix_tree[cur].end << ' ' << curlen << endl;
    temp = 0;
    if(suffix_tree[cur].next['$']){
        temp2 = suffix_tree[cur].next['$'];
        dfs(temp2, curlen + suffix_tree[cur].end - suffix_tree[cur].start);
    }
    for(char c = 'a'; c <= 'z'; c++){
        if(suffix_tree[cur].next[c]){
            //cout << c << endl;
            temp2 = suffix_tree[cur].next[c];
            dfs(temp2, curlen + suffix_tree[cur].end - suffix_tree[cur].start);
            //cout << '~' << c << endl;
        }
    }
    if(temp == 0){ //no child was visited => this node is a leaf
        suffix_array[ins++] = suffix_tree[cur].start - curlen;
    }
    temp = 1;
}

int main(){
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    string S;
    cin >> S;
    suffix_tree_init();
    for(int i = 0; i < S.size(); i++)
        suffix_tree_extend(S[i]);
    suffix_tree_extend('$');
    dfs(root, 0);
    for(int i = 1; i <= S.size(); i++) //suffix_array[0] is the lone "$" suffix; skip it
        cout << suffix_array[i] << ' ';
    cout << endl;
    return 0;
}
So when I changed the next[] size to 63 (for a-z, A-Z, 0-9 and '$') it uses 356 MB for 10^6, and when I changed it to 27 (for 'a'-'z' and '$') it is around 150 MB, which is still quite high, though it helped.
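(For what it's worth, the arithmetic roughly checks out, assuming 4-byte ints: each node is dominated by next[], so it costs about (alphabet + 3) \cdot 4 bytes, and the total is that times the number of nodes actually created. Shrinking the per-node footprint from 259 to 66 to 30 ints predicts roughly 1.4 GB \cdot 66/259 \approx 357 MB and 1.4 GB \cdot 30/259 \approx 162 MB, which is close to what I'm measuring.)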
What else can I do to reduce the memory usage?
Thanks btw.
The added lines :
#define alphabet 63
static inline int indx(char c){
    if(c == '$') return 0;
    else if(c >= 'a' && c <= 'z') return c - 'a' + 1;
    else if(c >= 'A' && c <= 'Z') return c - 'A' + 27;
    else return c - '0' + 53;
}
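And every place that used to index next[] with a raw character now goes through indx(), roughly like this (illustrative, not the full diff):

int active_edg(){ //now returns an index into next[], not a raw char
    return indx(text[active_edge]);
}
//...
suffix_tree[split].next[indx(c)] = leaf;
suffix_tree[split].next[indx(text[suffix_tree[nxt].start])] = nxt;
//and in dfs(): next[indx('$')], looping over the remaining slots instead of 'a'..'z'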
Also, less I/O to disk => faster! My program is now 2x faster. Thanks @ssjgz!
Though it still uses 350 MB. Can you suggest further changes?
Hmmm … nothing leaps out. I always use a vector for my equivalent of your next (so next contains no "empty entries": its size is exactly the number of transitions leaving the node), but then there's a sharp runtime vs memory tradeoff.
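Roughly this kind of node layout (just a sketch to illustrate what I mean; names are made up):

#include <vector>
using namespace std;

struct Node
{
    int start = 0;
    int end = 0;
    int suffixLink = 0;
    // Only the transitions that actually exist, stored as (letter, node index) pairs.
    vector<pair<char, int>> next;

    // Linear scan - O(alphabet size) per lookup in the worst case, which is
    // where the runtime vs memory tradeoff comes from.
    int transition(char letter) const
    {
        for (const auto& edge : next)
        {
            if (edge.first == letter)
                return edge.second;
        }
        return 0; // "no transition", matching your convention of 0 == absent
    }
};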
The limit seems to be 262 MB-ish, and the string is alphanumeric rather than only lowercase, so I can't trim my array down any further.
I don’t see where else I can cut memory usage…
Maybe I should use arrays instead of structs, but that means rewriting the whole thing from scratch.
Wait - do you actually need fancy suffix-tree stuff for this Problem? If I'm understanding it correctly, you just merge the words in order, and if \textit{merged} is the merged string so far and w is the next word to merge, then the time taken to merge w into \textit{merged} to get the next merged string is just |w|, so the whole algorithm is linear in the sum of |w| over all words, which is 10^6.
Can you post your implementation or the link to a good one?
Edit: BTW I tried using unordered_map with a custom hash instead of arrays, but it takes way too long.
Yeah, I misjudged and don’t have a proper solution at the moment.
This might work, though I absolutely hate this approach as it is more of a “heuristic that is astronomically unlikely to fail” than an algorithm, but what the hey:
// Simon St James (ssjgz) - 2019-12-24
//
// Solution to: https://codeforces.com/contest/1200/problem/E
//
#include <iostream>
#include <vector>
#include <cassert>

using namespace std;

template <typename T>
T read()
{
    T toRead;
    cin >> toRead;
    assert(cin);
    return toRead;
}

//#define VERIFY_HASH // Debugging - all seems to be correct, though.

class HashedString
{
    public:
        void appendLetter(const char letter)
        {
            const int letterIndex = letter - 'a' + 1;
            m_hash += letterIndex * m_powerOfPrime;
            m_powerOfPrime *= largeRandomPrime;
#ifdef VERIFY_HASH
            m_string += letter;
            assertHashIsCorrect();
#endif
        }
        void prependLetter(const char letter)
        {
            const int letterIndex = letter - 'a' + 1;
            m_hash *= largeRandomPrime;
            m_hash += letterIndex * largeRandomPrime;
            m_powerOfPrime *= largeRandomPrime;
#ifdef VERIFY_HASH
            m_string = letter + m_string;
            assertHashIsCorrect();
#endif
        };
        uint64_t hash() const
        {
            return m_hash;
        }
#ifdef VERIFY_HASH
        void assertHashIsCorrect()
        {
            uint64_t powerOfPrime = largeRandomPrime;
            uint64_t correctHash = 0;
            for (int i = 0; i < m_string.length(); i++)
            {
                const int letterIndex = m_string[i] - 'a' + 1;
                correctHash += letterIndex * powerOfPrime;
                powerOfPrime = powerOfPrime * largeRandomPrime;
            }
            cout << "String: " << m_string << " correctHash: " << correctHash << " m_hash: " << m_hash << endl;
            assert(correctHash == m_hash);
        }
#endif
    private:
#ifdef VERIFY_HASH
        string m_string;
#endif
        uint64_t m_hash = 0;
        static constexpr uint64_t largeRandomPrime = 189'385'063'559;
        uint64_t m_powerOfPrime = largeRandomPrime;
};

string mergeWords(const vector<string>& wordsToMerge)
{
    string merged;
    for (const auto& wordToMerge : wordsToMerge)
    {
        HashedString hashedPrefixOfNew;
        HashedString hashedSuffixOfMerged;
        int largestMatchLen = 0;
        for (int i = 0; i < min(wordToMerge.length(), merged.length()); i++)
        {
            hashedPrefixOfNew.appendLetter(wordToMerge[i]);
            hashedSuffixOfMerged.prependLetter(merged[merged.length() - 1 - i]);
            // Yuck - assumes that strings have the same hash if and only if they
            // are equivalent, which is mathematically provably false but
            // enormously unlikely to be wrong in real-life.
            //
            // See https://www.hackerrank.com/challenges/gridland-provinces/problem for a
            // Problem where this sort of logic is the basis for the official Editorial
            // solution.
            if (hashedPrefixOfNew.hash() == hashedSuffixOfMerged.hash())
            {
                largestMatchLen = i + 1;
            }
        }
        merged += wordToMerge.substr(largestMatchLen);
    }
    return merged;
}

int main(int argc, char* argv[])
{
    ios::sync_with_stdio(false);
    const int numWords = read<int>();
    vector<string> wordsToMerge(numWords);
    for (auto& word : wordsToMerge)
    {
        word = read<string>();
    }
    cout << mergeWords(wordsToMerge) << endl;
    assert(cin);
}
I haven’t tested it beyond the sample testcases.
Edit:
The naive solution would be, for each 1 \le i \le |\textit{wordToMerge}|, to check whether the suffix of \textit{merged} of length i is equal to the prefix of \textit{wordToMerge} of length i - this sounds very, very similar to the kind of situation that KMP helps with.
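Something along these lines, I think (just a sketch, only eyeball-tested: it computes the standard prefix function of wordToMerge + '#' + the last min(|wordToMerge|, |merged|) characters of merged, and the final value is exactly the largest overlap, so it could replace the hashing loop in mergeWords above):

#include <string>
#include <vector>
#include <algorithm>
using namespace std;

// Length of the longest suffix of 'merged' that is also a prefix of 'word'.
// Only the last |word| characters of 'merged' can matter, so the combined
// string has length at most 2 * |word| + 1 and this is O(|word|) per merge.
int largestOverlap(const string& merged, const string& word)
{
    const int maxUseful = min(word.length(), merged.length());
    const string combined = word + '#' + merged.substr(merged.length() - maxUseful);
    vector<int> prefixFunction(combined.length(), 0);
    for (int i = 1; i < static_cast<int>(combined.length()); i++)
    {
        int k = prefixFunction[i - 1];
        while (k > 0 && combined[i] != combined[k])
            k = prefixFunction[k - 1];
        if (combined[i] == combined[k])
            k++;
        prefixFunction[i] = k;
    }
    // '#' never occurs in a word, so this is <= |word| and is the overlap length.
    return prefixFunction.back();
}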
Thanks.
I tried implementing the cp-algorithms code, with a DFS for the suffix array.
113 MB, somewhat better, but the code is slow: 1.8 seconds for 10^6.
I didn't even try to read the compressed implementation given there on cp-algorithms; it's just so off-putting.
Edit: the compressed implementation there has only 26 slots, i.e. no '$' character support, meaning there's no easy way to do a DFS to find the "leaves". I don't want to read the implementation and change it further; it's already so off-putting.
I changed the endl's to '\n's here, added some small tweaks, and rewrote the LCP array function as it seemed to give wrong output; now for 10^6 it takes 0.3 seconds and 16 MB of memory. WOW.
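For reference, the LCP computation I mean is the usual Kasai-style pass over the suffix array, roughly this (a sketch, not my exact code, assuming suffix_array holds the n suffix start positions of s in sorted order, without the '$'):

#include <string>
#include <vector>
using namespace std;

// Kasai's algorithm, O(n): lcp[i] = length of the longest common prefix of
// the suffixes starting at suffix_array[i] and suffix_array[i + 1].
vector<int> build_lcp(const string& s, const vector<int>& suffix_array){
    int n = s.length();
    vector<int> suffix_rank(n), lcp(n > 0 ? n - 1 : 0, 0);
    for(int i = 0; i < n; i++)
        suffix_rank[suffix_array[i]] = i;
    int k = 0;
    for(int i = 0; i < n; i++){
        if(suffix_rank[i] == n - 1){ k = 0; continue; }
        int j = suffix_array[suffix_rank[i] + 1];
        while(i + k < n && j + k < n && s[i + k] == s[j + k])
            k++;
        lcp[suffix_rank[i]] = k;
        if(k > 0)
            k--;
    }
    return lcp;
}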