# Help with ukkonen's algorithm

I’m trying to use (almost a copy paste but rewritten) Segrey Makagonov’s code for Ukken’s algorithm for suffix tree construction and then dfs to construct suffix array.
However for N \sim 10^6 my program uses \sim 1.4 GB of space, and when used on problems on CF it gives memory limit exceeded.
Can someone suggest some changes so that I can bring the memory usage down?
Thanks.
This is the code I’m using:

#include <bits/stdc++.h>
using namespace std;
#define MAXN 1000000 //change as needed
#define alphabet 256 //(use 26 for lowercase latin etc)

//suffix tree implementation mostly based on sergey makagonov's work

const int upper_limit = 1 << 25;

int root; //pos of root
int rem, active_node, active_edge, active_length;

struct node{
//corresponds also to edge ending at this node
int start;
int end;
int next[alphabet];

int edge_length(){
return min(end,pos+1) - start; //distance from start
}
};

node suffix_tree[2*MAXN];
char text[MAXN];

int new_node(int start, int end = upper_limit){
node n;
n.start = start;
n.end = end;
for(int i = 0; i < alphabet; i++)
n.next[i] = 0;
}

char active_edg(){
return text[active_edge];
}

if(needSL > 0) suffix_tree[needSL].suffix_link = nod;
needSL = nod;
}

bool walk_down(int nod){
if(active_length >= suffix_tree[nod].edge_length()){
active_edge += suffix_tree[nod].edge_length();
active_length -= suffix_tree[nod].edge_length();
active_node = nod;
return true;
}
return false;
}

void suffix_tree_init(){
needSL = 0, last_added = 0, pos = -1, rem = 0, active_node = 0;
active_edge = 0, active_length = 0;
root = active_node = new_node(-1,-1);
}

void suffix_tree_extend(char c){
text[++pos] = c;
needSL = 0;
rem++;
while(rem > 0){
if(active_length == 0) active_edge = pos;
if(suffix_tree[active_node].next[active_edg()] == 0){
int leaf = new_node(pos);
suffix_tree[active_node].next[active_edg()] = leaf;
}
else{
int nxt = suffix_tree[active_node].next[active_edg()];
if(walk_down(nxt)) continue;
if(text[suffix_tree[nxt].start + active_length] == c){
active_length++;
break;
}
int split = new_node(suffix_tree[nxt].start, suffix_tree[nxt].start + active_length);
suffix_tree[active_node].next[active_edg()] = split;
int leaf = new_node(pos);
suffix_tree[split].next[c] = leaf;
suffix_tree[nxt].start += active_length;
suffix_tree[split].next[text[suffix_tree[nxt].start]] = nxt;
}
rem--;
if(active_node == root && active_length > 0){
active_length--;
active_edge = pos - rem + 1;
}
else{
}
}
}

int suffix_array[MAXN];
int temp, temp2, ins = 0;

void dfs(int cur, int curlen){
//cout << suffix_tree[cur].start << ' ' << suffix_tree[cur].end << ' ' << curlen << endl;
temp = 0;
if(suffix_tree[cur].next['$']){ temp2 = suffix_tree[cur].next['$'];
dfs(temp2, curlen + suffix_tree[cur].end - suffix_tree[cur].start);
}
for(char c = 'a'; c <= 'z'; c++){
if(suffix_tree[cur].next[c]){
//cout << c << endl;
temp2 = suffix_tree[cur].next[c];
dfs(temp2, curlen + suffix_tree[cur].end - suffix_tree[cur].start);
//cout << '~' << c << endl;
}
}
if(temp == 0){
suffix_array[ins++] = suffix_tree[cur].start - curlen;
}
temp = 1;
}

int main(){
ios_base::sync_with_stdio(false);
cin.tie(NULL);
string S;
cin >> S;
suffix_tree_init();
for(int i = 0; i < S.size(); i++)
suffix_tree_extend(S[i]);
suffix_tree_extend('$'); dfs(root,0); for(int i = 1; i <= S.size(); i++) cout << suffix_array[i] << ' '; cout << endl; return 0; }  Do you really need an alphabet of 256 characters? 1 Like I think not. Wait, I’ll check with a smaller size and revert. 1 Like So when I changed my next[] size to 63 (for a-z + A-Z + 0-9 + ‘$’) it uses 356 MB for 10^6, and when I changed it to 27 for ‘a’-‘z’ and ‘$’ it is 150ish MB which is still quite high, though it helped. What else can I do to reduce the memory usage? Thanks btw. The added lines : #define alphabet 63 static inline int indx(char c){ if(c == '$') return 0;
else if(c >= 'a' && c <= 'z') return c-'a'+1;
else if (c >= 'A' && c <= 'Z') return c-'A'+27;
else return c-'0'+53;
}



Also, less I/O with disk => faster ! My program is now 2x faster. Thanks @ssjgz!
Though, it still uses 350 MB . Can you suggest further changes?

1 Like

Hmmm … nothing leaps out. I always use a vector for my equivalent of your next (so next contains no “empty entries”: it’s size is precisely the set of transitions leading from a Node), but then there’s a sharp runtime vs memory tradeoff

2 Likes

I still get memory limit exceeded on this one
https://codeforces.com/contest/1200/submission/67508237
The limit seems to be 262 MB ish, and the string is alphanumeric and not only lowercase, meaning that I can’t trim my array down further.
I don’t see where else I can cut memory usage…
Maybe I should use arrays instead of structs but that means rewriting the whole thing from scratch

1 Like

Wait - do you actually need fancy suffix-tree stuff for this Problem? If i’m understanding it correctly, you just merge the words in order, and if merged is the merged string so far, and w is the next word for merge, then the time taken to merge w into merged to get the next merged string is just |w|, so the whole algorithm is linear in the sum over all w of |w|, which is 10^6.

1 Like

Yeah I don’t think I need suffix trees. Though, on problems where I would need them, this would still pop up as an issue.

I’ve done plenty of string Problems on Hackerrank, and I don’t think I’ve encountered one where memory usage was an issue

Edit:

Hmmm … actually, not so sure if there is a way to do this without suffix-trees, now.

2 Likes

Can you post your implementation or the link to a good one?
Edit : BTW I tried using unordered_maps with a custom hash instead of arrays, but it takes way too long.

Yeah, I misjudged and don’t have a proper solution at the moment.

This might work, though I absolutely hate this approach as it is more of a “heuristic that is astronomically unlikely to fail” than an algorithm, but what the hey:

// Simon St James (ssjgz) - 2019-12-24
//
// Solution to: https://codeforces.com/contest/1200/problem/E
//
#include <iostream>
#include <vector>

#include <cassert>

using namespace std;

template <typename T>
{
assert(cin);
}

//#define VERIFY_HASH // Debugging - all seems to be correct, though.

class HashedString
{
public:
void appendLetter(const char letter)
{
const int letterIndex = letter - 'a' + 1;
m_hash += letterIndex * m_powerOfPrime;
m_powerOfPrime *= largeRandomPrime;

#ifdef VERIFY_HASH
m_string += letter;
assertHashIsCorrect();
#endif
}
void prependLetter(const char letter)
{
const int letterIndex = letter - 'a' + 1;
m_hash *= largeRandomPrime;
m_hash += letterIndex * largeRandomPrime;

m_powerOfPrime *= largeRandomPrime;
#ifdef VERIFY_HASH
m_string = letter + m_string;
assertHashIsCorrect();
#endif
};
uint64_t hash() const
{
return m_hash;
}
#ifdef VERIFY_HASH
void assertHashIsCorrect()
{
uint64_t powerOfPrime = largeRandomPrime;
uint64_t correctHash = 0;
for (int i = 0; i < m_string.length(); i++)
{
const int letterIndex = m_string[i] - 'a' + 1;
correctHash += letterIndex * powerOfPrime;

powerOfPrime = powerOfPrime * largeRandomPrime;
}
cout << "String: " << m_string << " correctHash: " << correctHash << " m_hash: " << m_hash << endl;
assert(correctHash == m_hash);
}
#endif
private:
#ifdef VERIFY_HASH
string m_string;
#endif
uint64_t m_hash = 0;
static constexpr uint64_t largeRandomPrime = 189'385'063'559;
uint64_t m_powerOfPrime = largeRandomPrime;
};

string mergeWords(const vector<string>& wordsToMerge)
{
string merged;

for (const auto& wordToMerge : wordsToMerge)
{
HashedString hashedPrefixOfNew;
HashedString hashedSuffixOfMerged;
int largestMatchLen = 0;
for (int i = 0; i < min(wordToMerge.length(), merged.length()); i++)
{
hashedPrefixOfNew.appendLetter(wordToMerge[i]);
hashedSuffixOfMerged.prependLetter(merged[merged.length() - 1 - i]);
// Yuck - assumes that strings have the same hash if and only if they
// are equivalent, which is mathematically provably false but
// enormously unlikely to be wrong in real-life.
//
// See https://www.hackerrank.com/challenges/gridland-provinces/problem for a
// Problem where this sort of logic is the basis for the official Editorial
// solution.
if (hashedPrefixOfNew.hash() == hashedSuffixOfMerged.hash())
{
largestMatchLen = i + 1;
}
}

merged += wordToMerge.substr(largestMatchLen);
}

return merged;
}

int main(int argc, char* argv[])
{
ios::sync_with_stdio(false);

vector<string> wordsToMerge(numWords);
for (auto& word : wordsToMerge)
{
}

cout << mergeWords(wordsToMerge) << endl;

assert(cin);
}


I haven’t tested it beyond the sample testcases.

Edit:

The naive solution would be, for each 1 \le i \le |\textit{wordToMerge}|, to try and see if the suffix of \textit{merged} of length i is equal to the prefix of \textit{wordToMerge} - this sounds very, very similar to the kind of situation that KMP helps with.

2 Likes

Thanks.
I tried implementing the cp algorithms code, with dfs for suffix array.
113 MB, somewhat better, but the code is slow: 1.8 seconds for 10^6.
I didn’t even try to read the compressed implementation given there on cp algos, its just so repelling.
Edit: the compressed implementation there has only 26 slots, i.e. no ‘\$’ char support meaning no easy way to do dfs to find “leaves”. I don’t want to read the implementation and change it more, its already so repelling.

1 Like

Wow, look at this.

I changed the endl’s to \n’s here, added some small tweaks and rewrote the lcp array function as it seemed to give a wrong output, and now for 10^6 it takes 0.3 seconds and 16 MB space. WOW.

1 Like