A tutorial on Suffix Arrays

Thanks for this tutorial! I had issues with arrays while working with my coursework help website. This post helped to solve my problems.

Can anyone tell me how to print the final answer after verifying locally? Iā€™m trying to learn this algorithm from last one week but no success till now and at this point I donā€™t feel so good. Please help me understand how matrix P is used.

GATTACA
4
1
6
5
0
3
2

Why suffix array for this string is that ? It should be 6 at the first !!!

Nice tutorial for start in suffix arrays.

And I really recommend Skew Algorithm. Itā€™s just so awesome and magic, that it takes my breath away first time I saw it :slight_smile:

2 Likes

I am glad you liked it @michal27 :slight_smile: I will definitely try to learn Skewā€™s Algorithm in a not so distant future :slight_smile:

1 Like

Awesome post. Keep it up. :slight_smile:

1 Like

Thank you very much sir! Itā€™s always great to see our work appreciated and obviously, @gamabunta also helped me a lot here :slight_smile:

1 Like

Yes, obviously, otherwise it wouldnt terminate properly :slight_smile:
Iā€™ve seen this code snippet on a paper, but, Iā€™ve fixed it now :smiley:

Nice Tutorial @kuruma ā€¦

thanks, fixed

Nice explanation.
Another very clean explanation: c++ - Suffix Array Algorithm - Stack Overflow

4 Likes

please read the below answer

I was searching for it. Thank you.

1 Like

@kuruma , please explain what you are trying to do in this code:

for(i=0; i < N; i++)
P[stp][L[i].p] =i> 0 && L[i].nr[0]==L[i-1].nr[0] && L[i].nr[1] == L[i- 1].nr[1] ? P[stp][L[i-1].p] : i;

ā€œssissippiā€ suffix must be after ā€œssippiā€ suffix after sort considering first four characters.

how is sissippi has lower sort index(before) than sippi after sorting for first four charactersā€¦??
suffix array is really troubling me.Please sm1 help!

Use radix sort to get in done in O(nlogn)

Here is python implementation using inbuilt sort:-

def suffix_array(a):
    n, l = len(a), 0
    sa = [[a[i],i] for i in range(n)]
    pos_map = [None]*n
    while l < 2*n:
        sa.sort(key=lambda x: x[0])
        rank, last = 0, sa[0][0]
        for i in range(n):
            pos_map[sa[i][1]] = i
            if sa[i][0] != last:
                rank += 1
                last = sa[i][0]
            sa[i][0] = rank
        new_tuple = [(sa[i][0], sa[pos_map[sa[i][1]+l]][0] if sa[i][1]+l < n else -1) for i in range(n)]
        sa = [[new_tuple[i],sa[i][1]] for i in range(n)]
        l = 1 if not l else l << 1
    return [sa[i][1] for i in range(n)]