A tutorial on Suffix Arrays

awesome explaination dude… seriously… finally i got the suffix array… :slight_smile:

awesome (y)

1 Like

very Good… :slight_smile:

1 Like

Now, we can call quick-sort and sort the suffixes according to their first 4 characters by using the 2-tuples we constructed above! The result would be

can anyone please explain the above line???qsort is called in which tuple…

1 Like

how tuples r used for sorting please explain???

1 Like

Really,easy to understand from this tutorial!!!:slight_smile:

16 Likes

Now, we can call quick-sort and sort the suffixes according to their first
4 characters by using the 2-tuples we
constructed above! The result would
be:

6 6: sissippi
7 3:
sippi
8 5: ssissippi
9 2: ssippi

Can anyone explain me the result of above step. How are the suffixes sorted according to first 4 characters after calling quick sort.
I mean, how is ‘sissippi’ is having a lower sort-index then ‘sippi’ and simmilarly how is ‘ssissippi’ having a lower sort-index than ‘ssippi’ ?
I think the correct order should be:

6 6: sippi
7 3:
sissippi
8 5: ssippi
9 2: ssissippi

2 Likes

In the naive approach, shouldn’t the loop go from i=0 to s.size() rather than i=0 to s.size()-1 ?

1 Like

well explained, but I think there should be some correction in the table given below:
Sort-Index Suffix-Index

0       10: i
1        7: ippi
2        1: ississippi
2        4: issippi
3        0: mississippi
4        9: pi
5        8: ppi
6        3: sissippi
7        6: sippi
8        2: ssissippi
9        5: ssippi
till (compared)4th character,  8 sort_index should be "ssippi" and 9th should "ssissippi" same in 6 & 7 sort_index
1 Like

Shouldn’t it be
“We can use the SortIndex array we constructed above to find the Longest Common Prefix, between any two suffixes”
instead of
“We can use the SortIndex array we constructed above to find the Longest Common Prefix, between any two prefixes.”

1 Like

I am a beginner. Very well written sir. It was a deep and helpful insight.

1 Like

I think the final answer given is wrong. It should be

sort-Index Suffix-Index

0       10: i
1        7: ippi
2        1: ississippi
2        4: issippi
3        0: mississippi
4        9: pi
5        8: ppi
6        6: sippi
7        3: sissippi
8        5: ssippi
9        2: ssissippi

This was awesome!
Thanks Bruno.

while sorting the suffixes according to the first 4 characters, why is “3” not assigned in the second tuple ??

shouldn’t in the naive approach

for(int i = 0; i < s.size();i++)
{
    m[s.substr(i,s.size()-i)] = i;
    v.push_back(s.substr(i,s.size()-i));
}

should be:

for(int i = 0; i < s.size();i++)
{
    m[s.substr(i,s.size()-1)] = i;
    v.push_back(s.substr(i,s.size()-1));
}

???

notice the size() - 1 instead of size()-i

very good explanation…

Another nice tutorial :slight_smile:

Really Helpful! Cheers :smiley:

I am getting wrong suffix array for input
ebbacb

Actual output is

5 3 2 0 4 1

While It should be

3 5 2 1 4 0

The N((log N)^2) says “we can use the property that all the strings are related” . But it is not very clear to me that what we are using , can’t we use the same thing for normal string comparison ?

@gamabunta : Will you please help me in explaining that ?