Problem Code- CN04
Learn all about suffix arrays construction here:
Finding Total Number of distinct Substrings.
Well how do we do this ?
We see that , any string , let us say
ABC will give following distinct substring ,
Which is equivalent to choosing an end point and a start point in a string, and what is left is a simple matter of finding the number of ways you can choose a start and end point . Which is nC2 (where n is the string length ) .
Problem happens when we have repeated characters . What will you do then ? that is when suffix arrays come into play .
Now if I am given a string , axyz , and let us say , I ask you how many strings can you make with this string such that they start from the first letter. Then the answer will be 4 ( a , ax, axy, axyz ) .
So , I have a string of length 7 , then I have 7 suffixes (excluding NULL , I am not counting it because it’s length is 0 , does not contribute towards any distinct substring ) . So , going by the above logic , I will 7+6+5+4+3+2+1 distinct strings , which is equivalent to nC2 which was also the result of above discussion .
Unfortunately for us , this result doesn’t hold for the strings with repeated characters.
Take this one
Suffix Possible strings such that they start from first letter of suffix
Now as you see strings a,b,aab,etc are repeated but we also know that if we sort the above suffixes , those strings which are common , there length is given in Lowest Common Prefix .
So , the total number of distinct Substrings from a given string is given by
nC2 – Sum of All LCP
And that is the answer, as simple as that.