CN04- Editorial

Problem Code- CN04

Difficulty: Medium

Prerequisites:
Suffix Arrays

Explanation:
Learn all about suffix arrays construction here:

suffix arrays

Finding Total Number of distinct Substrings.
Well how do we do this ?
We see that , any string , let us say
ABC will give following distinct substring ,

A

B

C

AB

BC

ABC

Which is equivalent to choosing an end point and a start point in a string, and what is left is a simple matter of finding the number of ways you can choose a start and end point . Which is nC2 (where n is the string length ) .

Problem happens when we have repeated characters . What will you do then ? that is when suffix arrays come into play .

Now if I am given a string , axyz , and let us say , I ask you how many strings can you make with this string such that they start from the first letter. Then the answer will be 4 ( a , ax, axy, axyz ) .

So , I have a string of length 7 , then I have 7 suffixes (excluding NULL , I am not counting it because it’s length is 0 , does not contribute towards any distinct substring ) . So , going by the above logic , I will 7+6+5+4+3+2+1 distinct strings , which is equivalent to nC2 which was also the result of above discussion .

Unfortunately for us , this result doesn’t hold for the strings with repeated characters.

Take this one

S= “aabaab”

Suffix Possible strings such that they start from first letter of suffix

b b
ab a,ab
aab a,aa,aab
baab b,ba,baa,baab
abaab a,ab,aba,abaa,abaab
aabaab a,aa,aab,aaba,aabaa,aabaab

Now as you see strings a,b,aab,etc are repeated but we also know that if we sort the above suffixes , those strings which are common , there length is given in Lowest Common Prefix .

So , the total number of distinct Substrings from a given string is given by

nC2 – Sum of All LCP

And that is the answer, as simple as that.