What is wrong with the suffix array approach

bondoc73 · August 16, 2013, 4:25pm

I was solving this question Programming Problems and Competitions :: HackerRank but sometimes it gives me wrong answer. What is wrong with my approach?

I used suffix array and lcp array.

We know that any prefix of a suffix is a substring of the string itself, so after finding the suffix array and the lcp array I did the following:

I choose two suffixes say i & j, suppose suffix i smaller than j.

Their lcp(i, j) will tell me the number of chars in common until I hit a mismatch or the suffix ends.

if the suffix ends then we r done the number of accepted pairs is the lcp itself.
otherwise if the mismatch doesn’t exceed k then we add lcp(i, j) + 1 to the result, then I start again with new suffixes.

I’ll give an example to make it more clear:

suffix i: abcdef
suffix j: abceeeefff
suppose k is 1

lcp(i, j) = 3 & the first mismatch is d & e. this mismatch doesn’t exceed k so res += (3 + 1) then new suffixes are:

suffix i: ef
suffix j: eeefff

Notice how I removed the lcp and the mismatch.

now lcp(i, j) = 1 & mismatch is f & e but the mismatch will now exceed k so we just count lcp thus res += 1 which yields res = 5

Here is a function that finds solution for i & j;

LL solve(int i, int j, int n, int k) {

int x = n - pos[i]; // Length of suffix i

int y = n - pos[j]; // Length of suffix j

// Put shorter suffix in x and longer in y
if(x > y) swap(x, y), swap(i, j);
LL ret = 0, m = 0;

while(x > 0) {
ret += lcp1[i][j];

// if all suffix i is a match break otherwise we have a mismatch
if(x == lcp1[i][j]) break;

// if this mismatch excceeds k then break otherwise count it
if((m + 1) > k) break;

m++;

// remove the matched chars and mismatched char
x -= (lcp1[i][j] + 1); 
y -= (lcp1[i][j] + 1);

// update the position of the new suffixes in the suffix array.
// rank returns the position of suffix starting at position pos[i] + lcp1[i][j] + 1 in //the suffix array
i = rank[pos[i] + lcp1[i][j] + 1];
j = rank[pos[j] + lcp1[i][j] + 1];

}

return ret + m; // return matches + mismatches

}

arnabbcs08 · August 16, 2013, 9:24pm

Use DP for this one. Much easier to implement.

bpir · August 17, 2013, 3:47am

Business Excellence Assessments
Business Excellence tools
Baldrige
Business Performance Improvement

bondoc73 · August 17, 2013, 1:10am

can you tell me where is the mistake in this solution?