PTNQLT - Editorial

samhenry97 · May 17, 2018, 9:36pm

PROBLEM LINK:

Practice

Contest

Author: samhenry97
Tester: ncollins
Editorialist: samhenry97

DIFFICULTY:

Medium-Hard

PREREQUISITES:

Suffix Arrays, LCP Arrays

PROBLEM:

Given a grid of integers, determine how many rows and columns have patterns of any length p, where a pattern is a common subarray within the row or column.

QUICK EXPLANATION:

For each row and column, use an algorithm like Manber & Myers to compute the suffix array and LCP array. Then, get the maximum value of the LCP array, and keep a count of the maximums for every row and column. Lastly, we just need to print out the results, summing them up.

EXPLANATION:

Overview

The problem comes down to computing the longest common substring within a string itself. The brute force approach should immediately be thrown out, as the time complexity is very large. There are two main approaches for finding the results. The editorial solution runs overall in logarithmic time, but the optimal solution would run in linear time.

Solution

One way to compute the longest common substring in a string itself is to use suffix arrays and longest common prefix arrays. We’ll use SA and LCP to refer to these from here out.

First, we run through all of the rows and columns, computing the largest LCP value. LCP values mean that in the sorted subarrays (SA), the value of the next lexicographic subarray contains LCP[i] same prefix characters. Taking the maximum LCP value will give us the data we need for the answer.

The majority of the problem comes down to computing the SA and LCP. We can use many algorithms for this. To solve this problem, a logarithmic or linear algorithm is necessary. A log^2 algorithm will not work (unoptimized Kasai). We can use an optimized Kasai or Manber & Myers for a logarithmic solution. The DC3 algorithm is great for a linear solution, but because of the alphabet size, it would be hard to implement. Once the SA is computed, we can compute the LCP in linear time.

Now, using the max LCP values for each row and column, we can compute how many rows and columns meet the requirement, >= p, at each p. We traverse backwards through the maximum LCP values, and keep an accumulator. Then, we iterate through the array we just computed, and print the answer.

There are other ways of computing the answer, but only a logarithmic solution would work.

ALTERNATIVE SOLUTION:

Another way of solving this problem would be to use a suffix tree. Because the alphabet size is so large, it would be very difficult to construct such a tree. If done correctly, the problem could be solved in linear time.