ZIO11004 - Editorial

Editorialist: Rohit Mazumder

PROBLEM LINK:

PROBLEM:

We are given the Burrows-Wheeler Transform (BWT)[ Burrows–Wheeler transform - Wikipedia] of a binary string, and we have to reconstruct the first row of the sorted array.

BACKGROUND:

The technique of sorting all the circular shifts of a text in a lexicographic order and then extracting the last column, describes the Burrows-Wheeler Transform of a string, which is usually used to make a string easy to compress. The following article however presumes no previous knowledge of Burrows-Wheeler Transform.

EXPLANATION:

To understand the solution to this question, we will start by re-doing the given example in the question, with a slight modification. Instead of considering it as a binary string, lets for consider for a moment that each character is a unique letter. We do so by putting a subscript on each of the character. The subscript i on a character corresponds to the ith occurrence of the character in the string (i.e. the rank of the character!)

So now we have to transform the string 01 02 11 12 03

Initial Array Sorted Array Rightmost column
01 02 11 12 03 03 01 02 11 12 12
02 11 12 03 01 01 02 11 12 03 03
11 12 03 01 02 02 11 12 03 01 01
12 03 01 02 11 12 03 01 02 11 11
03 01 02 11 12 11 12 03 01 02 02

Table 1: BWT of modified string

The transformed string is: 12 03 01 11 02

We’ll make some crucial observations now,

  1. Since we rotate the string by one place in every row, the number of rows is same as the length of the given strings and the each of the unique characters in the initial string occur only once in the final string formed out of the rightmost column. Hence, from the transformed string, we can deduce the frequency of 0’s and 1’s in the original string!
  2. In the sorted array, the rows were sorted lexicographically, this results in all the 0’s followed by all the 1’s in the first column of the sorted array. Hence, we can deduce the leftmost column of the sorted array from the given rightmost row!
  3. On closely observing the first and the last column in the sorted array, we find that the relative ordering of 0’s is same in both of them (03 01 02). Same goes for the 1’s.

Sorted Array
03 01 02 11 12
01 02 11 12 03
02 11 12 03 01
12 03 01 02 11
11 12 03 01 02

Table 2: Sorted array from table 1

That being said, we can rank the characters of the initial string in such a manner, that the ranks occur in a sorted order.

First Column Last Column
01 11
02 01
03 02
11 12
12 03

Table 3: First Column and Last Column of sorted array with modified ranks

Constructing the rest of the string:

We will construct the rest of the string from the knowledge of the first and last column.

Since, the rows are formed by rotating the string, the character in the last column is the character which must occur before character in the first column of the corresponding row.

So 11 occurs before 01

01 occurs before 02

02 occurs before 03

03 occurs before 12

12 occurs before 11

ZIO11041.jpg

Since the topmost row starts with a 01 and ends in 11(Table 3-first row).

Using the cycle above, we obtain, the topmost row: 01 02 03 12 11

Answer: 0 0 0 1 1.

Subtask 1:

S = 1 0 1 1 1 1 0 0

First Column Last Column
0_1 1_1
0_2 0_1
0_3 1_2
1_1 1_3
1_2 1_4
1_3 1_5
1_4 0_2
1_5 0_3

ZIO11042.jpg

Answer: 0 0 1 1 0 1 1 1

Subtask 2:

S = 1 0 1 1 0 1 1 1 0 0 1 0

First Column Last Column
0_1 1_1
0_2 0_1
0_3 1_2
0_4 1_3
0_5 0_2
1_1 1_4
1_2 1_5
1_3 1_6
1_4 0_3
1_5 0_4
1_6 1_7
1_7 0_5

ZIO11043.jpg

Answer: 0 0 0 1 1 1 0 1 1 0 1 1

Subtask 3:

S= 1 1 1 1 0 1 1 0 1 0 1 1 0 1 0

First Column Last Column
0_1 1_1
0_2 1_2
0_3 1_3
0_4 1_4
0_5 0_1
1_1 1_5
1_2 1_6
1_3 0_2
1_4 1_7
1_5 0_3
1_6 1_8
1_7 1_9
1_8 0_4
1_9 1_10
1_10 0_5

ZIO11044.jpg

Answer: 0 0 1 1 1 1 0 1 1 1 0 1 0 1 1

3 Likes

How did you make this table ? I dont understand what first and last column means in this table, how did you rank ? Can you clarify ?

In this table, the last column refers to the last column of the BWT matrix (the binary string provided as an input to you in the question); the first column similarly, refers to the first column of the BWT matrix. In the question, we have been provided with the last column. Using the last column we can very easily deduce the first column using the 3 crucial observations we made above!

Maybe I am missing something, but the first column of the sorted matrix is different from what you have written in the table.