Editorialist: Rohit Mazumder
PROBLEM LINK:
PROBLEM:
We are given the Burrows-Wheeler Transform (BWT)[ Burrows–Wheeler transform - Wikipedia] of a binary string, and we have to reconstruct the first row of the sorted array.
BACKGROUND:
The technique of sorting all the circular shifts of a text in a lexicographic order and then extracting the last column, describes the Burrows-Wheeler Transform of a string, which is usually used to make a string easy to compress. The following article however presumes no previous knowledge of Burrows-Wheeler Transform.
EXPLANATION:
To understand the solution to this question, we will start by re-doing the given example in the question, with a slight modification. Instead of considering it as a binary string, lets for consider for a moment that each character is a unique letter. We do so by putting a subscript on each of the character. The subscript i on a character corresponds to the ith occurrence of the character in the string (i.e. the rank of the character!)
So now we have to transform the string 01 02 11 12 03
Initial Array | Sorted Array | Rightmost column |
---|---|---|
01 02 11 12 03 | 03 01 02 11 12 | 12 |
02 11 12 03 01 | 01 02 11 12 03 | 03 |
11 12 03 01 02 | 02 11 12 03 01 | 01 |
12 03 01 02 11 | 12 03 01 02 11 | 11 |
03 01 02 11 12 | 11 12 03 01 02 | 02 |
Table 1: BWT of modified string
The transformed string is: 12 03 01 11 02
We’ll make some crucial observations now,
- Since we rotate the string by one place in every row, the number of rows is same as the length of the given strings and the each of the unique characters in the initial string occur only once in the final string formed out of the rightmost column. Hence, from the transformed string, we can deduce the frequency of 0’s and 1’s in the original string!
- In the sorted array, the rows were sorted lexicographically, this results in all the 0’s followed by all the 1’s in the first column of the sorted array. Hence, we can deduce the leftmost column of the sorted array from the given rightmost row!
- On closely observing the first and the last column in the sorted array, we find that the relative ordering of 0’s is same in both of them (03 01 02). Same goes for the 1’s.
Sorted Array
03 01 02 11 12
01 02 11 12 03
02 11 12 03 01
12 03 01 02 11
11 12 03 01 02
Table 2: Sorted array from table 1
That being said, we can rank the characters of the initial string in such a manner, that the ranks occur in a sorted order.
First Column | Last Column |
---|---|
01 | 11 |
02 | 01 |
03 | 02 |
11 | 12 |
12 | 03 |
Table 3: First Column and Last Column of sorted array with modified ranks
Constructing the rest of the string:
We will construct the rest of the string from the knowledge of the first and last column.
Since, the rows are formed by rotating the string, the character in the last column is the character which must occur before character in the first column of the corresponding row.
So 11 occurs before 01
01 occurs before 02
02 occurs before 03
03 occurs before 12
12 occurs before 11
Since the topmost row starts with a 01 and ends in 11(Table 3-first row).
Using the cycle above, we obtain, the topmost row: 01 02 03 12 11
Answer: 0 0 0 1 1.
Subtask 1:
S = 1 0 1 1 1 1 0 0
First Column | Last Column |
---|---|
0_1 | 1_1 |
0_2 | 0_1 |
0_3 | 1_2 |
1_1 | 1_3 |
1_2 | 1_4 |
1_3 | 1_5 |
1_4 | 0_2 |
1_5 | 0_3 |
Answer: 0 0 1 1 0 1 1 1
Subtask 2:
S = 1 0 1 1 0 1 1 1 0 0 1 0
First Column | Last Column |
---|---|
0_1 | 1_1 |
0_2 | 0_1 |
0_3 | 1_2 |
0_4 | 1_3 |
0_5 | 0_2 |
1_1 | 1_4 |
1_2 | 1_5 |
1_3 | 1_6 |
1_4 | 0_3 |
1_5 | 0_4 |
1_6 | 1_7 |
1_7 | 0_5 |
Answer: 0 0 0 1 1 1 0 1 1 0 1 1
Subtask 3:
S= 1 1 1 1 0 1 1 0 1 0 1 1 0 1 0
First Column | Last Column |
---|---|
0_1 | 1_1 |
0_2 | 1_2 |
0_3 | 1_3 |
0_4 | 1_4 |
0_5 | 0_1 |
1_1 | 1_5 |
1_2 | 1_6 |
1_3 | 0_2 |
1_4 | 1_7 |
1_5 | 0_3 |
1_6 | 1_8 |
1_7 | 1_9 |
1_8 | 0_4 |
1_9 | 1_10 |
1_10 | 0_5 |
Answer: 0 0 1 1 1 1 0 1 1 1 0 1 0 1 1