Is it possible to have all the rows distinct(unique) in a matrix having only 0 or 1 , after removing exactly one column ?

anu1234 · August 30, 2014, 6:00pm

Hi friends This question was asked in recent hiring challenge at hackerearth , that is over now,Please discuss your strategies.I am not able to devise algorithm please provide some hints to solve it.

Pulkit is really good at maths. Recently, he came to know about a problem on matrices. Amazed by the problem he got, he asked Ashish the same problem. Ashish also being good at maths solved the problem within 5 minutes. Now, its your time to solve the problem.

You will be given n*m binary matrix. You need to tell if it is possible to delete a column such that after deleting that column, rows of the matrix will be unique. If yes than print “Yes” else print “No”.

[Input]

First line contains an integer t denoting no.of test cases.

Next line contains 2 integers n and m denoting no.of rows and columns.

Next n line contains binary string of length m each.

[Output]

For each test case output “Yes” or “No”.

[Constraints]

1<=t<=100

1<=n<=1000

2<=m<=1000

Sample Input (Plaintext Link)

2

3 3

101

000

100

2 2

11

Sample Output (Plaintext Link)

Yes

No

thechamp103 · August 30, 2014, 8:20pm

I was only able to get 30 points.
I divided the rows into groups of 64 so that each can be taken into an integer and then continued with the question.

My solution link: cTjbhD - Online C++ Compiler & Debugging Tool - Ideone.com

ashish1610 · August 31, 2014, 12:59am

@anu1234 and @thechamp103 This was my problem.

The solution was just to check if all the rows are unique or not. If they are unique ans is always Yes, else ans is No.

Why so??

Now, suppose there are two or more rows which are same. Than no matter what column you delete those rows will still be same, so ans for this case is no.

Now, what if all rows are unique. You can always find some row which you can delete.

This is the link for the proof for the above statement.

kuruma · September 7, 2014, 5:58am

Since N and M are both <= 1000 wont brute force work here?

You simply attempt to remove every single column then compute an hash for each row, or insert it in a set or whatever and check the final size of the set…

Computing hash for each row takes O(M) time.

Removing each column takes O(M) time.

You will compute N hashes.

Complexity = O(M^2*N) = 10^9 on worst case which fits in 1 sec… Maybe with some optimizations like checking for equal rows while reading input can help a bit, but, this should run in under 1 sec I believe

anu1234 · August 30, 2014, 8:39pm

hi thechamp103 … thanx for ur reply. I dont know why there z so silence on codechef ??

superty · August 30, 2014, 9:23pm

What is the time limit?

thechamp103 · August 31, 2014, 12:15am

Actually you gotta wait 1-3 days before people answer and usually people asking the questions forget.So until unless you got a question related to codechef you should consider answer may take some time…and if you figure out a way to get 100 pts let me know.

anu1234 · August 31, 2014, 12:09pm

@Ashish1610 … I also thought the same logic but u think time limit you gave there was enough for worst case ??M, N were 1000 there so to check the duplicate rows complexity of program should be (1000*(1000-1)/2)(1000)= 10^9 approx. Is there any more efficient way ?

ashish1610 · August 31, 2014, 3:50pm

You can use map or set to solve this. Just insert the string to set or create map. If in map you have already seen the row return false else return true. If in set the size of set is less than n return false else return true.

vivekbrnl · September 6, 2014, 10:38pm

@Ashish1610 … I think your solution will not work for the following input.

101
001
100
110

Here, all the rows are unique. If you delete any column, they will not be unique any more.
So, If all the rows are unique then the ans is not always yes.

xellos0 · September 7, 2014, 3:46pm

Right, hashing is the simplest way to do this.

In fact, it’s O(MN log N), because you can pre-compute hashes of all prefixes, suffixes of rows and combine them for each removed column to get the hashes of N rows after this column is removed. Uniqueness can be checked after quicksorting.

vivekbrnl · September 7, 2014, 7:46pm

@xellos0 … are you talking about unique hash generation? If yes, which hashing algorithm you are talking about?

xellos0 · September 7, 2014, 8:25pm

For 1000 elements, polynomial hashing modulo roughly 10^6 would almost definitely generate unique hashes.

(If you want to make extra sure, you can hash mod 10^9+7 and if it’s extra extra extra sure, a pair of hashes mod 10^9+7 and 10^9+9.)

secretkeeper · September 7, 2014, 11:32pm

@xellos0 in the above comments you said pre-compute hashes of all prefixes and suffixes of rows and applying quick sorting. can you tell how to do that in brief .

xellos0 · September 8, 2014, 12:43am

Uhh… just google quicksort, polynomial hash and precomputation. I’m sure you can find some editorials, because these are very basic concepts.