Hi friends This question was asked in recent hiring challenge at hackerearth , that is over now,Please discuss your strategies.I am not able to devise algorithm please provide some hints to solve it.
Pulkit is really good at maths. Recently, he came to know about a problem on matrices. Amazed by the problem he got, he asked Ashish the same problem. Ashish also being good at maths solved the problem within 5 minutes. Now, its your time to solve the problem.
You will be given n*m binary matrix. You need to tell if it is possible to delete a column such that after deleting that column, rows of the matrix will be unique. If yes than print “Yes” else print “No”.
[Input]
First line contains an integer t denoting no.of test cases.
Next line contains 2 integers n and m denoting no.of rows and columns.
Next n line contains binary string of length m each.
The solution was just to check if all the rows are unique or not. If they are unique ans is always Yes, else ans is No.
Why so??
Now, suppose there are two or more rows which are same. Than no matter what column you delete those rows will still be same, so ans for this case is no.
Now, what if all rows are unique. You can always find some row which you can delete.
This is the link for the proof for the above statement.
Since N and M are both <= 1000 wont brute force work here?
You simply attempt to remove every single column then compute an hash for each row, or insert it in a set or whatever and check the final size of the set…
Computing hash for each row takes O(M) time.
Removing each column takes O(M) time.
You will compute N hashes.
Complexity = O(M^2*N) = 10^9 on worst case which fits in 1 sec… Maybe with some optimizations like checking for equal rows while reading input can help a bit, but, this should run in under 1 sec I believe
Actually you gotta wait 1-3 days before people answer and usually people asking the questions forget.So until unless you got a question related to codechef you should consider answer may take some time…and if you figure out a way to get 100 pts let me know.
@Ashish1610 … I also thought the same logic but u think time limit you gave there was enough for worst case ??M, N were 1000 there so to check the duplicate rows complexity of program should be (1000*(1000-1)/2)(1000)= 10^9 approx. Is there any more efficient way ?
You can use map or set to solve this. Just insert the string to set or create map. If in map you have already seen the row return false else return true. If in set the size of set is less than n return false else return true.
@Ashish1610 … I think your solution will not work for the following input.
101
001
100
110
Here, all the rows are unique. If you delete any column, they will not be unique any more.
So, If all the rows are unique then the ans is not always yes.
In fact, it’s O(MN log N), because you can pre-compute hashes of all prefixes, suffixes of rows and combine them for each removed column to get the hashes of N rows after this column is removed. Uniqueness can be checked after quicksorting.
@xellos0 in the above comments you said pre-compute hashes of all prefixes and suffixes of rows and applying quick sorting. can you tell how to do that in brief .