Problem Link

Author: Stacy Hong
Tester: Misha Chorniy
Editorialist: Bhuvnesh Jain

Difficulty
MEDIUM

Prerequisites

Problem
You are given a grid with a number written on every cell. You are required to find the largest connected component on the grid that contains at most 2 distinct numbers. Note that this editorial refers to the flowers as numbers.

Explanation

Subtask 1: N, M ≤ 100
A simple brute force that considers every pair of different numbers and runs a BFS on the grid to find the connected components and their sizes is enough to pass this subtask. Below is pseudocode for it.
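A minimal Python sketch of this brute force follows. It is not the editorial's original pseudocode: the function name `largest_component_bruteforce` is an assumption made for illustration, and the sketch also pairs each number with itself so that a grid holding a single number is handled.

```python
from collections import deque


def largest_component_bruteforce(grid):
    """For every pair of numbers, BFS over the cells holding either number."""
    n, m = len(grid), len(grid[0])
    values = sorted({v for row in grid for v in row})
    best = 0
    for i, a in enumerate(values):
        # values[i:] includes a itself, which covers one-number components.
        for b in values[i:]:
            allowed = (a, b)
            seen = [[False] * m for _ in range(n)]
            for r in range(n):
                for c in range(m):
                    if seen[r][c] or grid[r][c] not in allowed:
                        continue
                    # New component for this pair: flood it with BFS.
                    seen[r][c] = True
                    queue = deque([(r, c)])
                    size = 0
                    while queue:
                        x, y = queue.popleft()
                        size += 1
                        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                            if (0 <= nx < n and 0 <= ny < m
                                    and not seen[nx][ny]
                                    and grid[nx][ny] in allowed):
                                seen[nx][ny] = True
                                queue.append((nx, ny))
                    best = max(best, size)
    return best
```

Each pair of numbers costs one full pass over the grid, which is where the quadratic dependence on the number of distinct values comes from.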
The time complexity of the above approach is $O(n * m * {\text{Max(a[i][j])}}^{2})$.

Subtask 2, 3: N, M ≤ 2000
The solution approach for the last 2 subtasks is the same; only differences in implementation and constant factors can lead to passing subtask 2 only. The full solution builds on the brute force above. Instead of doing a complete BFS on the grid for every pair of numbers under consideration, we store the edges that join different numbers and run BFS/DSU on them. Below are 2 different approaches for it.

Author/Tester Solution: DSU + Sorting/HashMap
First, find all connected components of cells that hold the same number. With each component, store its size and its index. This can easily be done with a BFS/DFS. Now create an edge between every 2 components that are adjacent in the grid, i.e. there exists a cell in component $1$ which is adjacent to a cell in component $2$. Store the edges in the following format:

$$\text{{Number in component 1, Number in component 2, Index of component 1, Index of component 2}}$$

To avoid doing the same computation twice, store only the edges for which $\text{Number in component 1} < \text{Number in component 2}$. This halves the constant factor in your code. Now we traverse the edges grouped by the pair of numbers they connect and perform DSU to find the maximum connected component.

Below is the explanation of the sample test case in the problem. The left side image shows the different components formed after the first BFS is done. The right side image shows the result of performing DSU. Once the BFS is done, we see that cell $(2, 1)$ containing $3$ is adjacent to cell $(1, 1)$ containing $1$, so we merge them using DSU. Again, cell $(4, 1)$ containing $1$ is adjacent to cell $(4, 2)$ containing $3$, so we merge them using DSU.
The process continues as cell $(4, 3)$ containing $1$ is adjacent to $(4, 2)$ containing $3$ (which was already expanded before), and the component grows.

The only issue that needs care while implementing the above solution is resetting the DSU after every iteration. Doing it in a naive manner, i.e. resetting all the components to their original sizes and making each its own parent, will TLE even for subtask 2, as the complexity will be similar to subtask 1. Instead, we only restore the values whose parent or size changed while performing the DSU operations. These can be recorded while performing the merge operation. For more details, you can refer to the author's or tester's solution below.

The complexity of the above approach will be $O(n * m * \log(n * m))$, as you will traverse each edge once. The logarithmic factor is due to sorting or using maps to store the edges. If you use hashmaps to store the edges instead, the complexity will be $O(n * m)$, but with a large constant factor.

Editorialist Solution: Edge-based Graph traversal
UPDATE: This solution is slow. The author created a counter test against it. The time complexity will be $O(n^2*m^2)$, as pointed out in the comments below. Even the test case provided there is similar. Please refer to the author's solution above for the full problem. Will ask the Codechef team to add that case in the practice section too for help.

Instead of forming any components, performing DSU or doing BFS in the usual way, we make a simple observation about the brute force approach. If each edge connecting adjacent cells in the grid, whether the cells hold the same number or different numbers, is traversed at most twice while determining the sizes, the complexity will be $O(8 * n * m)$ overall. Below is the detailed description of the idea:

First, select an edge in the grid. Find the numbers (at most 2) which will form the component. We will try to find the full component which contains this edge and make sure this component is traversed only once.
For this, you start a normal BFS from each cell. You go to an adjacent cell only if it holds one of the 2 required numbers. While adding a node to the BFS queue, we also update the possible ways in which an edge could be traversed as part of the same component. For example, cell $(3, 2)$ containing $2$ was visited from cell $(2, 2)$ containing $1$. But it could also be reached from cell $(3, 3)$ containing $1$ in the future, and that would form the same component. So, before adding a cell to the BFS queue, we mark every one of its 4 directions that can be visited within the same component. The BFS for a component is performed only if the edge was never traversed before in any component.

The image below shows how the component is expanded in each step of the BFS when the starting edge is the one between $(1, 2)$ and $(1, 3)$.

The time complexity of the above approach is $O(n^2*m^2)$ and will pass subtask 1 only.

Once you are clear with the above idea, you can see the editorialist's implementation below for help.

Feel free to share your approach if it was somewhat different.

Time Complexity
$O(n * m)$

Space Complexity
$O(n * m)$

AUTHOR'S AND TESTER'S SOLUTIONS:
Author's solution can be found here.
Tester's solution can be found here.
Editorialist's solution can be found here. (Will TLE).
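As a rough illustration of the author/tester approach described above (same-number components, edges grouped by number pair, DSU with partial reset), here is a minimal Python sketch. It is not the author's or tester's code. The function name `largest_component_dsu` is an assumption, and so is the choice of union by size without path compression, which is made here so that only the two merged roots need restoring after each group.

```python
from collections import defaultdict, deque


def largest_component_dsu(grid):
    n, m = len(grid), len(grid[0])
    comp = [[-1] * m for _ in range(n)]
    size = []  # size[k] = number of cells in same-number component k

    # Step 1: label components of equal numbers with BFS.
    for r in range(n):
        for c in range(m):
            if comp[r][c] != -1:
                continue
            k = len(size)
            comp[r][c] = k
            queue = deque([(r, c)])
            count = 0
            while queue:
                x, y = queue.popleft()
                count += 1
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < n and 0 <= ny < m and comp[nx][ny] == -1
                            and grid[nx][ny] == grid[r][c]):
                        comp[nx][ny] = k
                        queue.append((nx, ny))
            size.append(count)

    # Step 2: group component-to-component edges by (smaller, larger) number.
    edges = defaultdict(list)
    for r in range(n):
        for c in range(m):
            for nx, ny in ((r + 1, c), (r, c + 1)):
                if nx < n and ny < m and grid[r][c] != grid[nx][ny]:
                    a, b = grid[r][c], grid[nx][ny]
                    edges[(min(a, b), max(a, b))].append((comp[r][c], comp[nx][ny]))

    parent = list(range(len(size)))
    cur = size[:]  # working sizes, reset after every group
    best = max(size)

    def find(x):
        # No path compression: union by size keeps chains short.
        while parent[x] != x:
            x = parent[x]
        return x

    # Step 3: one DSU pass per number pair, then undo only what changed.
    for pairs in edges.values():
        touched = []
        for u, v in pairs:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            if cur[ru] < cur[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            cur[ru] += cur[rv]
            touched.extend((ru, rv))
            best = max(best, cur[ru])
        for x in touched:
            parent[x] = x
            cur[x] = size[x]
    return best
```

The sketch uses a hash map keyed by the number pair in place of sorting the edges. Because every node that changes was a root at the moment of the merge, restoring it to "its own parent, original size" is enough to return the DSU to its initial state.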
asked 10 Jun, 13:17

I used the method outlined in the editorialist's solution and got AC: https://www.codechef.com/viewsolution/18766440 But then I thought of this test case:
There's a 1 connecting each of the rows of 1s, so the entire sea of 1s should be traversed for each edge that we start a BFS on, right? That would make the worst case O(N^2 * M^2)? Maybe I'm missing something here; let me know if so. Thanks!
answered 11 Jun, 16:39
I didn't do anything special to handle such a case, and when I tested my solution on a test case constructed like this, with N and M both on the order of 1000, it ran for more than two minutes and I eventually had to kill it. So I'm pretty sure my solution is N^2*M^2. Yet it passed with 100 points in the contest, which may mean the test cases were weak?
(11 Jun, 17:58)
Update: You are correct @akamoha. The test cases were weak. Thanks for a similar test case. Have updated the editorial with a warning too.
(12 Jun, 08:32)

If the question's level is Medium, why was it added to the practice easy section instead of the practice medium section?
answered 11 Jun, 18:03

@likecs: I think there should be a correction in the pseudo code given for the first subtask. I think it should be:
Correct me if I'm wrong, but I think this is correct.
answered 14 Jun, 13:51

The editorialist's solution was the best method. I used it in the competition and it took just 0.7 sec to pass the question with an AC. Link: https://www.codechef.com/viewsolution/18816600 Please accept if considered correct. :)
answered 11 Jun, 16:13

I have seen the editorialist's solution: https://s3.amazonaws.com/codechef_shared/download/Solutions/JUNE18/editorialist/TWOFL.cpp Let's take the test case 1 1 1 1 1 1 1 1 1. For this type of test case, the time complexity will go to N*N*M or even M^2*N^2, so I think we can't say the complexity of this solution is exactly N*M.
answered 12 Jun, 09:00
This is pointed out in the editorial. Please check again.
(12 Jun, 09:51)
ya, seen that, thank you!!
(12 Jun, 10:16)

I have a method to share, but the problem with it was that it only fetched me 40 points; it did not pass the last test cases, although I tried my best to optimize my code as much as possible.

Method: During input I counted the frequency of every element in the matrix. Then I created 2 maps, xyz and ank. The xyz map stores the cumulative count for each pair combination, and ank marks the visited cells for a particular pair. For example, take the pair (1,2) and let the frequency of 1 be 10 and the frequency of 2 be 20. The first time a path of (1,2) occurs, xyz stores its count. Suppose the first (1,2) path contains 15 elements. The next time a (1,2) path is possible, it checks frequency(1+2) - xyz(1,2) > tot. Here, tot is the maximum count variable, which keeps getting updated throughout the execution. Only if the condition is satisfied does it do the DFS traversal. In @akamoha's test case above, the frequency of 1 is 50 and that of every other element is 1. So the count for (1,2) is 51, and it will not do a DFS traversal for any other pair, as the frequency condition won't be satisfied.

It's a type of brute force, but I would highly appreciate it if anyone could tell me where to optimize my code. My method = https://www.codechef.com/viewsolution/18869954
answered 12 Jun, 20:18
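The pruning test in this answer can be sketched in a few lines of Python. This is an illustration only, assuming the condition reads "cells of the pair not yet counted must exceed the best size so far". The names `worth_exploring`, `freq`, `counted` and `best` are hypothetical stand-ins for the frequency table, the xyz map and tot.

```python
def worth_exploring(freq, a, b, counted, best):
    """Return True if an unexplored (a, b) component could still beat `best`.

    freq    -- number of cells holding each value
    counted -- cells already covered by earlier (a, b) traversals (the xyz map)
    best    -- largest component size found so far (tot)
    """
    remaining = freq[a] + freq[b] - counted.get((a, b), 0)
    return remaining > best


# Numbers from the example in the answer: 10 ones, 20 twos, 15 cells already counted.
freq = {1: 10, 2: 20}
counted = {(1, 2): 15}
can_improve = worth_exploring(freq, 1, 2, counted, best=14)
```

The check only bounds how large a new component could be; it does not lower the worst-case cost when many pairs each pass the test.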

I did not use BFS/DFS, just a DSU over the set of all the cells in the field, along with subset sizes. First, join all pairs of neighbors of the same type and save a copy of the resulting data. Then do the following: break the set of all the (unordered) pairs of different neighbor types into subsets in such a way that no two pairs in a subset have a common type. For each subset, maintain a hash set of all the types in it and a list of index pairs of neighbors. Maintain a hash map mapping a pair of types to a subset. For each pair of neighbors of different types, add their indices to the corresponding list. The first time a pair of types is encountered, find a subset that contains neither of the two types. If no such subset exists, create a new one. For each of the resulting lists, perform the union of all the pairs in it, restoring the saved state before processing each list. Maintain the maximal size while performing unions. Here is a Python (PyPy) solution.

UPDATE. This solution got AC, but it will be slow when there is a type with a large number of different neighbor types. Instead, we can simply maintain a hash map of separate lists for each pair of neighbor types, and at each DSU iteration restore only those elements that have been changed, as described in the editorial.
answered 13 Jun, 09:27

I used just DFS and it fetched me 80 points; I was getting TLE in one of the test cases of the 2nd subtask. In the DFS, I fixed one number and counted the connected components of all other, greater numbers together with the fixed number connected to them (maintaining each one's count in a map). That way I was able to count each pair's connected component in just one DFS run. I also used many optimisations to prevent TLEs, but I was still getting one TLE. Finally, using some conditions (which I thought could have been the reason for the TLE), I was able to get 100 points, though I am not satisfied with my approach. I still want to know why I was getting TLE in the 2nd subtask's 3rd test case, if anyone could help :) Here is my solution.
answered 13 Jun, 12:10

I converted the editorialist's solution to Java, but in Java the same code gives TLE. Here is the link: https://www.codechef.com/viewsolution/18880190