How to find k most frequent words in a file?

what is the most optimum way to find k most occurring words in the file.

hello arpit , considering the input to be a string of length L which consists of space separated words . so basically there are 2-3 ways which i am able to think of that can be used to do so.

  1. Trie : you can make a dynamic trie to store each word along with a counter on them , simply add them and increase the counter. O(L) algorithm , can be implemented carefully with o(L) space.
  2. inbuilt hash (unordered_maps) : you can hash every word and store them in a key,value pair map . well, this will take O(kL) where k is actually a factor sometimes considered to nearly logn . O(L) memory.
  3. explicitly hash : well this is same as above but you can select your own hash function depending upon the input and again to hash them you need a storage , which can be either maps (again logn factor) or trie .

These ways are bounded by my knowledge , would be more than happy to know if there exists a better one .

1 Like

@arunnsit

Can you provide links to some tutorials on trie.

Below down at the page Trie solution