- I’m making a search engine called MillionGazillion from Interview Cake.
First we can use a hash table. This though takes a lot of space. To prevent that we use a Trie(though it has it’s own memory issues but hey it’s better here). One way would be to use Bloom Filter.
- Find last unique URL from long list of URLs in single traversal. – trie with doubly linked list
- Find shortest unique prefix for every word in a given list –
- Need to maintain the frequency of each node in the Trie. This link will be useful to make the changes in the current HashMap.
HashMap: One Key, multiple Values
- Find Word Positions in Text – Given a text file and a word, find the positions that the word occurs in the file. We’ll be asked to find the positions of many words in the same file.
Either we can use a Hash Table or Trie. Hash table will use more memory and provides for only exact match.
Trie will take less memory and provides for partial search as well.
- Word formation using concatenation of two dictionary words – Given a dictionary find out if given word can be made by two words in the dictionary.
- Given a dictionary and a character array, print all valid words that are possible using characters from the array. –
The idea is to use Trie data structure to store dictionary, then search words in Trie using characters of given array.
- Create an empty Trie and insert all words of given dictionary into the Trie.
- After that, we have pick only those characters in ‘Arr’ which are a child of the root of Trie.
- To quickly find whether a character is present in array or not, we create a hash of character arrays.
- How to find list of possible words from a letter matrix [Boggle Solver] Geeksforgeek Video
- Minimum XOR Value Pair -1. Sort the numbers.
2. Calculate bit-wise xor for every adjacent pair in the sorted array.
3. Output the minimum of these xor you calculated.
If answer is a xor b then there does not exist c such that a<=c<=b.
- Longest prefix matching – A Trie based solution in Java – Code
- Longest Common Prefix – Gist using Trie
- Find Duplicate rows in a binary matrix –
We can use Trie or solve it otherwise also. Make a binary trie. Or convert the binary row into decimal and insert it into a Map/Trie and check if it already exists.
- Print all words matching a pattern in CamelCase Notation Dictionary –
So the idea here is to store all the Capital Characters Only. Then how do we retrieve the original word ? We store the original word in the last leaf node!! (define a new variable along with end of word). Now this tweaking of the original Trie structure can be really helpful. If we didn’t use that we would have had to maintain a new Hash table.
- Implement a phone directory using Trie / Autocomplete
- Pending[Min/Max Heap] Find the k most frequent words from a file. The Trie part is simple.
- Pending[ Suffix Trie] Count of distinct substrings of a string – this can be solved in linear time using a suffix tree. We are not covering suffix tree as of now. But it’s good to know such an algo exists.
- Pending[TREE] – Given a huge set of strings with duplicate strings present, find the maximum occurring word in it. If two words have same count, return any one of them. – The idea is to use Trie (Prefix Tree) to solve this problem. We start by inserting each key into the trie and store its count so far (along with the key itself) in the leaf nodes. After all nodes are inserted into the trie, we perform its pre-order traversal (depth first search) and find maximum frequency word by comparing the count present at leaf nodes.
Note that we can also use a map to solve this problem.
- Skip – Palindrome pair in an array of words (or strings)
- Skip – Lexicographic sorting of given set of keys in Trie – Another link
- Find shortest unique prefix for every word in a given list. GFG
- Given a sequence of words, print all anagrams together, For example, if given word sequence is [“car”, “ape”, “meal”, “pea”, “male”, “arc”, “lame”, “dog”] then output of the program should be –
(car, arc) , (ape, pea), (lame, meal, male) dog
Order of groups and order of group members between themselves could be different but group members must be same.
First we sort it according to characters and insert into the trie. In the end node we maintain we keep a track of the all indexes in the main array. The time complexity of this algorithm is O(m.nlogn) where m is total number of words in given sequence and n is the average length of each word in given sequence.