Trie – Prefix Tree

 

  1. Trie Youtube Video and Java Code(add, search, delete) Gist(add, search, frequency).

  2. I’m making a search engine called MillionGazillion from Interview Cake. 
    Algo:
    1. We can store the data using a hash table mapping. This though takes a lot of space.  To prevent that we use a Trie(though it has it’s own memory issues but hey it’s better here).
    2. We can also use Bloom Filter here to check if a url has already been indexed.


  3. Find last unique URL from long list of URLs in single traversal. 
    Algo: This can be done in a number of ways.
    1. The implementation in GFG is with trie with doubly linked list.
    2. We can implement the same with hash as well. Actually since we are not using the prefix properties of URLs I would go for Hashing.


  4. Find shortest unique prefix for every word in a given list
    Algo: This is a good candidate for Trie.


  5. This is not a question actually. The links shows how to use a different DS within HashMap in Java. This link will be useful to make the changes in the current HashMap. HashMap: One Key, multiple Values

  6. Find Word Positions in Text –  Given a text file and a word, find the positions that the word occurs in the file. We’ll be asked to find the positions of many words in the same file.
    Algo: The problem is like creating a inverted index for a search
    1. We can use a HashMap here.
    2. Hash table will use more memory and provides for only exact match. Trie will take less memory and provides for partial search as well.


  7. Word formation using concatenation of two dictionary words – Given a dictionary find out if given word can be made by two words in the dictionary.
    Algo: We will use trie here. We first search for the first half of the word. If we find the partial match we start from the root again for the remaining part of the word.


  8. Given a dictionary and a character array, print all valid words that are possible using characters from the array.  GFG
    Algo: The idea is to use Trie data structure to store dictionary, then search words in Trie using characters of given array.

    1. Create an empty Trie and insert all words of given dictionary into the Trie.
    2. After that, we have pick only those characters in ‘Arr[]’ which are a child of the root of Trie.
    3. To quickly find whether a character is present in array or not, we create a hash of character arrays.

  9. [Pending]***** How to find list of possible words from a letter matrix [Boggle Solver] Geeksforgeek Video

  10. Minimum XOR Value Pair  – Code
    Algo:
    1. Sort the numbers.
    2. Calculate bit-wise xor for every adjacent pair in the sorted array.
    3. Output the minimum of these xor you calculated.
    Proof: If answer is a xor b then there does not exist c such that a<=c<=b.


  11. Longest prefix matching – Given a dictionary of words and an input string, find the longest prefix of the string which is also a word in dictionary. GFG – Code
    Algo: We build a Trie of all dictionary words. Once the Trie is built, traverse through it using characters of input string. If prefix matches a dictionary word, store current length and look for a longer match. Finally, return the longest match.


  12. *** Longest Common Prefix (with Trie)Code  GFG
    Algo: We are using a trie here. Index all the words in the Trie. The first node where the nodes diverge will be the common prefix.
    Start with the rootTrieNode. Check if has only on key in the HashMap(if implement Trie using HashMap) and that theTrieNode is not marked as a endOfWord. If not,  then we need to goto the next TrieNode. Keep track of the character encountered during the traversal.
    Time Complexity : Inserting all the words in the trie takes O(MN) time and performing a walk on the trie takes O(M) time, where N = Number of strings M = Length of the largest string string
    Auxiliary Space: To store all the strings we need to allocate O(26*M*N) ~ O(MN) space for the Trie.
    Another Approach with same complexity. No extra space. We do a character by character matching here. Find the smallest string as the common prefix can’t be larger then that. Then start checking each character of every string. If all the string at index i has the same char, then proceed to the next char else break.


  13. Find Duplicate rows in a binary matrix –
    Algo:
    1. Insert each row in the trie and keep checking if two row end at the same point.
    2. Convert the binary row into decimal and insert it into a Map and check if it already exists.


  14. Print all words matching a pattern in CamelCase Notation Dictionary –
    Algo: So the idea here is to store all the Capital Characters Only. Then how do we retrieve the original word ? We store the original word in the last leaf node!!. Now this tweaking of the original Trie structure can be really helpful. If we didn’t use that we would have had to maintain a new Hash table.


  15. ***** Implement a phone directory using Trie /  Autocomplete   

  16. [Pending] [ Suffix Trie] Count of distinct substrings of a string – this can be solved in linear time using a suffix tree. We are not covering suffix tree as of now. But it’s good to know such an algo exists.

  17. Given a huge set of strings with duplicate strings present, find the maximum occurring word in it. If two words have same count, return any one of them.
    Same as this
    Algo:
    1. The idea is to use Trie to solve this problem. We start by inserting each key into the trie and store its count so far in the leaf nodes. While inserting we maintain a count of the max word inserted.
    2. We can convert the string into a unique hash(using there ASCII and position). Now we insert these number into a max heap. In the node we also store the string. The max-heap will maintain the most frequent word on the top. This way we can keep track of the next frequent word one the first one is popped.
    3. Hash all words one by one in a hash table. If a word is already present, then increment its count. Finally, traverse through the hash table and return the k words with maximum counts.


  18. Skip – Palindrome pair in an array of words (or strings)

  19. Skip – Lexicographic sorting of given set of keys in Trie

  20. Given a sequence of words, print all anagrams together.
    For example, if given word sequence is –
    [“car”, “ape”, “meal”, “pea”, “male”, “arc”, “lame”, “dog”]
    Then output of the program should be –
    (car, arc) , (ape, pea), (lame, meal, male) dog
    Order of groups and order of group members between themselves could be different but group members must be same.
    Link
    Algo:
    We will sort each word before inserting into the Trie. So all the anagrams will end at the same last TrieNode. We store these last TrieNode in the HashMap.
    Either the last TrieNode or the Map can contain the original words. 
    The time complexity of this algorithm is O(m.nlogn) where m is total number of words in given sequence and n is the average length of each word in given sequence.


Advertisements

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s