Hashing Question List

InterviewBit Imp Problem Set  GeeksForGeek Problem Set   and https://leetcode.com/tag/hash-table/
*** Java HashMap, HashSet. when to use what and how to use them.
HashSet contains only values whereas HashMap contains entry(key and value).

Most of hashing questions are formed in association with String and Array.

  1. HashMap Basic Operations  Iterator
    Map<Integer, Integer> dataSet = new HashMap<Integer, Integer>();
    dataSet.get(), dataSet.put(), dataSet.containsKey(), dataSet.size()
    for (Map.Entry<Integer,Integer> entry: dataSet.entrySet())
    entry.getKey(), entry.getValue()


  2. HashSet
    HashSet<Integer> set = new HashSet<Integer>();
    set.contains(), set.add(), set.remove()


  3. **** Design a data structure that supports insert, delete, search and getRandom in constant time –  Uber.  Blog     GeeksForGeeks
    Algo:  We can use hashing to support first 3 operations in Θ(1) time. How to do the 4th operation? The idea is to use a resizable array (ArrayList in Java) together with hashing. Resizable arrays support insert in Θ(1) amortized time complexity. To implement getRandom(), we can simply pick a random number from 0 to size-1 (size is number of current elements) and return the element at that index. The hash map stores array values as keys and array indexes as values.
    – Whenever a value is added, its added to the array and it’s  index is added to hash map.
    – Whenever a value is removed we remove it from the array and then delete it from hashmap. Since we want O(1) we can’t shift the array element. So we get the last array element and add it to the index from where the value was removed and update this last element that was shifted index in hashmap.
    – Random search is simple by generating a index within a range of array size.
    – Search is simple hashmap lookup


  4. Find whether an array is subset of another array OR How to check if two given sets are disjoint? –  Given two arrays: arr1[0..m-1] and arr2[0..n-1]. Find whether arr2[] is a subset of arr1[] or not. Both the arrays are not in sorted order. It may be assumed that elements in both array are distinct.Algo
    1. Brute Force O(n^2)
    2. Sort arr1. Then do a binary search for elements of arr2[] in arr1[].
    Complexity –  O(mLogm + nLogm)
    4. Use Hashing. O(n) time and O(n) space complexity.
    5. Sort both the array. Then get the intersection using two pointer method. O(mLogm + nLogn)


  5.  Union and Intersection of two Linked Lists  – Given two Linked Lists, create union and intersection lists that contain union and intersection of the elements present in the given lists. Order of elements in output lists doesn’t matter.
    Algo –
    1. Brute Force. O(mn)
    2. Use Sorting. Since these are linked list, use Merge Sort. Linearly scan both sorted lists to get the union and intersection. This step takes O(m + n) time.
    Overall, O(mLogm + nLogn)
    3. Hashing. Put all items of first linked list into HashMap. Now start looking up for each element of second list in this table.
    If element is found, move that element to the intersection list.
    If element not found, move it to union list.
    When all items are over in list, add all element of table to hash table to union list.


  6.  **  Link Given an array A[] and a number x, check for pair in A[] with sum as x.  given an array A[] of n numbers and another number x, determines whether or not there exist two elements in S whose sum is exactly x.
    Algo:
    1. Sorting  – Sort the list. Maintain two pointer, one at start and other at end. Keep adding number at both pointer and see if the sum is equal to x, if not then move the pointer accordingly. This is a standard logic used in many problem.
    Time Complexity: Depends on what sorting algorithm we use. If we use Merge Sort or Heap Sort then (-)(nlogn) in worst case. If we use Quick Sort then O(n^2) in worst case.
    Auxiliary Space : Again, depends on sorting algorithm. For example auxiliary space is O(n) for merge sort and O(1) for Heap Sort.
    2. Hashing – Save all the number in HashMap. Now loop through the array. Search for the number, x-a[i] in table. O(n) – time and space.


  7. ** GFG Given an unsorted array that may contain duplicates. Also given a number k which is smaller than size of array. Write a function that returns true if array contains duplicates within k distance.
    Algo
    1. Brute force. Complexity – O(nk)
    2. The idea is to keep adding element to the hash table, but keep only those elements that are within k distance from the current i.
    If  i>=k,  we remove (i-k)th element of the array from hashset
    This ensures that we are always searching within a distance of k.  This logic can be applied to other problems as well.


  8. Find Itinerary from a given list of tickets. 
    Algo:
    1. This can be using Topological Sort using graph in O(n).
    2. Use Hashing.  The idea is to first find the starting point. A starting point would never be on ‘to’ side of a ticket. Once we find the starting point, we can simply traverse the given map to print itinerary in order.
    We make two Hashmap – one to ->from and other, from->to.
    (We can also solve this based on the logic that the destination will appear only once and that too on the from side as it’s mentioned – ‘there is one ticket from every city except final destination’).
    Create two HashMap one for `to` and other `for` from. Find the starting point and then keep connecting that point.


  9. Given a dictionary that contains mapping of employee and his manager as a number of (employee, manager) pairs like below. Write a function to get no of employees under each manager in the hierarchy not just their direct reports. It may be assumed that an employee directly reports to only one manager. In the above dictionary the root node/ceo is listed as reporting to himself.  GFG 

    Algo: 
    Create a map which shows manager to emp mapping. Then do a lookup from the given empToManager map to get the count. Now we will have to do a recursive call to find all the emp under a given manager, “C” –> “A”, “B”, for this we will have to find the emp under A and B as well. “F” –> “C”, “E”. When doing lookup for F we will again have to do lookup for C and then A and B. So we will have these value once, i.e. we will save the count of emp under A. So this doesn’t have to be done recursively. This is a small optimization that is imp in the question.


  10. * Given an array of integers and a number k, write a function that returns true if given array can be divided into pairs such that sum of every pair is divisible by k.
    Link  Complexity: O(n)
    Algo:
    When we look at the pair we notice, that if the sum of remainder of the each pair is k.  So we maintain a hash table and maintain the frequency of remainder of each array element when divided by k.Find remainder of current element.. Now we need to check 3 cases –
    1) If remainder divides k into two halves, then there must be even occurrences of         it as it forms pair with itself only.
    2) If remainder is 0, then then there must be even occurrences of the remainder in the hashmap.
    3) Else, number of occurrences of current remainder  must be equal to number of         occurrences of “k -current remainder” (as we know the sum of remainder in  apir is equal to k).


  11. *****  Given an array of distinct integers, find if there are two pairs (a, b) and (c, d) such that a+b = c+d, and a, b, c and d are distinct elements. If there are multiple answers, then print any of them. GFG. Very important question. So directly copying the pseudo code here. InterviewBit
    Algo: Time Complexity: O(n2)
    We add the two numbers. If the sum is present in that HashMap than one sum other pair exists else it doesn’t.

    Loop i = 0 to n-1 :
        Loop j = i + 1 to n-1 :
            calculate sum
            If in hash table any index already exist
                Then print (i, j) and previous pair 
                from hash table  
            Else update hash table
        EndLoop;
    EndLoop;

  12.  Given an array of pairs, find all symmetric pairs in it  – Two pairs (a, b) and (c, d) are said to be symmetric if c is equal to b and a is equal to d. For example (10, 20) and (20, 10) are symmetric. Given an array of pairs find all symmetric pairs in it.
    Algo:
    1. Brute force
    2. Use Sorting. Sort the given pair by the first key in the pair. Then do a binary lookup for the second key in the pair.
    3.  Hashing.  First element of pair is used as key and second element is used as value. The idea is traverse all pairs one by one. For every pair, check if its second element is in hash table. If yes, then compare the first element with value of matched entry of hash table. If the value and the first element match, then we found symmetric pairs. Else, insert first element as key and second element as value.
    Note: When questions specifies that element are given in pair, either consider the input as a 2D array or two 1D array with with pair element i position mapped to i position in second array.


  13. **** Find the largest subarray with 0 sum  – Given an array of integers, find length of the largest subarray with sum equals to 0.
    Algo: Use Hashing to solve this problem in O(n) time. The idea is to iterate through the array and for every element arr[i], calculate sum of elements form 0 to i.
    If the current sum is seen before it means that there was an array in between that summed up to 0 and that’s what we are interested in. So we find the last index of the sum for HashMap. The length will be (last_sum_index – current_index).
    Edge Case:
    – if the number itself is zero then it can also be classified as a substring if there’s no other subarray.
    – if the total sum is 0, then that the max length will be current_index + 1


  14. *** Given an array, print all subarrays in the array which has sum 0GFG
    Algo: This is almost same as the above question. Here instead of focusing on the largest subarray we want to get all those subarray. While finding the largest subarray we just stored the first index of a sum that was repeated as at any point the difference of that index with it current_index will give me current subarray length. Here we want all the subarray. So we will store a list of all the index at which the sum was repeated. At the end we can get all pair combination for each HashMap index.


  15. **** Largest subarray with equal number of 0s and 1s  – 
    Algo: Consider all 0 values as -1. The problem now reduces to find out the maximum length subarray with sum = 0 as above.


  16. *** Given an array of size n and an integer k, return the of count of distinct numbers in all windows of size k. 
    Algo : The basic idea is to use a sliding window concept. For the first window we maintain the frequency of element. When we move to the next window, we just decrease the frequency of the last element of the previous window(if it’s frequency was 1 then the element is removed from HashMap). Maintaining the window is helpful to not re-calculate the entire frequency of each element in window. To get the count of unique element we can use the size() method on map (as we remove element with 0 frequency, only the element in current window are maintained in Map).


  17. *** Length of the largest subarray with contiguous elements – Given an array of distinct integers, find length of the longest subarray which contains numbers that can be arranged in a continuous sequence.
    Algo: It is given that all elements are distinct. If all elements are distinct, then a subarray has contiguous elements if and only if the difference between maximum and minimum elements in subarray is equal to the difference between last and first indexes of subarray. So the idea is to keep track of minimum and maximum element in every subarray.
    Check the implementation. We need to consider sub-array starting at each index. So for an index i, keep adding the index element to the pair and keep checking the max and min in the current window. If the diff of the max – min in that window equal i -j then there’s a subarray in that window.
    So for each ith position we find the length of the maximum sub-array. All the consecutive numbers occur in a series(though not ordered).
    Time Complexity: O(n^2)


  18. *** Length of the largest subarray with contiguous elements (with duplicate)
    Algo: Same as the above problem but with duplicate. If duplicate numbers occur in consecutive position then the condition can’t be satisfied. So we use hash to maintain this check if the same element  is already present in that sub-array.
    Here at each outer loop we re-initialize the hash set, with the new i value. If any of the element is repeated then the sub-array can’t be formed. So if we detect any repeated element in the inner loop we can proceed to a new sub-array(break out of inner loop).


  19. *** Longest Consecutive Subsequence  – Given an array of integers, find the length of the longest sub-sequence such that elements in the subsequence are consecutive integers, the consecutive numbers can be in any order.
    Algo:
       1. Sorting and loop the array and ….Complexity = O(nLogn)
       2. Hashing. Complexity: O(n). We basically need to check that for any number it’s consecutive numbers should be present. This is why hash is useful as it’s O(1) for lookup. We try to find the first number in that sub-sequence so that we can check if the rest of the numbers are present, using a simple loop and using using hash.
    If we don’t start with the first number in the sequence then it will have to keep a track of the number in the sequence that we have already seen.
    To start we add all the numbers to hash. For each number we check if it’s the first number in the sequence( how ? check if a[i] -1 is present in hash(numbers are consecutive)). If yes we start with the number and keep checking the how many consecutive numbers are present. If it’s not the first number we don’t check anything.
    Note: It’s different from Q17 and Q18, in the sense that all the continuos element are mixed up in a fixed size window and we need to find that window length.
    In this question the numbers are continuos and scattered in that array and we need to find the max length of such sequence.


  20. Given a singly linked list check if has a loop or not using Hash.
    Algo: Start traversing that List and for each node store the address in the hash table. If the address wasn’t present then store it in table. If the address was present then the element has been already visited.


  21. Given an unsorted array with repetitions, the task is to group multiple occurrence of individual elements. The grouping should happen in a way that the order of first occurrences of all elements is maintained. GFG
    Algo:
    1. O(n^2). Start traversing the array. Maintain an array/ds which keeps track if the element has been visited of not. For each occurrence of the number run an inner loop and print the other same number. These will be printed in order. Now start with the next element if it’s not already printed.
    2. We need to print the element that occurred first in the array  in the output first.
    We basically maintain the frequency of each element in a Hash Table and in the end print the element that number of time. If we implement this using a HashMap
    then there’s no guarantee of the order.  So we use LinkedHashMap, it differs from HashMap in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is normally the order in which keys were inserted into the map (insertion-order). Note that insertion order is not affected if a key is re-inserted into the map. (A key k is reinserted into a map m if m.put(k, v) is invoked when m.containsKey(k) would return true immediately prior to the invocation)


  22. **** Find subarray with given sum – Given an unsorted array of non-negative integers, find a continous subarray which adds to a given number.Algo – Sliding Window concept. When the sum of the window exceeds the required sum we start removing the elements from the starting of the window till the sum is either equal or less then the required sum and the window start index is less then the current index (While removing this we keep track of the start of the window and keep updating it).
    Time complexity is O(n). We can prove it by counting the number of operations performed on every element of arr[] in worst case. There are at most 2 operations performed on every element: (a) the element is added to the curr_sum (b) the element is subtracted from curr_sum. So the upper bound on number of operations is 2n which is O(n).
    Variation 1: Handle the above case if negative numbers are involved: ???


  23. Given a string, find the first repeated character in it. We need to find the character that occurs more than once and whose index of first occurrence is smallest. GFG
    Algo: Create a simple HashMap and do lookup.


  24. [Pending] Write all solutions for a^3 + b^3 = c^3 + d^3. SO
    Algo:


  25. ** Write an efficient function that checks whether any permutation of an input string is a palindrome –
    Algo: For a string to palindrome, if it’s length is even then each character should occur even number of times, in case it’s odd, only one char can occur once, rest have to occur even number of time.


  26. Word Cloud Algorithm – You want to build a word cloud, an infographic where the size of a word corresponds to how often it appears in the body of text. Question
    Algo: The basic idea is to parse all the words form the sentence. Then convert all the word to lowercase and insert in a HashMap with word as key and value as frequency.  We could use lexing and parsing library to implement a # real world’s problem’s requirements.
    Note:
     This is just for reference. Here’s how Wordle( the best word cloud generator ) actually works:Count the words, throw away boring words, and sort by the count, descending. Keep the top N words for some N. Assign each word a font size proportional to its count. Generate a Java2D Shape for each word, using the Java2D API.Each word “wants” to be somewhere, such as “at some random x position in the vertical center”. In decreasing order of frequency, do this for each word:

    place the word where it wants to be
    while it intersects any of the previously placed words
        move it one step along an ever-increasing spiral
    

    That’s it. The hard part is in doing the intersection-testing efficiently, for which I use last-hit caching, hierarchical bounding boxes, and a quadtree spatial index


  27. Find Duplicates file – You left your computer unlocked and your friend decided to troll you by copying a lot of your files to random spots all over your file system.
    Find and delete those files.
    Algo:  We can calculate the hash of each file to verify that two files are same. Calculating hash will make your program run slow. Its better you also check the file size. All the duplicate file should have same file size. If they share same file size apply hash check. It’ll make your program perform fast.There can be more steps.
    Check if file size is equal –
    – If step 1 passes, check if first and last range of bytes (say 100 bytes) are equal
    – If step 2 passes, check file type,
    – If step 3 passes, check the hash at last
    The more criteria you add the more faster it’ll perform and you can avoid the last resort (hash) this way.
    Related Info:
    Checksum is the way to go here(in real world application)  What checksum algorithm should I use?   – know when to use CRC/md5/sha family
    This checksum logic has multiple application. The most common when downloading file of the internet or packet transmission over the net. Or like checking if a file has been updated and only then upload it.
    We will use these java method toFile and readAllByte


  28. *** Two words are anagrams if one of them has exactly same characters as that of the another word. Find the anagrams for a given word. SO
    Algo: The general approach to check if two strings are anagram is to first sort the string or to count frequency of each character.
    About the above question I am not clear with the question as to what is wanted. I assume it means we need to generate to identify anagram from a list of words.
    1. Sorting and comparing is an option.
    2. The second approach uses ascii equivalent to generate character to generate the key instead of sorting the string. This value is generated is then inserted into the hash table. If we use the ascii equivalent then there’s a possibility of collision so we can instead use prime number in it’s place like below (26 prime number for 26 characters) –
    private int[] PRIMES = new int[] { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113 };


  29. *** Colourful Number  – Code
    Algo: Generate all the combination of the number given(see  this part in code). Then calculate the product of all digit of each combination. If the product exists in hash then it’s not a colourful number.


  30. 2 Sum Problem – Given an array A[] of n numbers and another number x, determines whether or not there exist two elements in S whose sum is exactly x.
    Algo:
    1. Sort the number and use two pointers
    2. Then traverse the given array and lookup for x – A[i] in the HashMap. If it does not exists put A[i] in Map.


  31. *****  4 Sum Problem – Given an array of integers, find all combination of four elements in the array whose sum is equal to a given value X. GFG Other Link
    Algo: We will use the same logic as for  problem 11 on this page(a + b = c + d).
    The idea is to consider every pair of elements in the array one by one and insert it into a map. For each pair of element (i, j), we calculate the remaining sum. If remaining sum exists in the map and elements involved in previous occurrence don’t overlap with the current pair i.e. ((i, j, i, y) or (i, j, x, i) or (i, j, j, y) or (i, j, x, j)), we print the Quadruplet and return.
    This solution is not suggested in GFG but it present in the comment and in the other provided above.


  32. Given an unsorted array and a number n, find if there exists a pair of elements in the array whose difference is n. GFG
    Algo:
    1. Brute Force
    2. Sort the array and use binary search.  Once the array is sorted, traverse the array from left to right, and for each element arr[i], binary search for arr[i] + n in arr[i+1..n-1]. If the element is found, return the pair.
    3. Use Hashing.


  33. Given two integers representing the numerator and denominator of a fraction, return the fraction in string format. If the fractional part is repeating, enclose the repeating part in parentheses. Code
    Example :
    Given numerator = 1, denominator = 2, return “0.5”
    Given numerator = 2, denominator = 1, return “2”
    Given numerator = 2, denominator = 3, return “0.(6)”
    Algo:
    We need to follow the basic remainder method.


  34. Given N point on a 2D plane as pair of (x, y) co-ordinates, we need to find maximum number of point which lie on the same line. GFG. Code.
    Algo: The idea is to calculate the slope. For each point p, calculate its slope with other points and use a map to record how many points have same slope, by which we can find out how many points are on same line with p as their one point. For each point keep doing the same thing and update the maximum number of point count found so far.
    We need to take care of the edge cases like if the points overlap then they won’t be counted as two points and if the points are vertical. We need to use double for precision for the slope value.


  35. *** Find all starting indices of substring(s) in S that is a concatenation of each word in L exactly once and without any intervening characters. GFG
    Algo: The idea is to use sliding window here.
    We know the max length of the window by combining all the words in the given list. As these word can be in any order the way we will calculate hash of this window should be simple – we use ascii value to calculate the hash of the combination of the given words. This way the hash is not changed even when the order changes.
    Now we pick first window and match it’s hash with our hash. If not found, we move one step ahead but we don’t need to calculate the entire hash of the window again. Just remove the first char of last window and add the char of the new char included in win.


  36. [Pending] Find the smallest window in a string containing all characters of another string.

  37. [SKIP] Palindrome Substring Queries

  38. Given an unsorted array and a number n, find if there exists a pair of elements in the array whose difference is n.  GFG.
    Algo:
    1. Hashing is an obvious solution – O(n) – time and space.
    2. Another way is to use sorting and then search with two pointer. Overall complexity will be O(nLogn)


  39. We have an array of integers and we have to find two such elements in the array such that sum of these two elements is equal to the sum of rest of elements in array. GFG.
    Algo: Find sum of all elements. Start looping through the array. for each index i, we do sum – a[i]. Check if this value exists in hash. If yes, pair found, else add the a[i] to hash.


  40. Given an unsorted array and a number n, find if there exists a pair of elements in the array whose difference is n. GFG.
    Algo: Create an empty hash table HT. Traverse the array, use array elements as hash keys and enter them in HT. Traverse the array again look for value n + arr[i] in HT.


Advertisements

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s