- Suppose I’m given a large dictionary in flat file with 200 million words and my function needs to check the existence of any given word in the dictionary, what’s the fastest way to do it? You can’t store the dictionary in the memory because you only have 1GB of memory. You can store it in the database, however querying it would still be very very slow without any optimization. You can’t index the full words because you don’t have enough resources. StackOverFlow
1. Binary Search
2. Bloom Filter !!
4. Could hash if the dataset was small, not in this case.
- “Given a sequential file containing 4,300,000,000 32-bit integers, how can you find one that appears at least twice? SO
1. Bit Array – something likes redis bitset
2. Pigeonhole principle
- Find number range intersection. What is the best way to find out whether two number ranges intersect? SO Pseudo Code
My number range is 3023-7430, now I want to test which of the following number ranges intersect with it: <3000, 3000-6000, 6000-8000, 8000-10000, >10000. The answer should be 3000-6000 and 6000-8000
- Write a method to get the greatest common divisor(GCD) of two given integer.