Database Sharding

What’s the difference between sharding DB tables and partitioning them?

I take sharding to mean partitioning a table across multiple machines (across multiple database instances in a distributed database system, each with its own replication setup), whereas partitioning may just refer to splitting up a table on the same machine. So a table that is sharded has been partitioned, but a table that has been partitioned has not necessarily been sharded.

Sharding is a method of distributing data across multiple servers. MongoDB achieves horizontal scalability through sharding.

Sharding is the equivalent of “horizontal partitioning”. When you shard a database, you create replicas of the schema and then divide which data is stored in each shard based on a shard key. For example, I might shard my customer database using CustomerId as a shard key – I’d store the range 0-10000 in one shard and 10001-20000 in a different shard. When choosing a shard key, the DBA will typically look at data-access patterns and space issues to ensure that load and space are distributed evenly across shards.
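For illustration, here is a minimal sketch of that range-based routing; the shard names and the shard_for_customer helper are hypothetical, and the ranges are the ones from the example above.

```python
# Minimal sketch of range-based shard routing on a CustomerId shard key.
# Shard names and the helper are hypothetical; ranges follow the example above.
SHARD_RANGES = [
    (0, 10_000, "shard-0"),       # CustomerId 0-10000
    (10_001, 20_000, "shard-1"),  # CustomerId 10001-20000
]

def shard_for_customer(customer_id: int) -> str:
    """Return the name of the shard that owns this CustomerId."""
    for low, high, shard in SHARD_RANGES:
        if low <= customer_id <= high:
            return shard
    raise ValueError(f"no shard configured for CustomerId {customer_id}")

print(shard_for_customer(42))      # shard-0
print(shard_for_customer(15_000))  # shard-1
```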

A good article from InterviewBit describes a case study of sharding. It also explains how to distribute the load evenly across multiple shards with fault tolerance.

Video explaining Sharding in MongoDB

  • Q: What is the amount of data that we need to store? 
    A: Let’s assume a few hundred TB.
  • Q: Will the data keep growing over time? If yes, then at what rate? 
    A: Yes. At the rate of 1TB per day.
  • Q: Can we make assumptions about the storage capacity of the machines available to us? 
    A: Let’s assume that machines have 72 GB of RAM and a hard disk capacity of 10 TB.
  • Q: How many machines do I have to begin with? 
    A: Let’s assume we have 20 machines to begin with. More machines will be available on request if need be.
  • Q: Are all key value entries independent? 
    A: Yes. A typical query would ask for value corresponding to a key.

Estimation:

Total storage size: 100 TB, as estimated earlier 
Storage per machine: 10 TB

Q: What is the minimum number of machines required to store the data?
 
A: Assuming a machine has 10 TB of hard disk, we would need a minimum of 100 TB / 10 TB = 10 machines to store the data. Note that this is the bare minimum; the actual number might be higher. In this case, we have 20 machines at our disposal.
Q: How frequently would we need to add machines to our pool ?
 
A: The data grows at 1 TB per day, which means we generate enough data to fill the storage of one machine (10 TB) in 10 days. Assuming we want to keep storage utilization below 80%, each machine effectively holds 8 TB, so we would need to add a new machine every 8 days.
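The same arithmetic, written out as a small Python sketch using the numbers from the estimates above:

```python
# Capacity-planning arithmetic from the estimates above.
total_data_tb = 100        # current data set
disk_per_machine_tb = 10   # hard disk per machine
growth_tb_per_day = 1      # daily data growth
max_utilization = 0.8      # keep disks below 80% full

min_machines = total_data_tb / disk_per_machine_tb              # 10 machines minimum
usable_tb_per_machine = disk_per_machine_tb * max_utilization   # 8 TB of budget per machine
days_between_new_machines = usable_tb_per_machine / growth_tb_per_day  # 8 days

print(min_machines, days_between_new_machines)  # 10.0 8.0
```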

Deep Dive:

Q: Can we have a fixed number of shards?
A: One requirement for a shard is that the data within it should fit completely on a single machine. 
Since, in our case, the data is growing quickly, a fixed number of shards would mean the data within each shard keeps growing and eventually exceeds the 10 TB limit we have set per machine. Hence, we cannot have a fixed number of shards; the number of shards will have to increase over time.

Q: How many shards do we have and how do we distribute the data within the shard?
A: Let’s say our number of shards is S. One way to shard is that, for every key, we calculate a numeric hash H and assign the key to the shard corresponding to H % S. 
There is one problem here, though. As we discussed earlier, the number of shards will have to increase, and when it does, our new number of shards becomes S+1. 
As such, H % (S+1) changes for almost every key, causing us to relocate nearly every key in our data store. This is extremely expensive and highly undesirable.
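A short sketch of the problem, assuming an arbitrary hash function and synthetic keys; the exact percentage varies, but almost every key changes shards when S becomes S+1.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard via hash-mod-N."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % num_shards

keys = [f"key-{i}" for i in range(100_000)]
S = 20
moved = sum(1 for k in keys if shard_for(k, S) != shard_for(k, S + 1))
print(f"{moved / len(keys):.0%} of keys relocate when S goes from {S} to {S + 1}")
# Prints roughly 95% -- nearly the whole data store has to move.
```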
Q: Can we think of a better sharding strategy?
A: Consistent hashing is ideal for the situation described here. Let’s explore it. 
Let’s say we calculate a 64-bit integer hash for every key and map it onto a ring. Let’s say we start with X shards; each shard is also assigned a position on the ring. Each key maps to the first shard on the ring in the clockwise direction. 
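A minimal sketch of that ring, assuming an arbitrary hash function and hypothetical shard names; each key is routed to the first shard at or after its position on the ring, wrapping around.

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    """64-bit position on the ring, used for both shards and keys."""
    return int(hashlib.sha256(value.encode()).hexdigest()[:16], 16)

class ConsistentHashRing:
    def __init__(self, shards):
        # One position on the ring per shard (no virtual nodes yet).
        self.ring = sorted((ring_hash(s), s) for s in shards)

    def shard_for(self, key: str) -> str:
        # First shard clockwise from the key's position, wrapping past the end.
        idx = bisect.bisect(self.ring, (ring_hash(key),)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing([f"shard-{i}" for i in range(4)])
print(ring.shard_for("user-123"))  # e.g. shard-2
```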
 
What happens if we need to add another shard? Or what if one of the shards goes down and we need to re-distribute its data among the remaining shards? 

With only one position per shard on the ring, adding a shard relieves load from just one neighbouring shard, so the data is not rebalanced evenly. Similarly, there is a problem of cascading failure when a shard goes down: all of its keys move to the next shard on the ring, which may then become overloaded and fail as well. 

Modified consistent hashing 
What if we slightly change the ring so that, instead of one position per shard, we have multiple copies (virtual nodes) of the same shard spread over the ring? See the sketch after the list below.

  • Case when a new shard is added: its virtual nodes take over small segments from many different shards, so only a small fraction of keys move and the migration load is spread across the cluster. 
  • Case when a shard goes down: its keys are absorbed by the successors of its many virtual nodes, i.e. spread across several remaining shards. No cascading failure. Yay! 
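A minimal sketch of the modified ring, using a hypothetical VirtualNodeRing class that places each shard at many points (virtual nodes); removing a shard then spreads its keys across several survivors instead of dumping them on one neighbour.

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    return int(hashlib.sha256(value.encode()).hexdigest()[:16], 16)

class VirtualNodeRing:
    """Consistent hashing where every shard appears at many points on the ring."""

    def __init__(self, shards, replicas=100):
        self.ring = sorted(
            (ring_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(replicas)
        )

    def shard_for(self, key: str) -> str:
        idx = bisect.bisect(self.ring, (ring_hash(key),)) % len(self.ring)
        return self.ring[idx][1]

    def remove_shard(self, shard: str):
        # A dead shard's virtual nodes vanish; its keys are absorbed by the
        # successors of those nodes, i.e. spread over many remaining shards.
        self.ring = [(h, s) for h, s in self.ring if s != shard]

ring = VirtualNodeRing([f"shard-{i}" for i in range(4)])
keys = [f"key-{i}" for i in range(10_000)]
before = {k: ring.shard_for(k) for k in keys}
ring.remove_shard("shard-0")
moved = sum(1 for k in keys if ring.shard_for(k) != before[k])
print(f"{moved / len(keys):.0%} of keys moved")  # roughly 25%, shared by the 3 survivors
```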

 
