Sharding system architecture: Several things you may need know

Andy SongJanuary 25th, 2012Last Updated: October 21st, 2012

1 43 5 minutes read

[Introduction]

Some of you may wondering what’s sharding? If you don’t know the concept, please read the link http://en.wikipedia.org/wiki/Shard_(database_architecture). It’s very useful when you want to deal large data set throughput and traditional single data cluster group (Master-Slave, Master-Master and Slave-Slave, etc) setting couldn’t survive from that kinds of situaion. It’s concept is very simple and you may know in advance “Divide and Conquer”.

It has very different characters that quite different with Partition technique (You may know that from some database providers), since Partition was database layer specific technique but Sharding could spread across all layers you could touch.

For examples, You have one web online product it holds only one page, and that page need load data from one table. And imagining if you have 1 billion data persist in that table which was located in single oracle/my-sql/ms-sql server and you query againts that table in average need cost 2 seconds under load 1000 TPS. It’s not bad you may said. But please remember that only is database costs for one mature product you may need consider data transformations, network transportation (don’t be confused, here I mentioned correlated to from application bewteen end user), page rendering, etc. At the end customer may need pay 5 or 10 seconds for that page action each time in average. Your stakeholder want you provide solution for that issue.

You may come up with other solutions first, but here I could give you one very special. If one table with 1 billion datas need 2 seconds, how about we divide that table into 5 pieces and each piece holds only 200 millions datas. That may related to one math question you may think, possible would be 400 milliseconds, but the result may give you surprise << 400 milliseconds (why two below signs?). Because the database features: the primary data files become more smaller, the indexe files become more smaller, etc. so that means the I/O rounds have been significantly reduced. Okey, you may say “wow” that works for me and go ahead with that proposals with your stakeholds. Come on, calm down. You need resolve many things before any action against this.

[Things you may know or not]

Which kinds of rules you want take for dividing the data?

How to divide? You are idiot? (You may think). Very simiple 1 ~ 200 million, 201 million ~ 400 million, etc. Let me tell you.

1. based on row number

2. based on date (year, month or day)

3. based on business data

3.1 If your data contains user information. User identity, male/femal, user location

3.2 If your data contains some unqie enumerations. For example, plane ticket booking system. You may divide data acoording to the plane type and airplane company.

4. Mixed-up all of aboves.

How you transfer request to the specific regions acoording to data division?
Right now challenge come (if without database help) how could transfer request correctly to the specific regions acoording to data division? Let’s imagine, how could you find one phone number from yellow page book? How would the internet controller could work with domain name? So that means we need somewhere persist the division logics or mappings, so that means after we divide data and we could use the same logics or mappings find the correct data again and again. Here we call that as DNS.

How could we finish some aggregation jobs after data division?
Total counts, min, max across all the datas, this feature was very useful in some kinds of situation especially customer need that. But after division, all aggregation features was gone since database not hold all the data never. You may need work out solution as you own. More worse situaion you need work out cross data division join features. That effort I thought could not be estimated.

All other layers except data layer will be impacted also
Before data division you data access layer could static the db connection information since that’s not changed during runtime. But after data division your data access layer need accomodate the db connection information will be changed frequently due to the data has been splitted.
Cache layer may also need persist more information especially for the data location information
Restful interface was impacted also you may need always transfer the key data (that help division) back and forth.

Data consistency you may lose in some level
Since data was splitted, you may end up with need update data across different data regions. XA wasn’t good from performance and scalability perspective, because those two factors drive you towards to data sharding. So ACID unfortunatly cannot be covered.

Data re-division
Since data size will grow every day, you may end up with some data region grow unexpectedly larger. You need re-division again. And you may need go through above issues again.

High maintenance effort
You may need work out tools help you manage and monitoring those data region.

[Black or white]

You may say “come on, so crazy, I will never try”. But here please know its advantages.

High scalability
Since your large data region become many smaller region, that means your throught also be splitted up either. So you could provide more capacity from throughput pespective, so could reach high scalability more easier.

High performance
Query against smaller data with better performance, I think that is very easy math question. With large in-memory cache, data grid or cloud data service, you may even going to get more better performance.

High availability
Because you have many regions, so that means customer may also be splitted either. Less data means you could hold more capacity room in one single region, OOM, ConnectionReset, Running out of connection limition, Connection Timeout, etc you may not meet often comparing to traditional solution.

Smaller portion customer impact when something happen (planned or un-planned)
Because you have many regions, so that means customer may also be splitted either. So even one portion of your regions crashed won’t impact other regions, and you could provide better site-down deployment services. Like you could deploy China region at 00:00:00 of china timezone, USA region at 00:00:00 of usa timezone.

Recovery will be easier
Smaller data recovery will be less effort and more quick.

Note: all of above benefits need your architecture provide better design in advance, otherwise you won’t reach the gold and may meet more worse situation.

[Nothing is impossible except Death or Debt]

First, consider your business model. Is data consistency was very critical to customer?
If your customer could allow the window existence for data inconsistency (well-know data latency). You could go for with this solution. Flicker, twitter, etc adpot this solutions.

Second, sharding as earlier as possible
If your system architecture from very beginning without consider sharding, changed from non-sharding to sharding will be big headache. But incrementally migrating could be acceptable, but that will more testing your system design ability. More backward-compatibility will be invovled which also is very challengeable. So try you best make it happen as earlier as possible.

Third, sharding up-front
Sharding up-front will be better choice, since that will reduce the complexity, reducing the coupling between many components and provide the enough flexibility. That will be disucssed in the future topics.

Fourth, let service fly
Try you best using service arming every components especially hidden the location information. So if one type tuning or improvement could be spread across the same component service.

Fifth, single function service rather than huge monothlic funcions service
Split the multi-functional components into multiple single functional components would help you a lot. Not only from design perspective but also from organization perpective.

Sixth, re-organ your teams
Sharding need you more consider or take care of product matainence and monitoring, so you should enlarge product running teams. Like facebook, they have a speical team working for feature deployments and monitoring.

Seventh, keep changing

[Summaries]

Sharding need many things changing and planning quite different from traditional model. But it did provide many benefit that you cannot decline with that.

1. http://en.wikipedia.org/wiki/Shard_(database_architecture
2. http://highscalability.com/blog/2009/8/6/an-unorthodox-approach-to-database-design-the-coming-of-the.html
3. http://highscalability.com/scaling-twitter-making-twitter-10000-percent-faster

Reference: Several things you may need know about Sharding system architecture from our JCG partner Andy Song at the song andy’s Stuff blog.