Scaling Databases: the 5 distribution models or architectures
Introduction
Your database must be scaled so that it can handle traffic at any point in time, given that traffic varies considerably over time. Traffic is highly unpredictable, and you must be able to sustain it during the peaks.
The scaling process should not cause downtime in your database. While records can change during the scaling process, you must ensure data consistency between the pre-scaling and post-scaling databases.
Model 1 - Single Server
The problem with a single server is that you have to scale up (buy a bigger machine) if you want better performance. Large servers, however, are not cost efficient: they run all the time, even though the load is high only for short periods.
Model 2 - Sharded DB
Sharding splits the data across nodes, with each node holding a distinct subset of the records. This reduces the size of the database on each node and hence improves performance. If a shard is based on some real-world segmentation of the data (European and American customers, for instance), it is easy to decide where to direct a query. Another consideration is keeping the load even across shards.
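The routing idea above can be sketched in a few lines. This is a minimal illustration, not a specific product's API; the region names, host names, and the hash-based fallback for keeping the load even are all assumptions:

```python
import hashlib

# Hypothetical shard map: real-world segments mapped to database hosts.
SHARDS = {"eu": "db-eu.example.internal", "us": "db-us.example.internal"}

def shard_for_customer(region: str, customer_id: str) -> str:
    """Pick a shard by real-world segment; fall back to hashing for even load."""
    if region in SHARDS:
        return SHARDS[region]
    # Hash-based fallback spreads customers from unmapped regions evenly.
    idx = int(hashlib.md5(customer_id.encode()).hexdigest(), 16) % len(SHARDS)
    return sorted(SHARDS.values())[idx]
```

Segment-based routing keeps related data together, while the hash fallback trades that locality for an even spread of load.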
Sharding has its disadvantages too. It adds complexity to the whole structure, and each shard is a single point of failure for its own data: the corruption of a single shard can make part of the table unavailable.
Model 3 - Master-Slave Replication
Make copies of the same data across many nodes. One of the nodes is designated the master, which is responsible for updating data on the other nodes, called the slaves. Such a system is useful when you have a read-intensive workload. If the master fails, your read queries still work normally, a property known as read resilience, but writes fail until the master is restored or a new one is appointed (which can be done automatically).
Master-slave replication can lead to inconsistency: reading from different slaves can give different values if the master's latest writes haven't reached them yet. Worse, a client may not be able to read back its own recent write. Also, if the master fails, recent updates may not have been passed on to the slaves.
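A toy router makes both the read/write split and the staleness risk concrete. This is an illustrative sketch (plain dicts stand in for nodes; replication itself is assumed to happen asynchronously elsewhere):

```python
import itertools

class ReplicatedStore:
    """Toy master-slave router: writes go to the master, reads round-robin the slaves."""

    def __init__(self, master: dict, slaves: list):
        self.master = master
        self._read_cycle = itertools.cycle(slaves)

    def write(self, key, value):
        # Only the master accepts writes; copying to slaves is asynchronous.
        self.master[key] = value

    def read(self, key):
        # A slave may lag behind the master, so this read can be stale.
        return next(self._read_cycle).get(key)
```

Immediately after a write, a read routed to a slave that has not yet replicated it returns nothing, which is exactly the stale-read problem described above.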
Automatic failover (or recovery) with replication
Failover (or recovery) should be automatic. Beyond the minimum requirement of two MySQL servers replicating each other, there is a third server, called the manager, which controls both MySQL servers through an agent running on each of them. The manager monitors the state of the replication and assigns virtual IPs (VIPs) for specific roles: one VIP where write operations are sent, and two or more VIPs where read operations are sent. If replication on one of the MySQL servers lags too far behind, its read VIP is moved to another server.
Model 4 - Peer-to-Peer Replication
Peer-to-peer replication addresses the problems of the master-slave model. It removes the master, so every node accepts both reads and writes.
The biggest problem here is consistency. Writing the same data on two different nodes can lead to a conflict. Write conflicts can be resolved by merging the inconsistent writes, but a full discussion of how is beyond the scope of this post.
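To give a flavor of what merging means, here is one of the simplest possible strategies, last-write-wins, where each write carries a timestamp and the most recent one survives. This is an illustrative assumption, not the protocol of any particular database (real systems often use version vectors or application-level merges instead):

```python
def merge_writes(a: dict, b: dict) -> dict:
    """Merge two peers' copies of a record set.

    Each value is a (timestamp, data) pair; on conflict, the
    write with the higher timestamp wins (last-write-wins).
    """
    merged = dict(a)
    for key, (ts, data) in b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, data)
    return merged
```

Last-write-wins is simple but silently discards one side of every conflict, which is why more careful schemes exist.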
Model 5 - Combining Sharding & Replication
Replication and sharding are ideas that can be combined. Using master-slave replication together with sharding means there can be multiple masters, but each data item still has only a single master. Alternatively, each shard can have any number of peers, and in case of failure the data is rebuilt on other nodes.
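The combined topology can be pictured as a shard map where each shard carries its own master and replicas. The cluster layout and host names below are hypothetical:

```python
# Hypothetical cluster: each shard has exactly one master plus replicas,
# so every record has a single master even though the cluster has several.
CLUSTER = {
    "shard-eu": {"master": "eu-m1", "replicas": ["eu-r1", "eu-r2"]},
    "shard-us": {"master": "us-m1", "replicas": ["us-r1"]},
}

def master_for(shard: str) -> str:
    """Return the node that accepts writes for this shard."""
    return CLUSTER[shard]["master"]

def promote_replica(shard: str) -> str:
    """On master failure, promote the shard's first replica to master."""
    new_master = CLUSTER[shard]["replicas"].pop(0)
    CLUSTER[shard]["master"] = new_master
    return new_master
```

A write for a European customer goes to `eu-m1`; if that node dies, promoting `eu-r1` restores write availability for that shard without touching the others.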
In the final analysis, you have to decide how far to scale, because scaling costs money.