System Design - Scaling the database

System Design Series - Tutorials and Interview Template

System Design - Introduction | System Design Primer | Beginners

System Design - Scaling the application | System Design Primer

System Design - Scaling the database | System Design Primer

SQL vs NoSQL: Should we use SQL or NoSQL? | Databases

System Design Interview Template | Ace the System Design Interview

Note: This is the 3rd part of the System Design series and assumes that you know the basic terminologies of System Design and know how to scale the application servers.

Vertical Scaling of Database

We can move our database to a larger machine and provide more resources. This will scale our database to handle the high load. But, Vertical Scaling is limited by the amount of resources that we can get in a single machine.

Replication/Redundancy

Let's look at the problems that our database is currently facing.

might not be able to handle the high traffic
is a single point of failure
might have increased latency for both 'reads' and 'writes'

Can we use the same idea of horizontally scaling our database?

In a way, yes but it would be a bit different. Recall that for horizontal scaling, nodes need to be stateless. The purpose of a database is to store 'state' and so it cannot be stateless. Let's look at how we can do it.

Redundancy: Redundancy means duplicating a component to ensure that our system continues to function even if there is a failure. Redundancy is what we created in our application servers to horizontally scale it.

We will do the same with our database. Create redundant copies of the database so that the system continues to function even if one of the database nodes goes down.

Replication: Databases are stateful systems. Redundant copies need to stay in sync so that they represent the same state. We can do that by replicating the changes to the other databases. This process is known as replication.

Replication Schemes

There are three ways of doing replication:

Synchronous replication: Changes are propagated to all the database nodes before considering the transaction successful. If any of the operations fails, the entire transaction fails and the changes are rolled back.
Asynchronous replication: Changes are propagated asynchronously after the transaction has been considered successful.
Semi-synchronous replication: The replication is partially synchronous. A transaction is considered successful after getting replicated on x number of nodes. The node does not wait for acknowledgment from other nodes.

Let's look at the guarantees that we get if we have synchronous replication vs asynchronous replication:

Synchronous replication: the system is consistent as all the databases would have the same data at a moment. The database might be unavailable while the replication is happening. Synchronous replication helps achieve strong consistency.
Asynchronous replication: the system might be inconsistent while the replication is happening. But, the database would be available during that time. Asynchronous replication can help achieve eventual consistency.

Types of replicas

There are two different types of replicas:

Write Replicas (Master/Leader Nodes): Write replicas are replicas that support 'writes' (create/update/delete). They may or may not support 'reads'.
Read Replicas (Slave/Follower Nodes): Read replicas are replicas that only support 'reads'.

Types of replication architectures

There are two major architectures based on database replication:

Master-slave architecture:

In a master-slave architecture, the master node serves the 'read and 'write' requests. The master node replicates the changes to the slave nodes. The slave nodes only serve 'read' requests.
If the master dies, one of the slave nodes is promoted as the master, or a new master is provisioned.
A standby master can also be considered. It is a replica of the active master. It becomes the active master if the active master dies.
Advantage:

The 'read redundancy' (slaves) distributes the 'read' traffic. Adding 'read redundancy' can scale the system as the traffic increases.
The system can continue to serve 'read' requests even when the master node dies. There is no single point of failure for 'read' requests.

Disadvantages:

Replication adds overhead. It might add latency if we choose synchronous replication. It might result in inconsistency if we choose asynchronous replication.
Consider that the master dies and the replication has not been completed. In this case, there might be some data loss as the newly promoted master might not have the latest changes.
The load on the master node might still be high if the number of 'writes' is high.
Replication adds a lot of complexity to the system.

Multi-master architecture:

In a multi-master architecture, all the nodes act as the master and can serve both 'reads' and 'writes'.
If any of the master nodes die, the system can operate as usual.
Advantages:

The 'read' and 'write' requests are distributed. We can scale the system by adding redundancy if traffic increases.
The system can continue to serve requests even when any of the nodes die. There is no single point of failure.
There is no need to select a new master when any of the master nodes die.

Disadvantages:

Replication adds overhead. It might add latency if we choose synchronous replication. It might result in inconsistency if we choose asynchronous replication.
Conflict prevention (in case of sync replication) or conflict resolution (in case of async replication) might be a big overhead.
Consider that the master dies and the replication has not been completed. In this case, there might be some data loss as the other master nodes might not have those changes.
Replication adds a lot of complexity to the system.

Partitioning and Sharding

To avoid the load on a database node, we can split the data in a way that the load is distributed across different nodes.

We can split a database generally in two different ways:

Vertical Splitting (Partitioning):

The database is partitioned into loosely coupled sub-databases.
Example: Tables related to Users and tables related to Products on different databases.
Advantages: Load distribution.
Disadvantages: Joins are not possible and application logic becomes complex.

Horizontal Splitting (Sharding):

The data is split based on some criteria. The different nodes will contain the same tables but with different data.
Examples:

User data with initials A-M on the first node and rest on the other.
User data for each continent on different nodes.
Product data for each category on different nodes.

Advantages: Load distribution.
Disadvantages: Joins are not possible and application logic becomes complex.

Joins are difficult with a partitioned or shared database. Hence, it is not easy to do splitting on relational databases. In the case of a sharded or partitioned relational database, we need to perform joins at the application level. Most NoSQL databases have sharding built-in. This makes sharding a preferred way of scaling NoSQL databases.

Other Database Scaling Optimizations

Indexing and Denormalization: We can do further optimizations like Indexing and Denormalization to make 'reads' faster. These techniques will make the 'writes' slower though. We should apply these only if that's acceptable.
SQL Tuning: SQL provides certain commands that we can use to measure performance. We can use them to figure out and fix slower queries.
Switch to NoSQL: Switching to NoSQL might make the database layer performant. But, it will come with few trade-offs. Most NoSQL databases do not support ACID transactions, in general). There are various NoSQL databases all best suited for a particular purpose.

In-Memory Caching

Recall that reading and writing data to RAM is much faster than storing data in a file-system-based data store (like MySQL or MongoDB).

Caches are data stores that can store data so that future retrievals are faster. In-memory caches are caches that store the data in-memory. Caching is the technique of using cache to store data for faster data retrieval.

Redis and Memcached are popular in-memory cache/datastores. They allow storing data as key-value pairs.

Example: Storing the product information against a product id.

Let's see how we can use cache to reduce the load on our database and decrease latency:

Whenever a 'read' request comes to our backend server, we can look up the required data on the cache before hitting the database.
If there is a cache-miss (required data is not present in the cache) then only we do a database query.
There are different techniques to fill the cache. Two of those techniques are:

In case of cache-miss, get data from DB and fill the cache.
In case of 'writes' to DBs, update cache before or after writing to DB.

The biggest advantages of using an in-memory cache like Redis are:

Latency: the latency of the requests can be significantly reduced by using an in-memory cache.
Reduced database load: the load on the database can be significantly decreased.

Everything in system design comes with a trade-off. Let's look at the disadvantages of using an in-memory cache:

Inconsistency: Cache-invalidation and knowing when to update the cache with fresh data can be difficult and might result in data inconsistency.
Expensive: RAM is expensive and so we cannot cache everything. We need to be selective in what to cache.
Complex Application Logic: The application logic can get complex as we need to check the cache before going to the database. We also need to do cache invalidation and updation.

Next parts: