Versions Compared
Version | Old Version 8 | New Version 9 |
---|---|---|
Changes made by | ||
Saved on |
Key
- This line was added.
- This line was removed.
- Formatting was changed.
Tip |
---|
The Framework proposed in this space (Alex Xu) is not exactly applied to propose a design : Getting started - a framework to propose... |
Introduction
A key-value store (KVS) referred to a key-value database (KVD), is a non-relational database.
Unique identifier is stored as a key with its associated value.
Key must be unique and value is accessed through key.
On this page.
Table of Contents |
---|
Part 1 - Understand the problem and establish design scope
Size of a KV pair | Small : less than 10 KB |
---|---|
Ability to store big data ? | Yes. |
High Availability: system responds quickly even during failures ? | Yes. |
High Availability: system can be scaled to support large data set ? | Yes. |
Automatic scaling: adding / deleting servers based on traffic ? | Yes. |
Tunable consistency ? | Yes. |
Low latency ? | Yes. |
Part 2 - KVS in Single Server

Even with the optimisations (data compression, frequently used data in memory and rest on disk), a distributed KV store is required for big data.
Part 3 - KVS in Distributed Environment

Real-world distributed systems
Partitions cannot be avoided. We have to choose between consistency and availability !
If consistency over availability because high consistent requirements = system coud unavailable.
if availability over consistency = system keeps accepting reads even though it returns stale data.
How to store & fetch data ? (data partition)
We can use the Consistent Hashing technic. Using this technic, gives 2 advantages :
Automatic scaling: servers could be added and removed automatically depending on the load.
Heterogeneity: number of virtual nodes (or replicas) for a server is proportional to the server capacity.
How to replicate data ? (data replication)
To achieve HA and reliability, data must be replicated asynchronously over N servers where N is a configurable parameter.
The replication is useful for preserving the data in a node. For replicating a key-value, we'll fetch next 2 (or 3, depending on replication factor / number of copies) distinct server nodes from the hash ring and save the copies into these nodes.