Tip |
---|
This design applies the framework proposed in this space (Alex Xu): Getting started - a framework to propose... |
Introduction
A chat app serves different purposes for different people, so it is important to pin down the feature requirements before designing anything.
Step 1 - Understand the problem and establish design scope
Question | Answer |
---|---|
What kind of chat: 1-on-1 or group? | Both. |
Mobile app, web app, or both? | Both. |
Scale of the app: start-up or massive scale? | Massive scale: 50 million daily active users (DAU). |
Group size limit? | Max 100 people (small group). |
Which features? Are attachments supported? | 1-on-1 chat, group chat, online indicator. Text messages only. |
Text message size limit? | Less than 100,000 characters. |
End-to-end encryption required? | Not required for now, but maybe later. |
Chat history retention? | Forever. |
Push notifications? | Yes. |
Online presence? | Yes. |
Multiple device support? | Yes, the same account can be logged in on multiple devices at the same time. |
Step 2 - High-level design
For a chat service, the choice of network protocol is important. HTTP works well on the sender side, because the client initiates the connection. The receiver side is harder: HTTP is client-initiated, so the server has no direct way to push a message to the client. There are 3 techniques to simulate a server-initiated connection: polling, long polling and WebSocket.
See polling & long polling in the page References & Glossary for chat system.
WebSocket (WS) is the most common solution for sending asynchronous updates from server to client. Because a WebSocket connection is bidirectional and persistent, it is used on both the sender and the receiver side.
data:image/s3,"s3://crabby-images/7eef2/7eef2afa0fc9c868eca5eb7973c29eb26ca87edc" alt=""
The high-level design has 3 major categories: stateless services, stateful services and third-party integration. The architecture is distributed from the start, because a single-server design is a deal breaker (single point of failure).
data:image/s3,"s3://crabby-images/aec96/aec96e5d3115b5ecc954df76358e8b734d687632" alt=""
Each client maintains a persistent WebSocket connection to a chat server for real-time messaging:
Chat servers facilitate message sending and receiving.
Presence servers manage online/offline status.
API servers handle user login, signup, profile changes, etc.
Notification servers send push notifications.
The KV store holds the chat history: when an offline user comes back online, they see all their previous chat history. See the page References & Glossary for chat system - Storage.
Step 3 - Design deep dive
Service Discovery
data:image/s3,"s3://crabby-images/7a9d1/7a9d1498999ebf237125a5a46f120d825e339291" alt=""
User A tries to log in to the app.
The load balancer (LB) sends the login request to the API servers.
After the backend authenticates the user, service discovery finds the best chat server for User A.
User A connects to that chat server through WebSocket.
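The login steps above can be sketched as follows. This is only an illustration: the `ChatServer`, `ServiceDiscovery` and `login` names are hypothetical, and "best" is assumed here to mean "most free capacity" (a real registry such as ZooKeeper could also weigh criteria like geographical location).

```python
from dataclasses import dataclass


@dataclass
class ChatServer:
    name: str
    capacity: int          # max concurrent WebSocket connections (assumed value)
    connections: int = 0   # currently open connections

    @property
    def free_slots(self) -> int:
        return self.capacity - self.connections


class ServiceDiscovery:
    """Hypothetical in-memory registry standing in for a service such as ZooKeeper."""

    def __init__(self) -> None:
        self._servers: dict[str, ChatServer] = {}

    def register(self, server: ChatServer) -> None:
        self._servers[server.name] = server

    def best_server(self) -> ChatServer:
        # Assumed criterion: pick the server with the most free capacity.
        candidates = [s for s in self._servers.values() if s.free_slots > 0]
        if not candidates:
            raise RuntimeError("no chat server has free capacity")
        return max(candidates, key=lambda s: s.free_slots)


def login(user_id: str, authenticated: bool, discovery: ServiceDiscovery) -> str:
    """Return the name of the chat server the client should open its WebSocket to."""
    if not authenticated:
        raise PermissionError(f"login rejected for {user_id}")
    server = discovery.best_server()
    server.connections += 1  # the client is about to connect to this server
    return server.name
```

With two registered servers, `login("user_a", True, discovery)` returns the name of the one with more free slots, and the client then opens its WebSocket connection to that server.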
Message Flows (Message synchronization across devices & group chat flow)
1 on 1 chat flow
data:image/s3,"s3://crabby-images/207b7/207b78ace1113d16502270f01a679895c6f9d242" alt=""
User A sends a chat message to Chat server 1.
Chat server 1 obtains a message ID from the ID generator.
Chat server 1 sends the message to the message sync queue.
The message is stored in the KV store.
…
If User B is online, the message is forwarded to Chat server 2, where User B is connected.
If User B is offline, a push notification is sent from the push notification servers.
Chat server 2 forwards the message to User B.
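A minimal in-memory sketch of this flow is shown below. The `ChatBackend` class and its attribute names are hypothetical. The message sync queue is collapsed into a direct method call, and the ID generator is assumed to be a simple increasing counter, which is enough to show the online/offline branching.

```python
import itertools


class ChatBackend:
    """Toy stand-in for the ID generator, KV store, and routing (all assumed)."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)         # ID generator: increasing IDs
        self.kv_store: dict[int, dict] = {}    # message_id -> stored message
        self.sessions: dict[str, str] = {}     # online user -> chat server name
        self.delivered: list[tuple] = []       # (chat server, user, message_id)
        self.push_notifications: list[tuple] = []  # (user, message_id)

    def connect(self, user_id: str, chat_server: str) -> None:
        self.sessions[user_id] = chat_server

    def disconnect(self, user_id: str) -> None:
        self.sessions.pop(user_id, None)

    def send(self, sender: str, recipient: str, text: str) -> int:
        # Obtain a message ID, then persist the message.
        message_id = next(self._ids)
        self.kv_store[message_id] = {
            "from": sender,
            "to": recipient,
            "text": text,
        }
        # Route depending on whether the recipient is online.
        chat_server = self.sessions.get(recipient)
        if chat_server is not None:
            # Online: forward to the chat server holding the recipient's connection.
            self.delivered.append((chat_server, recipient, message_id))
        else:
            # Offline: hand over to the push notification servers.
            self.push_notifications.append((recipient, message_id))
        return message_id
```

The message is stored before it is routed, so an offline recipient can still fetch it from the KV store later.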
Message Synchronization
data:image/s3,"s3://crabby-images/292ab/292ab0939951c011ae1a5633f20578bb50b7d55d" alt=""
Each device maintains a variable called “cur_max_message_id”, which keeps track of the latest message ID on the device.
A message counts as new when 2 conditions are met:
The recipient ID is equal to the currently logged-in user ID.
The message ID in the KV store is larger than “cur_max_message_id”.
Note |
---|
Because each device keeps its own cur_max_message_id, synchronization is easy: each device fetches its new messages from the KV store independently. |
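The two conditions can be expressed as a short filter. In this sketch the `Message` and `Device` types are hypothetical, and the KV store is assumed to be a plain list that is scanned in full, which a real store would replace with a range query on the message ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    message_id: int
    recipient_id: str
    text: str


class Device:
    def __init__(self, user_id: str) -> None:
        self.user_id = user_id
        self.cur_max_message_id = 0  # latest message ID seen on this device

    def sync(self, kv_store: list) -> list:
        """Return messages that are new for this device, oldest first."""
        new_messages = sorted(
            (
                m for m in kv_store
                if m.recipient_id == self.user_id              # condition 1
                and m.message_id > self.cur_max_message_id     # condition 2
            ),
            key=lambda m: m.message_id,
        )
        if new_messages:
            # Advance the cursor so the next sync skips these messages.
            self.cur_max_message_id = new_messages[-1].message_id
        return new_messages
```

Two `Device` objects for the same user each hold their own cursor, so a laptop that has been offline still receives everything the phone has already seen.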
Small group chat flow
data:image/s3,"s3://crabby-images/edefe/edefec3092272188e58449aaae90ba281ce4b052" alt=""
Each group message is copied into the inbox (message sync queue) of every recipient. This simplifies the message sync flow, as each client only needs to check its own inbox to get new messages.
When the group is small, storing a copy in each recipient’s inbox is not too expensive.
On the recipient side, a single inbox can receive messages from multiple senders (see diagram below).
data:image/s3,"s3://crabby-images/78b77/78b776f9a4b01a44c60927f5c3dd2959e6665ac9" alt=""
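The per-recipient copy described in this section can be sketched as a fan-out on write. The `GroupChat` class and its method names are hypothetical. `MAX_GROUP_SIZE` reflects the 100-member limit from Step 1, and inboxes are assumed to be simple in-memory queues.

```python
from collections import defaultdict, deque

MAX_GROUP_SIZE = 100  # "small group" limit from the requirements


class GroupChat:
    def __init__(self) -> None:
        self.members: dict[str, list] = {}   # group_id -> member IDs
        self.inboxes = defaultdict(deque)    # user_id -> queued messages

    def create_group(self, group_id: str, members: list) -> None:
        if len(members) > MAX_GROUP_SIZE:
            raise ValueError("group exceeds the small-group limit")
        self.members[group_id] = list(members)

    def send(self, group_id: str, sender: str, text: str) -> None:
        # Fan-out on write: copy the message into every other member's inbox.
        for member in self.members[group_id]:
            if member != sender:
                self.inboxes[member].append((group_id, sender, text))

    def fetch(self, user_id: str) -> list:
        # A client only ever reads its own inbox, which may mix several senders.
        inbox = self.inboxes[user_id]
        messages = list(inbox)
        inbox.clear()
        return messages
```

The cost of `send` grows with the group size, which is why this approach is reserved for small groups.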
Online Presence
This indicator is an essential feature of many chat applications.
User login
The user login flow is explained in the “Service Discovery” section. After login, a WebSocket connection is established between the client and the real-time services.
2 values are stored in the KV store: the online status and the last_active_at timestamp.
data:image/s3,"s3://crabby-images/b4c33/b4c330c1a9ef8ba3a74ba8105b5dfc9d6d9e1dad" alt=""
User logout
data:image/s3,"s3://crabby-images/bca98/bca98ab449645dac2a027eab4066b58cd1aeffbf" alt=""
User disconnection
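When a user disconnects from the internet, the persistent connection between the client and the server is lost. Connections drop and recover frequently (for example on a weak mobile network), so updating the status on every disconnect/reconnect would make the indicator change too often and create a poor user experience.
The solution is a “heartbeat event”: the client sends an event every x seconds, and the server marks the user offline only when no heartbeat has arrived within a given timeout.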
data:image/s3,"s3://crabby-images/a2ae9/a2ae9fc664c1debd54c94d3b699b0a592038a61d" alt=""
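A heartbeat check could look like the sketch below. The `PresenceServer` class is hypothetical, and the 30-second timeout is an assumed value, since the page leaves the interval as x. Time is passed in explicitly so that the logic stays easy to test.

```python
class PresenceServer:
    """Toy presence tracker keyed on the last_active_at timestamp."""

    def __init__(self, timeout_seconds: float = 30.0) -> None:
        self.timeout_seconds = timeout_seconds       # assumed timeout value
        self.last_active_at: dict[str, float] = {}   # user_id -> last heartbeat time

    def heartbeat(self, user_id: str, now: float) -> None:
        # Called every time a heartbeat event arrives from the client.
        self.last_active_at[user_id] = now

    def logout(self, user_id: str) -> None:
        # Explicit logout marks the user offline immediately.
        self.last_active_at.pop(user_id, None)

    def is_online(self, user_id: str, now: float) -> bool:
        last = self.last_active_at.get(user_id)
        # Online only if a heartbeat arrived within the timeout window.
        return last is not None and now - last <= self.timeout_seconds
```

A short network drop that recovers before the timeout never changes the visible status, because the next heartbeat simply refreshes `last_active_at`.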
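How do a user’s friends learn about status changes?
Presence servers use a publish-subscribe model in which each friend pair maintains a channel. When a user’s status changes, the event is published to that user’s channels, and each subscribed friend receives it.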
data:image/s3,"s3://crabby-images/b22db/b22db9f18df9ce1833f04a73f44ef7b49c77ea6e" alt=""
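The channel-per-friend-pair model can be illustrated with a small in-process publish-subscribe sketch. The `PresencePubSub` class and the `"A-B"` channel naming are assumptions made for illustration. A real deployment would deliver the events over each subscriber's WebSocket connection.

```python
from collections import defaultdict
from typing import Callable


class PresencePubSub:
    def __init__(self) -> None:
        # channel name -> callbacks to invoke on a status change
        self._subscribers = defaultdict(list)

    @staticmethod
    def channel(publisher: str, subscriber: str) -> str:
        # One channel per friend pair, e.g. "a-b" carries a's status to b.
        return f"{publisher}-{subscriber}"

    def subscribe(self, publisher: str, subscriber: str,
                  callback: Callable[[str, str], None]) -> None:
        self._subscribers[self.channel(publisher, subscriber)].append(callback)

    def publish_status(self, publisher: str, friends: list, status: str) -> None:
        # Publish the change to every friend-pair channel of the publisher.
        for friend in friends:
            name = self.channel(publisher, friend)
            for callback in self._subscribers.get(name, []):
                callback(publisher, status)
```

Publishing costs one event per friend, so this model suits small friend lists and small groups. Very large groups would need a different approach, such as fetching status on demand.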
Step 4 - Pros & Cons
Pros: Decoupled architecture, real-time communication.
Cons (possible improvements): no support for media files (photos, …); end-to-end encryption not added; no client-side message caching, which would reduce data transfer between client and server and improve load time; limited error handling (chat server failure handled through the ZooKeeper service, message resend mechanism with retries).