Tip

Based on Alex Xu's book, a 4-step process is a good starting point for effective system design.

Step 1 : Understand the problem and establish the design scope

  • Understand the context and the constraints: ask questions to establish the type of constraints (technological, infrastructure, etc.).
  • What specific features are we going to build?
  • How many users does the product have? How many users do we expect in the future?
  • How fast does the company anticipate scaling up? What scale is expected in 3 months, 6 months, and 1 year?
  • What is the company's technology stack? What legacy technology is in place? What existing services might we leverage to simplify the design?

Example questions: What is the current traffic volume? What is the expected traffic volume after a time X?
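The "expected volume after a time X" question can be turned into rough numbers once a growth rate is agreed on. The sketch below assumes compound monthly growth; the function name `project_users`, the starting figure of 100,000 users, and the 10% monthly rate are all hypothetical values chosen for illustration, not figures from the book.

```python
def project_users(current_users: int, monthly_growth_rate: float, months: int) -> int:
    """Project the user count after `months` of compound monthly growth."""
    return round(current_users * (1 + monthly_growth_rate) ** months)


# Hypothetical inputs: replace with the answers obtained in step 1.
CURRENT_USERS = 100_000
MONTHLY_GROWTH = 0.10  # assumed 10% growth per month

# Horizons taken from the scoping questions: 3 months, 6 months, 1 year.
projections = {m: project_users(CURRENT_USERS, MONTHLY_GROWTH, m) for m in (3, 6, 12)}

for months, users in projections.items():
    print(f"after {months:>2} months: ~{users:,} users")
```

Real growth is rarely this smooth, so the result is best read as an order of magnitude to design for rather than a forecast.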

Step 2 : Establish a high-level design 

  • Establish an initial blueprint for the design based on step 1: for example, is the organization going hybrid as a first step?
  • Draw the design with its key components:
    • client layer (mobile, web, etc.), APIs, web servers, data store, cache, CDN, message queue, etc.
  • Estimate the system capacity or performance requirements (see the page "Back-of-the-envelope calculation").
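A capacity estimate of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming hypothetical inputs (1,000,000 daily active users, 10 requests per user per day, a peak factor of 2, 1 KB per write, one year of retention); none of these values come from the source, and the function names are made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(daily_active_users: int, requests_per_user_per_day: float) -> float:
    """Average queries per second, spreading the daily load evenly over the day."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY


def peak_qps(avg_qps: float, peak_factor: float = 2.0) -> float:
    """Peak load, modeled as a simple multiple of the average (an assumption)."""
    return avg_qps * peak_factor


def storage_bytes(writes_per_day: int, bytes_per_write: int, retention_days: int) -> int:
    """Raw storage needed to keep every write for the retention period."""
    return writes_per_day * bytes_per_write * retention_days


# Hypothetical inputs for illustration only.
avg = average_qps(daily_active_users=1_000_000, requests_per_user_per_day=10)
peak = peak_qps(avg)
storage = storage_bytes(writes_per_day=1_000_000, bytes_per_write=1_000, retention_days=365)

print(f"average QPS: {avg:.0f}")
print(f"peak QPS:    {peak:.0f}")
print(f"storage:     {storage / 1e9:.0f} GB per year")
```

The point is to get the order of magnitude right, which is usually enough to decide whether a single data store or a sharded one is needed.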

If possible, go through a few concrete use cases, as Alex Xu suggests.


Step 3 : Design deep dive

This step comes after step 2, once the overall goals have been agreed on.

  • Focus the deep dive on specific areas, based on the feedback received on the high-level design.

  • Identify and prioritize the components in the architecture; where possible, focus on the bottlenecks and resource estimates (see the page "Back-of-the-envelope calculation").
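One possible way to prioritize components is to compare the load estimated in step 2 with what each component can sustain, and to look first at those closest to saturation. The sketch below assumes this utilization-based ranking; the `Component` class, the 70% threshold, and all capacity figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    capacity_qps: float  # load the component can sustain (assumed figure)
    expected_qps: float  # load estimated during step 2 (assumed figure)

    @property
    def utilization(self) -> float:
        """Fraction of the component's capacity consumed by the expected load."""
        return self.expected_qps / self.capacity_qps


def rank_bottlenecks(components: list[Component], threshold: float = 0.7) -> list[Component]:
    """Return components at or above the threshold, most saturated first."""
    hot = [c for c in components if c.utilization >= threshold]
    return sorted(hot, key=lambda c: c.utilization, reverse=True)


# Hypothetical architecture and numbers for illustration only.
architecture = [
    Component("web servers", capacity_qps=1000, expected_qps=250),
    Component("cache", capacity_qps=5000, expected_qps=200),
    Component("data store", capacity_qps=300, expected_qps=250),
    Component("message queue", capacity_qps=400, expected_qps=300),
]

for component in rank_bottlenecks(architecture):
    print(f"{component.name}: {component.utilization:.0%} of capacity")
```

With these made-up figures, the data store and the message queue are the candidates for the deep dive, while the web servers and the cache have headroom.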


Step 4 : Establish the pros and the cons

This is the final step, a wrap-up, and it can be decisive.

  • Establish the pros and benefits
  • Establish the cons and considerations:
    • Is there any single point of failure in our system? What are we doing to mitigate it?
    • Do we have enough replicas of the data so that we can still serve our users if we lose a few servers?
    • Similarly, do we have enough copies of different services running such that a few failures will not cause a total system shutdown?
    • How are we monitoring the performance of our service? Do we get alerts whenever critical components fail or their performance degrades?
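The replica and single-point-of-failure questions above can be checked with a quick calculation. The sketch below assumes that replicas fail independently and that one healthy replica is enough to serve users, which is a simplification; the function names, the 99% per-server availability, and the replica counts are hypothetical.

```python
def redundant_availability(single_availability: float, replicas: int) -> float:
    """Availability of `replicas` copies when any one healthy copy suffices.

    Assumes failures are independent, which real deployments only approximate.
    """
    return 1 - (1 - single_availability) ** replicas


def single_points_of_failure(replica_counts: dict[str, int]) -> list[str]:
    """Names of the components that run with fewer than two replicas."""
    return sorted(name for name, count in replica_counts.items() if count < 2)


# Hypothetical deployment for illustration only.
deployment = {"web servers": 3, "data store": 1, "cache": 2, "message queue": 1}

print("single points of failure:", single_points_of_failure(deployment))

for n in (1, 2, 3):
    availability = redundant_availability(0.99, n)
    print(f"{n} replica(s) at 99% each: {availability:.4%} available")
```

Under these assumptions, each added replica multiplies the chance of total unavailability by the failure rate of a single copy, which is why the first extra replica brings the largest gain.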