Tip |
---|
The Framework proposed in this space (Alex Xu) is applied to propose a design : Getting started - a framework to propose... |
Introduction
We can do a lot more on YouTube than watching a video. So, we must narrow down the scope.
Step 1 - Understand the problem and establish the scope
Question | Answer |
---|---|
Required features ? | Upload and watch video only. |
What clients do we need to support ? | Mobile App, Web App, Smart TV. |
How many daily active users ? | 5 million |
Average daily time spent on product ? | 30 min. |
Support international users ? | Yes. |
What are supported video resolutions ? | Most of the video resolutions and formats. |
Encryption required ? | Yes. |
File size requirement for videos ? | Max size = 1 GB. |
Leverage cloud infra (Amazon, Google, Microsoft) ? | Recommended to use existing cloud services. |
Focus : ability to upload videos fast, smooth video streaming, ability to change video quality, low infra cost, high availability (HA), scalability and reliability, clients supported (mobile, web and smart TV).
Back-of-the-envelope estimation
5 million daily active users (DAU)
Users watch 5 videos per day.
10% of users upload 1 video per day.
Average video size = 300 MB.
Total daily storage = 5 million * 10% * 300 MB = 150 TB.
CDN cost :
The CDN serves the videos, and we are charged for data transferred out of the CDN.
Average cost per GB is $0.02 for USA.
5 million * 5 videos * 0.3 GB * $0.02 = $150,000 per day.
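The two estimates above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the constant and function names are illustrative, and it assumes 300 MB = 0.3 GB and 1 TB = 1000 GB.

```python
# Back-of-the-envelope figures taken from the estimation above.
DAU = 5_000_000                # daily active users
VIDEOS_WATCHED_PER_USER = 5    # videos watched per user per day
UPLOADER_RATIO = 0.10          # 10% of users upload 1 video per day
AVG_VIDEO_SIZE_GB = 0.3        # 300 MB, assuming 1 GB = 1000 MB
CDN_COST_PER_GB_USD = 0.02     # average CDN transfer-out price (USA)


def daily_storage_tb() -> float:
    """Storage added per day, in TB (assuming 1 TB = 1000 GB)."""
    return DAU * UPLOADER_RATIO * AVG_VIDEO_SIZE_GB / 1000


def daily_cdn_cost_usd() -> float:
    """CDN cost per day if every watched video is served from the CDN."""
    return DAU * VIDEOS_WATCHED_PER_USER * AVG_VIDEO_SIZE_GB * CDN_COST_PER_GB_USD


print(f"storage: {daily_storage_tb():.0f} TB/day")
print(f"CDN cost: ${daily_cdn_cost_usd():,.0f}/day")
```

Changing a single constant (for example the share of uploaders) shows quickly how sensitive the storage and CDN bills are to each assumption.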
Step 2 - High-level design
The high-level design can be split into 2 flows:
Video uploading flow
Transcoding servers : video transcoding is also called video encoding. It’s the process of converting a video format to other formats (MPEG, HLS, etc.) to provide the best possible video streams for different devices and bandwidth capabilities.
Original and transcoded storage are BLOB storage.
Metadata cache is for better performance. Once the video has been uploaded, the user sends a request to update the video metadata (size, format, file name, …).
Video streaming flow
Streaming relies on a streaming protocol such as Microsoft Smooth Streaming, Apple HLS, etc. Videos are streamed directly from the CDN, so there is very little latency.
Step 3 - Design deep dive
The DAG Model (Directed Acyclic Graph)
The DAG model helps to achieve flexibility and parallelism.
Video transcoding architecture (leveraging cloud services)
The preprocessor has 4 responsibilities:
Splits the video into smaller GOPs (Groups of Pictures).
Splits videos by GOP alignment for old clients.
Generates the DAG based on the configuration files written by client programmers.
Caches data.
The DAG scheduler splits a DAG into stages of tasks.
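One way to picture the stage split is a level-by-level topological sort: every task in a stage depends only on tasks from earlier stages, so tasks within a stage can run in parallel. The sketch below is an assumption about how such a scheduler could work, not the actual implementation, and the task names are hypothetical.

```python
from collections import defaultdict


def split_into_stages(dag):
    """Group tasks into stages using a level-by-level topological sort.

    `dag` maps each task to the list of tasks that depend on it.
    Tasks inside one stage do not depend on each other.
    """
    indegree = defaultdict(int)
    for task, successors in dag.items():
        indegree[task] += 0            # make sure every task is registered
        for successor in successors:
            indegree[successor] += 1

    stage = sorted(task for task, degree in indegree.items() if degree == 0)
    stages = []
    scheduled = 0
    while stage:
        stages.append(stage)
        scheduled += len(stage)
        next_stage = []
        for task in stage:
            for successor in dag.get(task, []):
                indegree[successor] -= 1
                if indegree[successor] == 0:   # all dependencies are done
                    next_stage.append(successor)
        stage = sorted(next_stage)

    if scheduled != len(indegree):
        raise ValueError("the graph contains a cycle, so it is not a DAG")
    return stages


# Hypothetical task graph for one uploaded video.
example_dag = {
    "original": ["video", "audio", "metadata"],
    "video": ["encode", "thumbnail"],
    "audio": ["audio_encode"],
    "encode": ["assemble"],
    "audio_encode": ["assemble"],
}

for number, tasks in enumerate(split_into_stages(example_dag), start=1):
    print(number, tasks)
```

Each stage can then be handed to the resource manager as a batch of independent tasks.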
Resource manager : it contains 3 queues and a task scheduler.
Task queue : a priority queue containing the tasks to be executed.
Worker queue : a priority queue containing worker utilization info.
Running queue : contains info about the currently running tasks and the workers running them.
Task scheduler : picks the optimal task/worker and instructs the chosen worker to execute the job.
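The interplay between the 3 queues and the task scheduler can be sketched with Python's `heapq` module. This is a minimal illustration under stated assumptions: a lower priority number means a more urgent task, the "optimal" worker is simply the least utilized one, and the class, method and task names are hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    task_queue: list = field(default_factory=list)     # min-heap of (priority, task)
    worker_queue: list = field(default_factory=list)   # min-heap of (utilization, worker)
    running_queue: dict = field(default_factory=dict)  # task -> worker running it

    def add_task(self, task, priority):
        heapq.heappush(self.task_queue, (priority, task))

    def add_worker(self, worker, utilization):
        heapq.heappush(self.worker_queue, (utilization, worker))

    def schedule(self):
        """Pair the most urgent task with the least utilized worker."""
        if not self.task_queue or not self.worker_queue:
            return None
        _, task = heapq.heappop(self.task_queue)
        _, worker = heapq.heappop(self.worker_queue)
        self.running_queue[task] = worker
        return task, worker

    def complete(self, task, utilization):
        """Remove a finished task and put its worker back in the worker queue."""
        worker = self.running_queue.pop(task)
        self.add_worker(worker, utilization)


manager = ResourceManager()
manager.add_task("encode-720p", priority=2)
manager.add_task("encode-1080p", priority=1)
manager.add_worker("worker-a", utilization=0.7)
manager.add_worker("worker-b", utilization=0.2)
first_assignment = manager.schedule()
print(first_assignment)
```

A real scheduler would also weigh task size, worker capabilities and failures; the heap-based pairing only shows the role of each queue.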
System Optimization
Speed Optimization
→ Parallelize video uploading from client
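A minimal sketch of a parallel upload from the client is shown below. It assumes fixed-size byte chunks rather than GOP alignment, and `fake_upload` is a hypothetical in-memory stand-in for the real upload call.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, chunk_size):
    """Cut the video bytes into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def upload_in_parallel(data, chunk_size, upload_chunk, max_workers=4):
    """Upload all chunks concurrently and return the number of chunks sent."""
    chunks = split_into_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for every upload and re-raises the first failure, if any.
        list(pool.map(upload_chunk, range(len(chunks)), chunks))
    return len(chunks)


# Hypothetical stand-in for the storage service: keeps chunks in memory.
store = {}


def fake_upload(index, chunk):
    store[index] = chunk


video = bytes(range(256)) * 10                      # 2560 bytes of sample data
chunk_count = upload_in_parallel(video, 1000, fake_upload)
reassembled = b"".join(store[i] for i in range(chunk_count))
print(chunk_count, reassembled == video)
```

Sending the chunk index with each chunk lets the server reassemble the video in order, and a failed chunk can be resent without restarting the whole upload.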
→ Parallelize video encoding : Introduction of Message Queue
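The effect of a message queue can be illustrated with an in-process `queue.Queue`: the producer pushes video segments without waiting for the encoders, and several workers consume them in parallel. This is a simplified sketch, not a distributed queue; `run_encoders` and the `encode` callback are hypothetical, and error handling is left out.

```python
import queue
import threading


def run_encoders(segments, encode, worker_count=3):
    """Encode segments in parallel through a queue and keep the original order."""
    jobs = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = jobs.get()
            if item is None:            # sentinel: no more work for this worker
                return
            index, segment = item
            encoded = encode(segment)
            with lock:
                results[index] = encoded

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for index, segment in enumerate(segments):
        jobs.put((index, segment))      # the producer never waits for an encoder
    for _ in threads:
        jobs.put(None)                  # one sentinel per worker
    for thread in threads:
        thread.join()
    return [results[i] for i in range(len(segments))]


encoded_segments = run_encoders(["seg-0", "seg-1", "seg-2", "seg-3"], str.upper)
print(encoded_segments)
```

Because the queue decouples the stages, the number of encoding workers can be changed without touching the code that produces segments.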
Safety Optimization (Pre-signed upload URL & protect videos)
Pre-signed URL vs signed URL
A pre-signed URL gives you access to the object identified in the URL, provided that the creator of the pre-signed URL has permissions to access that object. That is, if you receive a pre-signed URL to upload an object, you can upload the object only if the creator of the pre-signed URL has the necessary permissions to upload that object (see page References & Glossary for design YouTube).
A signed URL includes additional information, for example, an expiration date and time, that gives you more control over access to your content. This additional information appears in a policy statement, which is based on either a canned policy or a custom policy.
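The general idea behind both kinds of URL (a signature computed by a party that holds a secret, plus an expiration time checked by the server) can be sketched with an HMAC. This is a generic illustration and not the signing scheme of any cloud provider; the secret, the URL and the query parameter names are assumptions.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"   # hypothetical key known only to the server


def _signature(base_url, expires_at, key):
    message = f"{base_url}|{expires_at}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def sign_url(base_url, expires_at, key=SECRET_KEY):
    """Append an expiration timestamp and an HMAC signature to the URL."""
    query = urlencode({"expires": expires_at,
                       "signature": _signature(base_url, expires_at, key)})
    return f"{base_url}?{query}"


def verify_url(url, now=None, key=SECRET_KEY):
    """Accept the URL only if the signature matches and it has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    base_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    expected = _signature(base_url, expires_at, key)
    current = time.time() if now is None else now
    return hmac.compare_digest(signature, expected) and current < expires_at


upload_url = sign_url("https://storage.example.com/uploads/video-123", expires_at=2000)
print(verify_url(upload_url, now=1000))   # still valid
print(verify_url(upload_url, now=3000))   # expired
```

The client never sees the secret: it only receives the finished URL, so it can upload to that one object until the URL expires.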
Protect videos
Digital Rights Management (DRM) systems : the Content Decryption Module (CDM) receives the metadata from the video file. The CDM creates a license request using the header metadata and sends it to the remote license server. Once the license server receives the request, it returns a detailed license with the content keys. The CDM then decrypts the content using the content keys, and the video content becomes available to the user for playback. The Encrypted Media Extensions (EME) API securely handles these license requests and the license information, which are not accessible to the user. (PlayReady - Microsoft; Widevine - Google; FairPlay - Apple).
AES encryption : AES is a publicly available encryption algorithm approved by the NSA (National Security Agency). AES-128 uses a known, external piece of information, called a key, to uniquely change the source data. It supports on-demand, live or DVR streaming. The encryption key needed to encrypt the videos can be created using OpenSSL.
Visual watermarking : Image overlay on top of video that contains identifying info for the video (can be logo,…).
Cost-Saving Optimization
CDN ensures fast video delivery on a global scale.
The most popular videos are served from the CDN and the other videos from video servers.
Short videos can be encoded on-demand only.
Build your own CDN in partnership with an ISP (Internet Service Provider, e.g. AT&T, …)
Error handling (build a highly fault-tolerant system)
Retry Pattern
Replicas
Master vs Slave Servers
Metadata Cache
API servers are stateless, so if one fails, requests are directed to a different API server.
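As an illustration of the retry pattern listed above, here is a minimal sketch with exponential backoff. The function name, the default values and the choice to retry on any exception are assumptions; a real system would only retry recoverable errors.

```python
import time


def retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached.

    The wait doubles after each failure: base_delay, 2 * base_delay, ...
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise                   # give up and surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical flaky call that fails twice before succeeding.
calls = {"count": 0}


def flaky_upload():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "uploaded"


print(retry(flaky_upload, base_delay=0.01))
```

Passing `sleep` as a parameter keeps the helper easy to test, because the waiting can be replaced by a function that only records the delays.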
Step 4 - Wrap up
Scale the API tier : because API servers are stateless, it’s easy to scale horizontally.
Scale the DB : DB replications and Sharding.
Live streaming : requires uploading, encoding & streaming.
Video takedowns : videos that violate copyrights, … shall be removed.
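For the DB sharding point above, one common option is hash-based sharding on the video ID. The sketch below assumes that approach; the function name, the ID format and the shard count are hypothetical.

```python
import hashlib


def shard_for(video_id, shard_count):
    """Map a video ID to a shard index with a hash that is stable across runs."""
    digest = hashlib.sha256(video_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


for video_id in ("video-1", "video-2", "video-3"):
    print(video_id, "-> shard", shard_for(video_id, shard_count=4))
```

A plain modulo means most keys move when the shard count changes; consistent hashing is a usual alternative when shards are added often.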