Design a video sharing service like YouTube where users can upload, view, and search videos.
Requirements
Functional Requirements
Users should be able to upload videos
Users should be able to share/view videos
Users should be able to search videos
System should be able to record video stats (likes, etc.)
Users should be able to add/view comments
Non-Functional Requirements
Low latency : Users should be able to view videos without much lag
Consistency : System should provide same videos to users in all devices
Availability : System must be highly available
Reliability : System should not lose data
Out of Scope
Video recommendation
Channel subscription
Watch later
Capacity Management
Users estimate
Assume total users : 1.5 billion
Daily active users : 800 M
On average, a user views 5 videos per day
Total video views per second = 800M * 5 / 86400 sec => 46K views/sec
Upload estimate
Assume an upload:view ratio of 1:200
For every video upload, we have 200 video views
Total video uploads per second = 46K / 200 => 230 videos/sec
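The traffic arithmetic above can be checked with a short script. This is only a sketch using the assumed figures from these notes; the variable names are made up for illustration.

```python
# Back-of-the-envelope traffic estimate using the figures assumed in these notes.
SECONDS_PER_DAY = 86_400

daily_active_users = 800_000_000   # 800 M daily active users (assumption from the notes)
views_per_user_per_day = 5         # average views per user per day (assumption)
views_per_upload = 200             # upload:view ratio of 1:200 (assumption)

views_per_sec = daily_active_users * views_per_user_per_day / SECONDS_PER_DAY
uploads_per_sec = views_per_sec / views_per_upload

print(f"views/sec   ~ {views_per_sec:,.0f}")    # about 46K
print(f"uploads/sec ~ {uploads_per_sec:,.0f}")  # about 230
```

The unrounded result is slightly above the 230 uploads/sec quoted above, because the notes round the view rate to 46K before dividing.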
Storage estimate
Assume 500 hours of video are uploaded every minute
On average, 1 min of video needs 50 MB of storage
Total storage = 500 hours * 60 min * 50 MB => 1500 GB/min (25 GB/sec)
This estimate does not take replication and compression into account
Bandwidth estimate
With 500 hours of video uploaded per minute
Assume each minute of uploaded video takes 10 MB of bandwidth
Total incoming : 500 hours * 60 min * 10 MB => 300 GB/min (5 GB/sec)
Assuming an upload:view ratio of 1:200, we would need 5 GB/sec * 200 => 1 TB/sec of outgoing bandwidth
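The storage and bandwidth numbers can be reproduced the same way. A minimal sketch, assuming decimal units (1 GB = 1000 MB, 1 TB = 1000 GB) and the per-minute sizes stated in these notes; the variable names are illustrative.

```python
# Storage and bandwidth estimate; decimal units are an assumption of this sketch.
MB_PER_GB = 1_000
GB_PER_TB = 1_000

video_minutes_uploaded_per_min = 500 * 60   # 500 hours of video per wall-clock minute
storage_mb_per_video_min = 50               # assumed storage per minute of video
upload_mb_per_video_min = 10                # assumed upload bandwidth per minute of video
views_per_upload = 200                      # upload:view ratio of 1:200

storage_gb_per_min = video_minutes_uploaded_per_min * storage_mb_per_video_min / MB_PER_GB
storage_gb_per_sec = storage_gb_per_min / 60

incoming_gb_per_min = video_minutes_uploaded_per_min * upload_mb_per_video_min / MB_PER_GB
incoming_gb_per_sec = incoming_gb_per_min / 60
outgoing_tb_per_sec = incoming_gb_per_sec * views_per_upload / GB_PER_TB

print(storage_gb_per_min, storage_gb_per_sec)      # 1500.0 25.0
print(incoming_gb_per_min, incoming_gb_per_sec)    # 300.0 5.0
print(outgoing_tb_per_sec)                         # 1.0
```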
System API
We can have SOAP or REST APIs to expose the functionality of our service. The following could be
the definitions of the APIs for uploading, searching, and streaming videos:
Upload video API
api_dev_key (string): The API developer key of a registered account. This will be used to, among other
things, throttle users based on their allocated quota.
video_title (string): Title of the video.
video_description (string): Optional description of the video.
tags (string[]): Optional tags for the video.
category_id (string): Category of the video, e.g., Film, Song, People, etc.
default_language (string): For example English, Mandarin, Hindi, etc.
recording_details (string): Location where the video was recorded.
video_contents (stream): Video to be uploaded.
Returns: (string)
A successful upload will return HTTP 202 (request accepted) and once the video encoding is completed
the user is notified through email with a link to access the video. We can also expose a queryable API
to let users know the current status of their uploaded video.
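The upload call described above could be sketched as follows. The names upload_video, get_upload_status, UploadRequest, and the in-memory status store are hypothetical; the notes only fix the parameter list and the HTTP 202 response. Encoding is assumed to happen asynchronously after the call returns.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class UploadRequest:
    """Parameters of the upload call, as listed in the notes."""
    api_dev_key: str
    video_title: str
    video_contents: bytes                 # stands in for the upload stream
    video_description: str = ""
    tags: List[str] = field(default_factory=list)
    category_id: str = ""
    default_language: str = "English"
    recording_details: str = ""


# Hypothetical in-memory status store; a real service would persist this.
upload_status: Dict[str, str] = {}


def upload_video(req: UploadRequest) -> Tuple[int, str]:
    """Accept an upload and return (HTTP status code, upload id)."""
    if not req.api_dev_key or not req.video_title:
        return 400, ""                    # reject requests missing required fields
    upload_id = uuid.uuid4().hex
    upload_status[upload_id] = "PROCESSING"   # encoding is done later, off the request path
    return 202, upload_id                 # 202: accepted, not yet processed


def get_upload_status(upload_id: str) -> str:
    """Queryable status endpoint for an uploaded video."""
    return upload_status.get(upload_id, "UNKNOWN")
```

A client would call upload_video once, then poll get_upload_status with the returned id until processing finishes.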
Search video API
api_dev_key (string): The API developer key of a registered account of our service.
search_query (string): A string containing the search terms.
user_location (string): Optional location of the user performing the search.
maximum_videos_to_return (number): Maximum number of results returned in one request.
page_token (string): This token will specify a page in the result set that should be returned.
Returns: (JSON)
A JSON containing information about the list of video resources matching the search query. Each video
resource will have a video title, a thumbnail, a video creation date, and a view count.
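The search call and its paging behaviour could look like the sketch below. It is only an illustration: the in-memory CATALOG, the function name search_video, the title-substring matching, and the choice to encode page_token as a stringified result offset are all assumptions, not part of the notes.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory catalog standing in for a real search index.
CATALOG: List[Dict] = [
    {"video_id": "v1", "title": "Funny cat compilation", "thumbnail": "v1.jpg",
     "created": "2020-01-01", "view_count": 120},
    {"video_id": "v2", "title": "Cat learns piano", "thumbnail": "v2.jpg",
     "created": "2020-02-01", "view_count": 45},
    {"video_id": "v3", "title": "Dog training basics", "thumbnail": "v3.jpg",
     "created": "2020-03-01", "view_count": 80},
    {"video_id": "v4", "title": "Sleepy cat", "thumbnail": "v4.jpg",
     "created": "2020-04-01", "view_count": 5},
]


def search_video(api_dev_key: str,
                 search_query: str,
                 user_location: Optional[str] = None,   # ignored in this sketch
                 maximum_videos_to_return: int = 10,
                 page_token: Optional[str] = None) -> Dict:
    """Return one page of matching video resources plus a token for the next page."""
    start = int(page_token) if page_token else 0
    terms = search_query.lower().split()
    # A video matches when every search term appears in its title.
    matches = [v for v in CATALOG if all(t in v["title"].lower() for t in terms)]
    page = matches[start:start + maximum_videos_to_return]
    next_start = start + len(page)
    next_token = str(next_start) if next_start < len(matches) else None
    return {"videos": page, "next_page_token": next_token}
```

Passing the returned next_page_token back in the next request walks through the result set; a token of None means there are no more pages.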
Stream video API
api_dev_key (string): The API developer key of a registered account of our service.
video_id (string): A string to identify the video.
offset (number): We should be able to stream video from any offset; this offset would be a time in seconds from the beginning of the video. If we support playing/pausing a video from multiple devices, we will need to store the offset on the server. This will enable users to start watching a video on any
device from the same point where they left off.
codec (string) & resolution (string): We should send the codec and resolution info in the API from the
client to support play/pause from multiple devices. Imagine you are watching a video on your TV’s
Netflix app, paused it, and started watching it on your phone’s Netflix app. In this case, you would need
codec and resolution, as both these devices have a different resolution and use a different codec.
Returns: (STREAM)
A media stream (a video chunk) from the given offset
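Serving a chunk from a given offset could be sketched as below. This is a toy model under stated assumptions: each (video_id, codec, resolution) rendition is stored as one byte string with a constant bitrate, so a time offset maps directly to a byte position. The names stream_video, RENDITIONS, and CHUNK_SECONDS are hypothetical.

```python
from typing import Dict, Tuple

CHUNK_SECONDS = 2   # hypothetical duration of one returned chunk

# Hypothetical store: (video_id, codec, resolution) -> (bytes per second, encoded bytes).
# Constant bitrate is assumed so that seconds convert to byte positions directly.
RENDITIONS: Dict[Tuple[str, str, str], Tuple[int, bytes]] = {
    ("v1", "h264", "720p"): (4, bytes(range(40))),   # 10 seconds of toy data
}


def stream_video(api_dev_key: str, video_id: str, offset: int,
                 codec: str, resolution: str) -> bytes:
    """Return the chunk that starts `offset` seconds into the requested rendition."""
    bytes_per_second, data = RENDITIONS[(video_id, codec, resolution)]
    start = int(offset) * bytes_per_second
    end = start + CHUNK_SECONDS * bytes_per_second
    return data[start:end]   # empty when the offset is at or past the end


chunk = stream_video("key", "v1", offset=3, codec="h264", resolution="720p")
print(len(chunk))   # 8 bytes: 2 seconds at 4 bytes/sec
```

Resuming on another device would call the same function with the server-stored offset and that device's codec and resolution.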
High Level Design
Processing Queue: Each uploaded video will be pushed to a processing queue to be de-queued
later for encoding, thumbnail generation, and storage.
Encoder: To encode each uploaded video into multiple formats.
Thumbnails generator: To generate a few thumbnails for each video.
Video and Thumbnail storage: To store video and thumbnail files in some distributed file
storage.
User Database: To store user’s information, e.g., name, email, address, etc.
Video metadata storage: A metadata database to store all the information about videos like
title, file path in the system, uploading user, total views, likes, dislikes, etc. It will also be used
to store all the video comments.
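How these components fit together can be sketched in a few lines. This is an illustrative in-process model only: the dictionaries stand in for distributed file storage and the metadata database, the encode and make_thumbnails functions are placeholders, and the format list and thumbnail count are assumed values.

```python
import queue
from typing import Dict, List

FORMATS = ["240p", "720p", "1080p"]   # hypothetical target formats
THUMBNAILS_PER_VIDEO = 3              # hypothetical thumbnail count

processing_queue: "queue.Queue[Dict]" = queue.Queue()
file_storage: Dict[str, bytes] = {}    # stands in for distributed file storage
video_metadata: Dict[str, Dict] = {}   # stands in for the metadata database


def enqueue_upload(video_id: str, uploader: str, title: str, raw: bytes) -> None:
    """Push an uploaded video onto the processing queue."""
    processing_queue.put({"video_id": video_id, "uploader": uploader,
                          "title": title, "raw": raw})


def encode(raw: bytes, fmt: str) -> bytes:
    """Placeholder encoder: tags the raw bytes with the target format."""
    return fmt.encode() + b":" + raw


def make_thumbnails(raw: bytes) -> List[bytes]:
    """Placeholder thumbnail generator: returns a few dummy images."""
    return [b"thumb" + bytes([i]) for i in range(THUMBNAILS_PER_VIDEO)]


def process_next() -> None:
    """Dequeue one upload, encode it, generate thumbnails, and record metadata."""
    job = processing_queue.get()
    video_paths = {}
    for fmt in FORMATS:
        path = f"videos/{job['video_id']}/{fmt}"
        file_storage[path] = encode(job["raw"], fmt)
        video_paths[fmt] = path
    thumbnail_paths = []
    for i, thumb in enumerate(make_thumbnails(job["raw"])):
        path = f"thumbnails/{job['video_id']}/{i}.jpg"
        file_storage[path] = thumb
        thumbnail_paths.append(path)
    video_metadata[job["video_id"]] = {
        "title": job["title"], "uploader": job["uploader"],
        "file_paths": video_paths, "thumbnails": thumbnail_paths,
        "views": 0, "likes": 0, "dislikes": 0, "comments": [],
    }
    processing_queue.task_done()
```

In a real deployment the queue would be a durable message broker and process_next would run on a pool of worker machines, so uploads return quickly while encoding proceeds in the background.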
Cache for hot users or influencers
Apply the 80-20 rule
It means that roughly 20% of the videos account for 80% of the daily view traffic: certain videos (hot users/trending/influencers) are so popular that the majority of people view them. This suggests that we can try to cache 20% of the daily view volume of videos and metadata.
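One way to hold the hot 20% is a fixed-size cache that evicts the least recently used entry. The notes do not name an eviction policy, so LRU is an assumption here, as are the class and function names.

```python
from collections import OrderedDict
from typing import Any, Optional


def cache_capacity(daily_distinct_videos: int, hot_fraction: float = 0.2) -> int:
    """Size the cache to hold the hot fraction of daily viewed videos (80-20 rule)."""
    return max(1, int(daily_distinct_videos * hot_fraction))


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        if key not in self._items:
            return None                      # cache miss: caller falls back to storage
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(cache_capacity(daily_distinct_videos=10))   # capacity 2
cache.put("video:a", {"title": "A"})
cache.put("video:b", {"title": "B"})
cache.get("video:a")                  # touch a, so b becomes least recently used
cache.put("video:c", {"title": "C"})  # evicts b
print(cache.get("video:b"))           # None
```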