System Design Mastery
Real-World Systems: Practical System Design

Design YouTube/Netflix

Duration: 90-120 minutes
Level: Advanced
Focus: System Design Case
001 Why This System

Why Is Video Streaming Different?

About 500 hours of video are uploaded to YouTube every minute, and Netflix alone accounts for roughly 15% of internet traffic. The biggest challenges in both systems are video storage, processing, and delivery, which makes them very different from a regular web app.

📌 Key Insight

Video streaming = massive storage + video transcoding + adaptive bitrate streaming + global CDN. A single 1-hour video has to be encoded into 10+ different formats/qualities.

📤 Upload Challenge

Creator uploads → raw storage → Kafka trigger → transcoding workers → multiple quality versions → push to CDN.

▶️ Stream Challenge

Viewer requests → API returns the CDN URL → viewer streams directly from the nearest CDN edge node. The API server never transmits video bits.
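The stream path can be sketched as a small handler that returns only a URL. This is an illustrative sketch, not the actual YouTube or Netflix API: `CDN_BASE_URL`, `VIDEO_METADATA`, and `get_playback_url` are hypothetical names, and the in-memory dict stands in for the metadata store.

```python
# Hypothetical CDN host and in-memory metadata, for illustration only.
CDN_BASE_URL = "https://cdn.example.com"
VIDEO_METADATA = {
    "video123": {"title": "Demo", "manifest_path": "video123/master.m3u8"},
}


def get_playback_url(video_id: str) -> dict:
    """Return metadata plus a CDN URL; the API never touches video bytes."""
    meta = VIDEO_METADATA.get(video_id)
    if meta is None:
        raise KeyError(f"unknown video: {video_id}")
    return {
        "title": meta["title"],
        "playback_url": f"{CDN_BASE_URL}/{meta['manifest_path']}",
    }


response = get_playback_url("video123")
print(response["playback_url"])
```

The response is a few hundred bytes of JSON-like data, which is why the API tier can stay small while the CDN carries the heavy traffic.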

002 Requirements

What Are the Features?

✅ Functional Requirements

  • Upload videos
  • Stream (watch) videos
  • Search videos
  • Like, comment, subscribe
  • Video recommendations
  • Resume playback from where the viewer left off
  • Multiple qualities (360p/720p/1080p/4K)

⚡ Non-Functional Requirements

  • 2 billion+ monthly active users
  • Video starts in < 2 seconds
  • No buffering (smooth playback)
  • 99.9% availability
  • Uploaded video available within 1 min
  • Support any device (mobile/TV/web)
003 Back-of-Envelope Estimation

YouTube's Numbers

500 hrs: video uploaded per minute
2B: monthly active users
1B hrs: watched daily
10x: storage multiplier (transcoding)
~500 MB: per minute of video (1080p)
1 EB+: total video storage

🔢 Storage Calculation

500 hrs/min × 60 min = 30,000 hours of video uploaded every hour. Each hour of video ≈ 1 GB (compressed) × 10 formats = 10 GB per hour of video. Daily: 30,000 × 24 × 10 GB = 7.2 petabytes/day. At this scale, dedicated storage infrastructure is required.
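The same arithmetic can be checked in a few lines. This is only a restatement of the estimate above; the constant names are made up for the example, and the inputs are the rough figures quoted in the text, not measured values.

```python
# All inputs are the rough figures quoted in the text, not measured values.
UPLOAD_HOURS_PER_MINUTE = 500
GB_PER_VIDEO_HOUR = 1   # one compressed rendition
RENDITIONS = 10         # storage multiplier from transcoding

hours_uploaded_per_hour = UPLOAD_HOURS_PER_MINUTE * 60   # 30,000
hours_uploaded_per_day = hours_uploaded_per_hour * 24    # 720,000
gb_per_day = hours_uploaded_per_day * GB_PER_VIDEO_HOUR * RENDITIONS
pb_per_day = gb_per_day / 1_000_000                      # decimal petabytes

print(f"{pb_per_day:.1f} PB/day")
```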

004 High Level Architecture

YouTube's Two Flows

YouTube has two completely separate paths: the upload path and the stream path.

YouTube Architecture: Upload & Stream Flows

UPLOAD FLOW: Creator (uploads) → Upload Service → Raw Blob Store → Kafka (queue) → Transcoding Workers (360p/720p/1080p/4K) → Processed S3 Storage → Global CDN (CloudFront / Akamai, 150+ edge locations)
STREAM FLOW: Viewer (watches) → API Server (get CDN URL, backed by Metadata in MySQL + Redis) → direct CDN stream
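The upload flow can be sketched with in-process stand-ins. This is a simplified illustration under stated assumptions: a dict replaces the raw blob store, Python's `queue.Queue` replaces Kafka, and `handle_upload` and `run_worker_once` are hypothetical names.

```python
import queue

# In-process stand-ins: a dict for the raw blob store, a Queue for Kafka.
raw_blob_store = {}
transcode_queue = queue.Queue()


def handle_upload(video_id: str, data: bytes) -> None:
    """Upload service: persist the raw file, then enqueue a transcode job."""
    raw_blob_store[video_id] = data
    transcode_queue.put(
        {"video_id": video_id, "renditions": ["360p", "720p", "1080p", "4K"]}
    )


def run_worker_once() -> list:
    """Transcoding worker: take one job, return the object keys it would write."""
    job = transcode_queue.get()
    # Real transcoding (for example with FFmpeg) would happen here.
    keys = [f"videos/{job['video_id']}/{name}.mp4" for name in job["renditions"]]
    transcode_queue.task_done()
    return keys


handle_upload("video123", b"raw-bytes")
processed_keys = run_worker_once()
print(processed_keys)
```

The upload call returns as soon as the job is queued, so the creator does not wait for transcoding to finish.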
005 Deep Dive

Video Processing Pipeline

🎬 Transcoding: The Most Important Step

When a raw video is uploaded, it has to be converted into many formats. This is called transcoding.

Output Format | Resolution | Bitrate | Use Case
360p | 640×360 | ~400 Kbps | Slow mobile/2G
480p | 854×480 | ~700 Kbps | Mobile 3G
720p HD | 1280×720 | ~2.5 Mbps | Normal streaming
1080p FHD | 1920×1080 | ~5 Mbps | Good connection
4K UHD | 3840×2160 | ~20 Mbps | Netflix/Fast internet

💡 Adaptive Bitrate Streaming (ABR)

The player automatically switches quality based on network speed. Netflix and YouTube use HLS (HTTP Live Streaming) or the DASH protocol. The video is split into small chunks (2-10 sec), and the player picks a quality for each chunk: slow network → 360p, fast network → 1080p.
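A minimal sketch of the selection step, assuming a simple throughput-based rule: pick the highest rendition whose bitrate fits within a safety margin of the measured speed. Real players use more elaborate heuristics (buffer level, throughput history); `SAFETY_FACTOR` and `choose_rendition` are assumptions made for this example, and the bitrates are the approximate values from the table above.

```python
# Rendition bitrates in kbps, taken from the approximate table values.
RENDITIONS_KBPS = {"360p": 400, "480p": 700, "720p": 2500, "1080p": 5000, "4K": 20000}
SAFETY_FACTOR = 0.8  # assumed headroom so small throughput dips do not stall playback


def choose_rendition(measured_kbps: float) -> str:
    """Pick the highest rendition whose bitrate fits within the safe budget."""
    budget = measured_kbps * SAFETY_FACTOR
    fitting = [name for name, kbps in RENDITIONS_KBPS.items() if kbps <= budget]
    if not fitting:
        # Nothing fits: fall back to the lowest quality rather than stalling.
        return min(RENDITIONS_KBPS, key=RENDITIONS_KBPS.get)
    return max(fitting, key=RENDITIONS_KBPS.get)


for speed in (300, 1000, 4000, 30000):
    print(speed, "kbps ->", choose_rendition(speed))
```

Because the decision is made per chunk, the player can move up or down the ladder every few seconds without restarting playback.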

transcoding_worker.py
import subprocess

import boto3


def transcode_video(raw_video_path: str, video_id: str) -> list[str]:
    """Transcode a raw video into each quality and upload the results to S3."""
    # Output renditions: name, frame size, and target video bitrate.
    qualities = [
        {"name": "360p",  "width": 640,  "height": 360,  "bitrate": "400k"},
        {"name": "720p",  "width": 1280, "height": 720,  "bitrate": "2500k"},
        {"name": "1080p", "width": 1920, "height": 1080, "bitrate": "5000k"},
    ]
    s3 = boto3.client("s3")
    output_files = []

    for q in qualities:
        output_path = f"/tmp/{video_id}_{q['name']}.mp4"
        # Transcode with FFmpeg. "-y" overwrites a stale output file, and
        # check=True raises CalledProcessError if FFmpeg exits with an error,
        # so a failed transcode is never uploaded.
        subprocess.run(
            [
                "ffmpeg", "-y", "-i", raw_video_path,
                "-vf", f"scale={q['width']}:{q['height']}",
                "-b:v", q["bitrate"],
                "-c:v", "libx264", "-c:a", "aac",
                output_path,
            ],
            check=True,
        )
        # Upload the rendition to S3.
        s3_key = f"videos/{video_id}/{q['name']}.mp4"
        s3.upload_file(output_path, "my-video-bucket", s3_key)
        output_files.append(s3_key)

    return output_files  # these keys are then pushed to the CDN

📊 Video Chunking: HLS Protocol

master.m3u8 (Playlist file)
#EXTM3U
#EXT-X-VERSION:3
# Master playlist: the player reads this to choose a quality

# 360p stream
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=640x360
https://cdn.youtube.com/video123/360p/playlist.m3u8

# 720p stream
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
https://cdn.youtube.com/video123/720p/playlist.m3u8

# 1080p stream
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
https://cdn.youtube.com/video123/1080p/playlist.m3u8

# The player automatically selects the best quality for the available bandwidth
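A master playlist like the one above can also be generated from the list of renditions once transcoding finishes. The sketch below is one possible way to do it; `build_master_playlist` is a hypothetical helper, and the URL layout simply mirrors the example playlist.

```python
def build_master_playlist(base_url: str, video_id: str, variants: list) -> str:
    """Render an HLS master playlist with one entry per variant stream."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for v in variants:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={v['bandwidth']},"
            f"RESOLUTION={v['width']}x{v['height']}"
        )
        lines.append(f"{base_url}/{video_id}/{v['name']}/playlist.m3u8")
    return "\n".join(lines) + "\n"


variants = [
    {"name": "360p", "bandwidth": 400_000, "width": 640, "height": 360},
    {"name": "720p", "bandwidth": 2_500_000, "width": 1280, "height": 720},
    {"name": "1080p", "bandwidth": 5_000_000, "width": 1920, "height": 1080},
]
playlist = build_master_playlist("https://cdn.youtube.com", "video123", variants)
print(playlist)
```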
006 ABR & CDN Strategy

Adaptive Bitrate & CDN: Global Delivery

YouTube's streaming architecture has two secret weapons: ABR (buffer-free playback) and CDN (fast global delivery).

ABR Streaming — How HLS/DASH Works

Video File → Segment into 4s Chunks → m3u8 Playlist → CDN Edge → Player Auto-selects Quality
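The segmenting step produces a per-quality media playlist that lists every chunk. A minimal sketch, assuming fixed 4-second segments and a made-up file naming scheme (`segment_00000.ts`); `build_media_playlist` is a hypothetical helper, not part of any library.

```python
import math


def build_media_playlist(duration_s: float, segment_s: int = 4) -> str:
    """Render a VOD media playlist listing fixed-length segments."""
    count = math.ceil(duration_s / segment_s)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{segment_s}",
        "#EXT-X-MEDIA-SEQUENCE:0",
        "#EXT-X-PLAYLIST-TYPE:VOD",
    ]
    for i in range(count):
        # The final segment may be shorter than segment_s.
        length = min(segment_s, duration_s - i * segment_s)
        lines.append(f"#EXTINF:{length:.3f},")
        lines.append(f"segment_{i:05d}.ts")  # hypothetical segment file name
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


media_playlist = build_media_playlist(10)
print(media_playlist)
```

A 10-second clip yields three entries: two full 4-second segments and one 2-second remainder.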

🐌 Slow Network (≤ 1 Mbps)

Player switches to 360p. No buffering. Smaller chunks (in bytes).

🚶 Normal Network (5 Mbps)

Player uses 720p. Balanced quality/buffer.

🚀 Fast Network (20+ Mbps)

Player plays 1080p / 4K. Full quality.

Protocol | Developer | Chunk Size | Supported By | Best For
HLS | Apple | 2-10 sec | iOS, Safari, all | VOD + Live streaming
DASH | MPEG (ISO) | 2-4 sec | Android, Chrome | Adaptive VOD
RTMP | Adobe | Real-time | Flash (legacy) | Live ingest only
WebRTC | W3C/IETF | Sub-second | All browsers | Ultra-low latency

💡 Netflix Open Connect

Netflix Open Connect Appliance (OCA): Netflix builds its own hardware boxes and gives them to ISPs free of charge. Netflix content is cached inside the ISP's network, which lowers bandwidth cost and raises speed. A user in Bangladesh gets video from an edge in India or Singapore rather than from an origin server in the US, and latency can drop from about 200 ms to about 10 ms.
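The effect of serving from a nearby edge can be illustrated with a deliberately simplified selection rule: pick the lowest-latency edge and fall back to the origin only when no edge is available. Real CDNs steer clients with DNS, anycast, or a control-plane service rather than client-side measurements; `pick_edge`, the node names, and the latency numbers are all hypothetical.

```python
def pick_edge(latencies_ms: dict, origin: str = "origin-us") -> str:
    """Return the lowest-latency edge, or the origin if no edge is reachable."""
    edges = {name: ms for name, ms in latencies_ms.items() if name != origin}
    if not edges:
        return origin
    return min(edges, key=edges.get)


# Hypothetical measurements for a viewer in Bangladesh.
measured = {"origin-us": 200.0, "edge-singapore": 35.0, "edge-india": 10.0}
print(pick_edge(measured))
```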

007 Database Design

Where Do We Store the Data?

Data | Database | Why?
Video metadata (title, desc) | MySQL + Redis cache | Structured, relational, cacheable
Raw video files | Amazon S3 / Google Cloud Storage | Object storage, terabytes-scale
Processed videos | S3 → CDN (CloudFront) | Edge delivery globally
User watch history | Cassandra | Time-series, massive scale
Video search index | Elasticsearch | Full-text search, tags
Recommendations data | Neo4j / ML model store | Graph-based recommendations
View counts, likes | Redis (counters) | Fast atomic increments

⚠️ Never Store Videos in DB

Never store video binaries in a database. Databases are slow at serving large binary objects. Always use object storage (S3) + CDN; the DB holds only metadata.
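A minimal sketch of what "metadata only" looks like in practice: the row stores a pointer (the object key) rather than the video bytes. SQLite from the standard library is used here purely as a stand-in for MySQL, and the table and column names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL metadata store
conn.execute(
    """
    CREATE TABLE videos (
        video_id   TEXT PRIMARY KEY,
        title      TEXT NOT NULL,
        object_key TEXT NOT NULL  -- pointer into object storage, not the bytes
    )
    """
)
conn.execute(
    "INSERT INTO videos VALUES (?, ?, ?)",
    ("video123", "Demo video", "videos/video123/master.m3u8"),
)
conn.commit()

row = conn.execute(
    "SELECT title, object_key FROM videos WHERE video_id = ?", ("video123",)
).fetchone()
print(row)
```

The row stays a few hundred bytes regardless of how large the video is, which keeps metadata queries and caching cheap.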

008 Scaling Decisions

Ways to Scale

Strategy

CDN is Everything: more than 95% of video traffic is served from the CDN, so the origin servers carry very little load. Netflix Open Connect places Netflix's own CDN boxes inside ISP networks.

Strategy

Parallel Transcoding: split a video into segments and transcode them on parallel workers. A 1-hour video takes one worker about 1 hour; split across 60 workers it takes about 1 minute, ignoring split and merge overhead.
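The split-and-fan-out idea can be sketched as follows. This is an illustration under assumptions: `transcode_segment` is a placeholder that only returns a file name, and a thread pool on one machine stands in for a fleet of separate worker machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(duration_s: int, segment_s: int) -> list:
    """Return (start, end) second ranges covering the whole video."""
    return [
        (start, min(start + segment_s, duration_s))
        for start in range(0, duration_s, segment_s)
    ]


def transcode_segment(segment: tuple) -> str:
    """Placeholder for a real FFmpeg call on one segment."""
    start, end = segment
    return f"segment_{start:05d}_{end:05d}.mp4"


segments = split_into_segments(duration_s=3600, segment_s=60)
with ThreadPoolExecutor(max_workers=60) as pool:
    # map() preserves input order, so outputs can be concatenated directly.
    outputs = list(pool.map(transcode_segment, segments))

print(len(outputs), outputs[0], outputs[-1])
```

Keeping the outputs in input order matters because the segments must be stitched back together in the original sequence.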

Strategy

Pre-warm CDN: as soon as a video is detected as popular, push it to CDN edge locations so it is ready before it goes viral.

Trade-off

Storage Cost: 10x storage multiplication (multiple qualities). Storage is cheap per gigabyte, but at 1 exabyte scale it becomes expensive. Use cold storage (Glacier) for old videos.
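A tiering rule for cold storage might look like the sketch below. The one-year threshold (`COLD_AFTER_DAYS`) and the `storage_tier` helper are assumptions for illustration; a real policy would be tuned from actual access patterns and storage pricing.

```python
from datetime import date, timedelta

COLD_AFTER_DAYS = 365  # assumed threshold; tune it from real access patterns


def storage_tier(last_viewed: date, today: date) -> str:
    """Return 'cold' for videos not watched within the threshold, else 'hot'."""
    idle = today - last_viewed
    return "cold" if idle > timedelta(days=COLD_AFTER_DAYS) else "hot"


today = date(2024, 1, 1)
print(storage_tier(date(2023, 12, 1), today))
print(storage_tier(date(2021, 6, 1), today))
```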

Trade-off

Transcoding Time: transcoding 4K video takes time. Not every quality is available the moment the upload finishes; lower qualities become available first, higher qualities later.
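One simple way to get "lower quality first" is to order the transcode jobs by pixel count and publish each rendition as it completes. This is a sketch of that ordering only; `transcode_order` is a hypothetical helper and the transcoding itself is omitted.

```python
RENDITIONS = [
    {"name": "1080p", "width": 1920, "height": 1080},
    {"name": "360p", "width": 640, "height": 360},
    {"name": "4K", "width": 3840, "height": 2160},
    {"name": "720p", "width": 1280, "height": 720},
]


def transcode_order(renditions: list) -> list:
    """Order jobs by pixel count so the cheapest rendition finishes first."""
    ordered = sorted(renditions, key=lambda r: r["width"] * r["height"])
    return [r["name"] for r in ordered]


published = []
for name in transcode_order(RENDITIONS):
    # A real worker would transcode here, then mark the rendition playable.
    published.append(name)
    print("available so far:", published)
```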

009 Full Tech Stack

YouTube/Netflix Technologies

Backend & Processing

Python / Java, FFmpeg (Transcoding), Kubernetes, Apache Kafka

Storage

Amazon S3 (Video Files), MySQL (Metadata), Redis (Cache + Counters), Cassandra (Watch History), Elasticsearch (Search)

Delivery & Monitoring

CloudFront CDN, HLS / MPEG-DASH, Prometheus + Grafana, Apache Spark (Recommendations)
010 Recommendations & Interview Tips

Recommendations & Interview Tips

🤖 Recommendation System

Recommendations are ML-based. User watch history (Cassandra) + likes + search queries feed collaborative filtering. A graph DB (Neo4j) is used to find similar users, and Apache Spark handles the large-scale batch processing.
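A toy version of user-based collaborative filtering, assuming watch histories are plain sets of video ids and similarity is Jaccard overlap. This is a teaching sketch, not the production YouTube algorithm; the function names, user names, and video ids are made up.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two watch histories, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user: str, histories: dict, top_n: int = 3) -> list:
    """Score unseen videos by the similarity of the users who watched them."""
    seen = histories[user]
    scores = {}
    for other, watched in histories.items():
        if other == user:
            continue
        sim = jaccard(seen, watched)
        if sim == 0.0:
            continue  # users with nothing in common contribute no signal
        for video in watched - seen:
            scores[video] = scores.get(video, 0.0) + sim
    # Sort by score (descending), then by id for a stable order.
    ranked = sorted(scores, key=lambda v: (-scores[v], v))
    return ranked[:top_n]


# Toy data; user and video ids are made up.
histories = {
    "alice": {"v1", "v2", "v3"},
    "bob": {"v1", "v2", "v4"},
    "carol": {"v5"},
}
print(recommend("alice", histories))
```

Here bob shares two videos with alice, so his unseen video `v4` is recommended; carol shares nothing, so `v5` is not.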

✅ Interview Tips

  • Separate the upload path from the stream path
  • Mention a CDN-first strategy
  • Describe async transcoding with Kafka
  • Never use a DB for video binaries; say S3
  • Be able to explain ABR / HLS

🎯 Interview Must-Know: Viral Video Handling

A viral video (suddenly 10M views in 1 hour) is absorbed by the CDN. The video is cached on CDN edge servers worldwide, so those viewers are served from the CDN and the origin server sees almost zero load. This is why YouTube does not crash even on viral videos.
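Why the origin stays idle can be shown with a toy edge cache that fetches each object from the origin at most once. `EdgeCache` and `Origin` are hypothetical classes for this illustration; a real CDN also handles expiry, eviction, and many edge nodes.

```python
class Origin:
    """Counts how many requests actually reach the origin server."""

    def __init__(self):
        self.requests = 0

    def fetch(self, key: str) -> bytes:
        self.requests += 1
        return b"segment-bytes-for-" + key.encode()


class EdgeCache:
    """A toy edge node that fetches each object from the origin at most once."""

    def __init__(self, origin: Origin):
        self._origin = origin
        self._store = {}

    def get(self, key: str) -> bytes:
        if key not in self._store:  # cache miss: go to the origin once
            self._store[key] = self._origin.fetch(key)
        return self._store[key]


origin = Origin()
edge = EdgeCache(origin)
for _ in range(100_000):  # many viewers requesting the same segment
    edge.get("video123/720p/segment_00000.ts")

print(origin.requests)
```

One hundred thousand viewer requests result in a single origin fetch per edge node, which is the behaviour the viral-video answer relies on.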

011 Lesson Summary

SUMMARY: What We Learned Today

Challenge | Solution | Technology
Video storage at scale | Object storage | Amazon S3
Multiple quality formats | Transcoding pipeline | FFmpeg + Workers
Global fast delivery | Edge CDN | CloudFront/Akamai
Smooth streaming | Adaptive bitrate | HLS / DASH
Async processing | Message queue | Apache Kafka
Watch history | Time-series DB | Cassandra
HLS (HTTP Live Streaming) | 2-10s chunks, m3u8 playlist, ABR | Apple / Industry standard
DASH (MPEG-DASH) | 2-4s segments, ISO standard, codec-agnostic | MPEG / Android / Chrome
RTMP | Real-time ingest for live streams | Adobe (legacy ingest)
Netflix Open Connect | ISP-level CDN appliances | Netflix proprietary CDN
012 Knowledge Check
013 Assignments
014 Practical Lab