System Design Mastery
Real-World Systems: System Design in Practice

Design Twitter/X

Duration: 90-120 minutes
Level: Advanced
Focus: System Design Case
001 Why This System

Why Learn to Design Twitter?

Twitter/X is one of the most complex social media questions asked in system design interviews. Three hard problems have to be solved together: news feed generation, the follower/following graph, and real-time trending.

📌 Core Challenges

Fanout Problem: When a celebrity tweets (Elon Musk: 150M followers), that tweet has to be pushed to the feeds of 150M users. This is Twitter's hardest scaling problem.

TWITTER FANOUT ARCHITECTURE — Overview

[Diagram components: Client (Web/App); API Gateway (Auth + Rate Limit); Tweet Service; User Service; Timeline Service; Fanout Service; Kafka Queue; Tweet DB (Cassandra); Timeline (Redis Cache); Graph DB (Follow graph); Media (S3 + CDN); Search (Elasticsearch)]
002 Requirements

Which Features Are Needed?

✅ Functional Requirements

  • Post a tweet (text, image, video)
  • Follow / unfollow users
  • Home timeline (tweets from the accounts the user follows)
  • User timeline (tweets from a specific user)
  • Like, Retweet, Reply
  • Search tweets and users
  • Trending hashtags
  • Notification system

⚡ Non-Functional Requirements

  • 500M+ daily active users
  • Timeline load < 200ms
  • Tweet post < 500ms
  • High availability (99.99%)
  • Eventual consistency acceptable
  • Tweets never lost (durability)
003 Capacity Estimation

Understanding Twitter's Scale

500M Daily Active Users
200M Tweets/Day
2,300 Tweets/sec
300K Timeline reads/sec
150M Max Followers (celebrity)
100 TB Daily Media Storage

🔢 Fanout Calculation

Each tweet reaches 200 followers on average, so fanout = 200 × 2,300 = 460K writes/sec for timeline updates alone. A single celebrity tweet means 150M × 1 = 150M writes at once! This write burst is the “thundering herd” problem.
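The arithmetic above can be reproduced in a few lines of Python. This is a back-of-envelope sketch only: the function and constant names are illustrative, and the inputs are the figures quoted in this section.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Inputs quoted in this section
TWEETS_PER_DAY = 200_000_000
AVG_FOLLOWERS = 200
CELEBRITY_FOLLOWERS = 150_000_000


def tweets_per_second(tweets_per_day: int) -> float:
    """Average write rate, ignoring daily peaks."""
    return tweets_per_day / SECONDS_PER_DAY


def fanout_writes_per_second(tweet_rate: float, avg_followers: int) -> float:
    """Timeline-cache writes per second if every tweet is pushed to every follower."""
    return tweet_rate * avg_followers


rate = tweets_per_second(TWEETS_PER_DAY)
print(f"tweets/sec        ~ {rate:,.0f}")
# The lesson rounds the rate down to 2,300 before multiplying
print(f"fanout writes/sec ~ {fanout_writes_per_second(2_300, AVG_FOLLOWERS):,.0f}")
print(f"celebrity burst   = {CELEBRITY_FOLLOWERS:,} writes for one tweet")
```

The exact average is about 2,315 tweets/sec; the lesson's 2,300 is that number rounded for easier mental math.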

004 High-Level Architecture

Twitter's Big-Picture Components

Twitter follows a microservices architecture. Each service owns a separate responsibility and communicates with the others.

STEP 01

API Gateway — Entry Point

Handles authentication, rate limiting, and request routing. Every client request enters through here. Nginx + Envoy Proxy are used.

STEP 02

Tweet Service — Content Storage

Creates, reads, and deletes tweets. Stores immutable tweet data in Cassandra and uploads media to S3.

STEP 03

Fanout Service — Distribution Engine

Updates followers' timelines whenever a new tweet is posted. It consumes from a Kafka queue and fans out asynchronously.

STEP 04

Timeline Service — Feed Generation

Serves the user's home timeline. Returns the pre-built timeline from the Redis cache and merges in celebrity tweets.

STEP 05

User Service — Social Graph

Manages follow/unfollow and maintains the follower and following lists. The graph is stored in Redis Sorted Sets.
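Step 01 lists rate limiting as a gateway responsibility but does not say how it is done. One common technique is a token bucket, sketched below in plain Python. The `TokenBucket` class and its parameters are illustrative assumptions, not Twitter's or Envoy's actual implementation.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""
    rate: float
    capacity: float
    tokens: float = field(init=False)
    updated_at: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity          # start with a full bucket
        self.updated_at = time.monotonic()

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True if one request may pass; `now` is injectable for tests."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill in proportion to elapsed time, never above capacity
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate=1.0, capacity=2.0)
bucket.updated_at = 0.0                      # pin the clock for a repeatable demo
results = [bucket.allow(now=t) for t in (0.0, 0.0, 0.0, 1.0, 1.0)]
print(results)
```

A gateway would typically keep one bucket per user or API key, so that a single noisy client is throttled without affecting others.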

005 Deep Dive: Fanout

Fanout Problem: Twitter's Hardest Part

When someone tweets, the tweet appears in the home timeline of all of their followers. There are two basic approaches to achieve this, plus a hybrid of the two.

Approach | How it works | Write Cost | Read Cost | Problem
Fan-out on Write (Push) | As soon as a tweet is posted, every follower's cache is updated | High (150M writes) | Low (cache hit) | Celebrity problem
Fan-out on Read (Pull) | When the timeline loads, tweets from followed accounts are pulled | Low | High (DB joins) | Slow timeline
Hybrid (Twitter's approach) | Normal users: push, Celebrity: pull on read | Balanced | Balanced | Best of both

💡 Twitter's Real Solution

The threshold is roughly 10,000 followers. Below it, Twitter fans out on write (push to all followers). Above it, Twitter fans out on read (pull at read time). This hybrid approach is what runs in production.

FANOUT DECISION FLOW

New Tweet Posted → followers < 10,000?
  YES ✓ → Fan-out on Write: push to all followers' Redis caches
  NO (celebrity) → Kafka Queue: pull at read time only
Timeline: merge regular + celebrity tweets
fanout_service.py
CELEBRITY_THRESHOLD = 10_000  # follower count at which pushing stops

async def fanout_tweet(tweet_id, author_id, db, cache, queue):
    # Save the tweet first so no timeline ever points at a missing tweet
    await db.save_tweet(tweet_id, author_id)

    # Get the author's follower list
    followers = await db.get_followers(author_id)

    # Hybrid fanout logic
    if len(followers) < CELEBRITY_THRESHOLD:
        # Normal user: push to every follower's timeline cache.
        # A pipeline batches the commands into one round trip
        # (assumes a redis-py style client).
        pipe = cache.pipeline()
        for follower_id in followers:
            key = f"timeline:{follower_id}"
            pipe.lpush(key, tweet_id)
            pipe.ltrim(key, 0, 799)  # keep the newest 800 tweets
        pipe.execute()
    else:
        # Celebrity: publish an event to Kafka and skip the timeline caches.
        # At read time, celebrity tweets are fetched separately.
        await queue.publish("celebrity-fanout", {
            "tweet_id": tweet_id,
            "author_id": author_id
        })

📰 Home Timeline — Read Path

timeline_service.py
async def get_home_timeline(user_id, db, cache):
    # Step 1: Read the newest 20 tweet IDs from the user's pre-built timeline cache
    cached_ids = cache.lrange(f"timeline:{user_id}", 0, 19)

    # Load the tweet records for those IDs so they can be sorted by time.
    # NOTE: get_tweets_by_ids is an assumed helper on the db object.
    tweets = list(await db.get_tweets_by_ids(cached_ids))

    # Step 2: Look up which followed accounts are celebrities
    celebrities = cache.smembers(f"following_celebrities:{user_id}")

    # Step 3: Fetch celebrity tweets separately (they were never pushed)
    for celeb_id in celebrities:
        tweets.extend(await db.get_recent_tweets(celeb_id, limit=5))

    # Step 4: Merge and sort by timestamp, newest first
    tweets.sort(key=lambda t: t.created_at, reverse=True)

    return [t.id for t in tweets[:20]]  # IDs of the top 20 tweets
006 Celebrity Problem

Celebrity Problem and Hybrid Model

When Elon Musk (150M followers) posts a single tweet, that one tweet needs 150 million cache writes. Doing this within milliseconds is impossible. Twitter solved the problem with the hybrid approach.

⚠️ Thundering Herd Problem

When a celebrity tweets, millions of users refresh their timelines at the same time. If every one of those requests hit the database, the servers would crash. Solution: treat celebrity tweets separately. Pull them at read time; do not push them at write time.

HYBRID MODEL — Normal User vs Celebrity

Normal User (<10K followers)

1. The tweet is posted

2. The fanout service pushes the tweet_id to the Redis List of every follower

3. Timeline load = an instant read from the Redis cache

✓ Write: High, Read: O(1) fast

Celebrity (≥10K followers)

1. The tweet is posted → pushed to Kafka

2. No timeline cache is updated (150M writes skipped)

3. On timeline load, celebrity tweets are pulled separately from the DB and merged

✓ Write: Low, Read: pull+merge
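The two paths above can be tried end to end with a small in-memory model. This is a sketch under stated assumptions: `HybridFeed` is a hypothetical class, plain Python containers stand in for Redis, Cassandra, and Kafka, and a sequence counter stands in for timestamps.

```python
from collections import defaultdict, deque


class HybridFeed:
    """In-memory stand-in for the tweet store, follow graph, and timeline cache."""

    def __init__(self, celebrity_threshold=10_000, timeline_limit=800):
        self.celebrity_threshold = celebrity_threshold
        self.followers = defaultdict(set)    # author -> follower ids
        self.following = defaultdict(set)    # user -> followed author ids
        self.tweets = defaultdict(list)      # author -> [(seq, tweet_id)], oldest first
        # user -> pushed (seq, tweet_id) entries, newest first, capped like LTRIM
        self.timelines = defaultdict(lambda: deque(maxlen=timeline_limit))
        self.cache_writes = 0
        self._seq = 0

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author, tweet_id):
        self._seq += 1
        entry = (self._seq, tweet_id)
        self.tweets[author].append(entry)        # the tweet is always stored
        if not self.is_celebrity(author):        # push path: fan-out on write
            for follower in self.followers[author]:
                self.timelines[follower].appendleft(entry)
                self.cache_writes += 1

    def home_timeline(self, user, limit=20):
        merged = set(self.timelines[user])       # entries pushed at write time
        for author in self.following[user]:      # pull path: fan-out on read
            if self.is_celebrity(author):
                merged.update(self.tweets[author][-limit:])
        newest_first = sorted(merged, reverse=True)
        return [tweet_id for _, tweet_id in newest_first[:limit]]


feed = HybridFeed(celebrity_threshold=3)  # tiny threshold keeps the demo small
for fan in ("a", "b", "c"):
    feed.follow(fan, "star")              # "star" now counts as a celebrity
feed.follow("a", "bob")
feed.post("bob", "t1")                    # pushed to bob's single follower
feed.post("star", "t2")                   # stored only, no cache writes
print(feed.cache_writes, feed.home_timeline("a"))
```

Using a set for the merge also removes duplicates if an author crosses the threshold after some of their tweets were already pushed.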

⭐ What to Say in the Interview

“Twitter uses hybrid fanout. Below the 10K-follower threshold it fans out on write; above it, it fans out on read. The two are merged in the timeline. This approach reduces write overhead and keeps read latency acceptable. The trade-off is that celebrity tweets may show up with a slight delay.”

007 Database Choices

Multiple Databases: Which One for What?

A single database is not enough for Twitter. Different use cases call for different databases.

Data Type | Database | Why?
Tweets (immutable) | Cassandra | Append-only, time-series, massive scale
User profiles | PostgreSQL | ACID, relational, profile updates
Follow graph | Redis (sorted set) / Neo4j | Graph traversal, follower counts
Home timeline | Redis (list) | Fast reads, pre-computed, in-memory
Search index | Elasticsearch | Full-text search, hashtag search
Media files | Amazon S3 + CloudFront | Object storage, CDN delivery
Analytics | Hadoop/Spark + Hive | Batch processing, trend analysis

⚠️ Tweet Schema Design — Cassandra

In Cassandra, partition tweets by user_id and cluster them by created_at, newest first. This is time-series data: new tweets are written constantly, and old tweets are rarely read. Cassandra's time-ordered clustering is a good fit.

cassandra_schema.cql
-- Tweet table (Cassandra schema)
CREATE TABLE tweets (
    tweet_id    TIMEUUID,
    user_id     UUID,
    content     TEXT,
    media_url   TEXT,
    created_at  TIMESTAMP,
    PRIMARY KEY ((user_id), created_at, tweet_id)
) WITH CLUSTERING ORDER BY (created_at DESC, tweet_id DESC);

-- Counters live in their own table: Cassandra does not allow
-- COUNTER columns next to regular (non-key) columns.
CREATE TABLE tweet_counters (
    tweet_id    TIMEUUID PRIMARY KEY,
    like_count  COUNTER
);

-- Follow graph (Redis Sorted Set)
-- Key: "followers:{user_id}" → member: follower_id, score: follow_timestamp
ZADD followers:user123 1700000000 "follower_id_456"

-- Timeline cache (Redis List)
-- Key: "timeline:{user_id}" → values: tweet_ids (most recent first)
-- LTRIM keeps only the newest 800 tweets
LPUSH timeline:user123 "tweet_id_789"
LTRIM timeline:user123 0 799
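With user_id as the only partition key, a very active author's partition grows without bound. A common Cassandra modeling pattern, which is not part of the schema above and is shown here only as an assumption, is to add a time bucket to the partition key. The helper names below are hypothetical.

```python
from datetime import datetime, timezone
from typing import List, Tuple


def month_bucket(created_at: datetime) -> str:
    """Return a 'YYYY-MM' bucket in UTC; expects a timezone-aware datetime."""
    return created_at.astimezone(timezone.utc).strftime("%Y-%m")


def tweet_partition_key(user_id: str, created_at: datetime) -> Tuple[str, str]:
    """Composite partition key: one partition per author per month."""
    return (user_id, month_bucket(created_at))


def recent_buckets(newest: datetime, count: int) -> List[str]:
    """Buckets to query, newest first, when reading a user timeline."""
    utc = newest.astimezone(timezone.utc)
    year, month = utc.year, utc.month
    buckets = []
    for _ in range(count):
        buckets.append(f"{year:04d}-{month:02d}")
        month -= 1
        if month == 0:                 # roll back from January to December
            year, month = year - 1, 12
    return buckets


followed_at = datetime.fromtimestamp(1_700_000_000, tz=timezone.utc)
print(tweet_partition_key("user123", followed_at))
print(recent_buckets(datetime(2024, 1, 15, tzinfo=timezone.utc), 3))
```

The trade-off is that reading a long user timeline may touch several partitions, one per bucket, instead of one.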
008 Scaling + Real-time + Interview Tips

Scaling Strategies, Real-time Notifications, and Interview Tips

Scaling Decisions

Strategy

Timeline Caching: Store each user's last 800 tweets in Redis. 99% of reads are cache hits, so they never reach the DB.

Strategy

Async Fanout via Kafka: The tweet POST returns immediately. Kafka workers perform the fanout in the background, which keeps the user experience fast.

Strategy

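Read Replicas: 1 master + 5 replicas for the tweet DB. All reads go to the replicas, so the master handles only writes. (This describes a primary/replica store; Cassandra itself is masterless and spreads reads across its replicas.)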

Trade-off

Eventual Consistency: After a celebrity tweet, it may take 1-2 seconds to appear in everyone's timeline. This is an acceptable trade-off.

Trade-off

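Storage Cost: Each tweet ID is copied into the cache of every follower, and each user's cache holds up to 800 tweet IDs, so the same data is stored many times over. Memory is expensive, but paying for it to buy low latency is worth it.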
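The storage trade-off can be sized roughly as follows. This is an estimate under assumptions that are not in the lesson: every daily active user has a full cached timeline, tweet IDs are 8 bytes each, and Redis per-entry overhead is ignored.

```python
def timeline_cache_bytes(users: int, ids_per_timeline: int, bytes_per_id: int) -> int:
    """Raw payload size of all cached timelines, excluding Redis overhead."""
    return users * ids_per_timeline * bytes_per_id


USERS = 500_000_000      # daily active users quoted earlier
IDS_PER_TIMELINE = 800   # cache length used in this lesson
BYTES_PER_ID = 8         # assumption: 64-bit tweet IDs

total = timeline_cache_bytes(USERS, IDS_PER_TIMELINE, BYTES_PER_ID)
print(f"{total / 1e12:.1f} TB of raw tweet IDs")
```

Real memory use would be higher once data-structure overhead and replication are included, which is why caching IDs rather than full tweet bodies matters.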

Real-time Notifications — WebSocket

🔔 Real-time Notification System

Twitter uses WebSockets for real-time notifications. While the user has the app open, a persistent connection to the server is maintained.

When someone follows, likes, or replies, the Notification Service consumes the event from Kafka and pushes it over the WebSocket.
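The push flow described above can be modelled without a network. In this sketch each open WebSocket is represented by an `asyncio.Queue`, and `NotificationHub` is a hypothetical name; a real service would hold actual socket objects and read events from Kafka.

```python
import asyncio
from collections import defaultdict


class NotificationHub:
    """Tracks open connections per user; each connection is modelled as a queue."""

    def __init__(self):
        self._connections = defaultdict(set)   # user_id -> set of queues

    def connect(self, user_id):
        conn = asyncio.Queue()
        self._connections[user_id].add(conn)
        return conn

    def disconnect(self, user_id, conn):
        self._connections[user_id].discard(conn)
        if not self._connections[user_id]:
            del self._connections[user_id]

    async def publish(self, event):
        """Deliver to every open connection of event['to']; return how many got it."""
        targets = list(self._connections.get(event["to"], ()))
        for conn in targets:
            await conn.put(event)
        return len(targets)


async def demo():
    hub = NotificationHub()
    conn = hub.connect("user123")              # user opens the app
    delivered = await hub.publish({"to": "user123", "type": "like", "from": "user456"})
    missed = await hub.publish({"to": "offline_user", "type": "follow", "from": "user456"})
    event = await conn.get()                   # what the client would receive
    hub.disconnect("user123", conn)
    return delivered, missed, event


delivered, missed, event = asyncio.run(demo())
print(delivered, missed, event["type"])
```

Events for users with no open connection are simply dropped here; a production system would usually store them so they can be shown on the next app open.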

Trending Topics: Keep hashtag counts in a Redis Sorted Set. Increment with ZINCRBY. Fetch the top 10 trending hashtags with ZREVRANGE. Use a sliding window (the last 1 hour).

trending_service.py
import redis
import time

r = redis.Redis()

def record_hashtag(hashtag: str) -> None:
    """Increment the hashtag count when a tweet is posted."""
    current_hour = int(time.time() // 3600)  # one bucket per clock hour
    key = f"trending:{current_hour}"

    # ZINCRBY is atomic; the pipeline sends both commands in one round trip
    pipe = r.pipeline()
    pipe.zincrby(key, 1, hashtag)
    # Expire after 2 hours so old buckets clean themselves up
    pipe.expire(key, 7200)
    pipe.execute()

def get_trending(top_n: int = 10) -> list:
    """Top trending hashtags over the current and previous hourly buckets."""
    current_hour = int(time.time() // 3600)
    prev_hour = current_hour - 1

    # Merge the last 2 buckets to approximate a sliding window
    r.zunionstore(
        "trending:current",
        [f"trending:{current_hour}", f"trending:{prev_hour}"]
    )
    # The merged key is a temporary result, so let it expire
    r.expire("trending:current", 60)

    # Top N descending by score
    return r.zrevrange("trending:current", 0, top_n - 1, withscores=True)

def like_tweet_atomic(tweet_id: str) -> int:
    """Increment the like count without a race condition."""
    # Redis INCR is atomic and returns the new value, so no separate GET
    # (which could observe other users' likes in between) is needed
    return r.incr(f"likes:{tweet_id}")

Twitter's Real Tech Stack

Backend Services

Scala / Java, Python (ML/Recommendations), Kubernetes, Nginx + Envoy Proxy

Data Storage

Cassandra (Tweets), PostgreSQL (Users), Redis Cluster (Timeline), Elasticsearch (Search), Manhattan (Twitter's own KV store)

Messaging & Analytics

Apache Kafka (Fanout), Amazon S3 (Media), CloudFront CDN, Hadoop + Spark (Analytics)

🎯 Interview Tips: What to Remember for the Twitter Design

1) Always start with the fanout problem; interviewers expect it.

2) “Why not fan-out on write for everyone?” Explain the celebrity problem.

3) Justify your database choices: Cassandra = immutable time-series, Redis = speed, Elasticsearch = search.

4) Mention eventual consistency explicitly; Twitter does not need strong consistency.

5) Kafka = async decoupling. The tweet POST returns fast → fanout runs in the background.
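The decoupling in tip 5 can be demonstrated with the standard library alone. In this sketch an `asyncio.Queue` stands in for the Kafka topic and plain dicts stand in for the follow graph and timeline cache; all names are illustrative.

```python
import asyncio


async def post_tweet(queue: asyncio.Queue, tweet_id: str, author_id: str) -> str:
    """Enqueue a fanout job and return at once (stands in for the POST handler)."""
    await queue.put({"tweet_id": tweet_id, "author_id": author_id})
    return "accepted"


async def fanout_worker(queue: asyncio.Queue, followers: dict, timelines: dict) -> None:
    """Background consumer: pushes each queued tweet to its followers' timelines."""
    while True:
        job = await queue.get()
        try:
            for follower_id in followers.get(job["author_id"], []):
                timelines.setdefault(follower_id, []).insert(0, job["tweet_id"])
        finally:
            queue.task_done()


async def demo():
    queue = asyncio.Queue()
    followers = {"alice": ["bob", "carol"]}
    timelines = {}
    worker = asyncio.create_task(fanout_worker(queue, followers, timelines))

    status = await post_tweet(queue, "t1", "alice")   # returns before fanout runs
    await queue.join()                                # wait here only for the demo

    worker.cancel()
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return status, timelines


status, timelines = asyncio.run(demo())
print(status, timelines)
```

The POST handler's latency depends only on the enqueue, not on the number of followers, which is the property the interview answer should highlight.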

009 Lesson Summary

SUMMARY: What We Learned Today

Challenge | Solution | Technology
Celebrity fanout | Hybrid: push + pull | Kafka + Redis
Fast timeline reads | Pre-computed cache | Redis List
Tweet storage at scale | Time-series immutable | Cassandra
Hashtag search | Inverted index | Elasticsearch
Media delivery | Object store + CDN | S3 + CloudFront
Follow graph | Graph traversal | Redis Sorted Set
Fanout on Write | Push tweet_id to each follower cache on write | Redis LPUSH + LTRIM
Fanout on Read | Pull following tweets at timeline load time | Cassandra read + merge
010 Knowledge Check
011 Assignments
012 Practical Lab