Saga Pattern & Distributed Transactions
Why Are Distributed Transactions Hard?
In a single database a transaction is easy, because ACID guarantees apply. In microservices, however, the Order service, Payment service, and Inventory service each have their own database, and placing one order requires a write to all three.
Suppose the payment succeeds, but the system crashes while updating inventory. The payment has been deducted, yet the product will never be delivered. Inconsistent state!
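A minimal sketch of the failure described above, assuming three in-memory dictionaries stand in for the three service databases. All names here (`orders_db`, `place_order_naively`, and so on) are hypothetical and chosen only for illustration.

```python
# Hypothetical in-memory stand-ins for three separate service databases.
orders_db = {}
payments_db = {}
inventory_db = {"product1": 0}  # assumption: the product is out of stock


def place_order_naively(order_id: str, product_id: str, amount: float) -> None:
    """Write to each 'database' in turn, with no recovery logic."""
    orders_db[order_id] = "CREATED"      # local commit in the Order service
    payments_db[order_id] = amount       # local commit in the Payment service
    if inventory_db[product_id] <= 0:    # the Inventory service cannot proceed
        raise RuntimeError("inventory update failed")
    inventory_db[product_id] -= 1


try:
    place_order_naively("order-1", "product1", 500.0)
except RuntimeError as err:
    print(f"failed: {err}")

# The first two writes are already committed, so the system is inconsistent.
print(orders_db)    # {'order-1': 'CREATED'}
print(payments_db)  # {'order-1': 500.0}
```

Nothing undoes the order or the payment here, which is exactly the gap a saga's compensating transactions are meant to close.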
⚠️ 2-Phase Commit (2PC): why the older solution falls short
2PC uses a coordinator to lock all participating services. In a distributed system this is costly: if the coordinator fails, every service stays locked, latency is high, and the approach is impractical for microservices. The Saga pattern emerged to address this.
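To make the blocking problem concrete, here is a toy simulation, not a real 2PC implementation. The `Participant` class and the `crash_after_prepare` flag are assumptions made for this sketch, and the abort path is left out.

```python
class Participant:
    """Hypothetical service taking part in a two-phase commit."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.locked = False

    def prepare(self) -> bool:
        self.locked = True   # resources stay locked until a decision arrives
        return True

    def commit(self) -> None:
        self.locked = False


def two_phase_commit(participants, crash_after_prepare: bool = False) -> bool:
    # Phase 1: every participant locks its resources and votes.
    votes = [p.prepare() for p in participants]
    if crash_after_prepare:
        return False  # simulated coordinator crash: nobody hears commit or abort
    # Phase 2: commit everywhere (the abort path is omitted in this sketch).
    if all(votes):
        for p in participants:
            p.commit()
    return all(votes)


services = [Participant("order"), Participant("payment"), Participant("inventory")]
two_phase_commit(services, crash_after_prepare=True)
print([p.locked for p in services])  # [True, True, True]
```

After the simulated crash all three participants are still holding their locks, which is the blocking behaviour the paragraph above warns about.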
DEFINITION
The Saga pattern is a sequence of local transactions. Each local transaction updates its own database and publishes an event or message. If a step fails, compensating transactions are run to undo the effects of the preceding steps.
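The definition above can be sketched as a small generic runner. `SagaStep` and `run_saga` are hypothetical names introduced for this example, and the steps only append to a list instead of touching real databases or publishing events.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # the local transaction
    compensation: Callable[[], None]  # undoes the local transaction


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            for done in reversed(completed):
                done.compensation()
            return False
    return True


log: List[str] = []


def fail_reserve() -> None:
    raise RuntimeError("no stock")


steps = [
    SagaStep("order", lambda: log.append("order created"),
             lambda: log.append("order cancelled")),
    SagaStep("payment", lambda: log.append("payment charged"),
             lambda: log.append("payment refunded")),
    SagaStep("inventory", fail_reserve, lambda: log.append("stock released")),
]

ok = run_saga(steps)
print(ok)   # False
print(log)  # ['order created', 'payment charged', 'payment refunded', 'order cancelled']
```

The failed step itself is not compensated because it never committed; only the steps that completed before it are undone, newest first.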
Order Placement: 3 Services, 1 Business Operation
Consider an e-commerce order flow. When everything goes well it is easy. But what happens when a step fails midway?
ORDER SAGA — Happy Path vs Failure
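✅ HAPPY PATH: Create order → Charge payment → Reserve inventory → Send notification
❌ FAILURE PATH: Create order → Charge payment → Reserve inventory fails → Refund payment → Cancel order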
Key Insight
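A saga relies on eventual consistency: the system becomes consistent only once every step (or every compensation) has completed, and there is no immediate consistency in between. It is an alternative to ACID, because ACID transactions that span several service databases are impractical.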
Choreography-based Saga
There is no central coordinator. Each service does its own work and publishes an event; other services listen for that event and do their own work in response. This is the decentralized approach.
CHOREOGRAPHY — Event-Driven, No Central Control
Order Service: creates the order, then publishes OrderPlaced →
Payment Service: listens for OrderPlaced, publishes PaymentDone →
Inventory Service: listens for PaymentDone, publishes StockReserved →
Notification Svc: listens for multiple events, sends email/SMS ✓
Choreography — Pros & Cons
✅ Pros
Simple: no extra orchestrator. Loose coupling. Services stay independent. Easy to scale. A natural fit for event-driven systems.
❌ Cons
Complex flows are hard to follow. Debugging is painful: where did the event go? Cyclic dependencies can creep in.
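One common way to ease the "where did the event go?" problem is to carry a correlation ID on every event. The sketch below assumes an in-process event bus and a hypothetical `saga_id` field; a real system would use a message broker and structured logs.

```python
import uuid
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

# Hypothetical in-process event bus; production systems would use a broker.
handlers: DefaultDict[str, List[Callable[[Dict], None]]] = defaultdict(list)
trace: List[str] = []  # ordered record of every event, tagged with its saga_id


def publish(event: str, payload: Dict) -> None:
    trace.append(f"{payload['saga_id']} {event}")
    for handler in list(handlers[event]):
        handler(payload)


def on_order_placed(payload: Dict) -> None:
    publish("PaymentDone", payload)    # the same payload keeps the same saga_id


def on_payment_done(payload: Dict) -> None:
    publish("StockReserved", payload)


handlers["OrderPlaced"].append(on_order_placed)
handlers["PaymentDone"].append(on_payment_done)

saga_id = uuid.uuid4().hex[:8]
publish("OrderPlaced", {"saga_id": saga_id, "order_id": "ord-001"})

for line in trace:
    print(line)
```

Filtering the trace by one `saga_id` reconstructs the path of a single order across services, which is the visibility that choreography otherwise lacks.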
Orchestration-based Saga
A central Saga Orchestrator controls the entire flow. The orchestrator knows which step to run when, and which compensation to run for each failure. This is the centralized approach.
ORCHESTRATION — Central Saga Orchestrator
Saga Orchestrator: knows the entire flow and its compensations; a state machine tracks progress
1. CreateOrder → Order Service
2. ChargePayment → Payment Service
3. ReserveInventory → Inventory Service
4. SendNotification → Notification Svc
Each service just executes commands.
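The orchestrator's progress tracking can be sketched as an explicit transition table. The state names and the `advance` helper below are assumptions for illustration; a real orchestrator would also persist the state so it can resume after a crash.

```python
from enum import Enum


class State(Enum):
    STARTED = "started"
    ORDER_CREATED = "order_created"
    PAYMENT_CHARGED = "payment_charged"
    INVENTORY_RESERVED = "inventory_reserved"
    COMPLETED = "completed"
    COMPENSATING = "compensating"
    COMPENSATED = "compensated"


# Allowed transitions; anything not listed here is rejected.
TRANSITIONS = {
    State.STARTED: {State.ORDER_CREATED},
    State.ORDER_CREATED: {State.PAYMENT_CHARGED, State.COMPENSATING},
    State.PAYMENT_CHARGED: {State.INVENTORY_RESERVED, State.COMPENSATING},
    State.INVENTORY_RESERVED: {State.COMPLETED, State.COMPENSATING},
    State.COMPENSATING: {State.COMPENSATED},
    State.COMPLETED: set(),
    State.COMPENSATED: set(),
}


def advance(current: State, target: State) -> State:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


state = State.STARTED
state = advance(state, State.ORDER_CREATED)
state = advance(state, State.PAYMENT_CHARGED)
state = advance(state, State.COMPENSATING)   # the inventory step failed
state = advance(state, State.COMPENSATED)
print(state.value)  # compensated
```

Keeping the legal transitions in one table makes it harder for a bug to jump a saga straight from started to completed.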
| Aspect | Choreography | Orchestration |
|---|---|---|
| Control | Decentralized (events) | Centralized (orchestrator) |
| Coupling | Loose | Moderate |
| Visibility | Hard to track | Easy to visualize |
| Debugging | Difficult | Centralized logs |
| Simple flows | Better | Overkill |
| Complex flows | Gets messy | Better |
| Tools | Kafka, RabbitMQ | Temporal, AWS Step Functions |
How Do You Roll Back on Failure?
A traditional database rollback does not work in a saga, because each step has already been committed in a separate service. Instead, you define a compensating transaction for every step.
T1 (Order Service): Create the order
Compensation: cancel the order (order status = CANCELLED)
T2 (Payment Service): Charge the payment
Compensation: issue a refund (return the full amount)
T3 (Inventory Service): Reserving stock FAILS! ❌
No stock! Backward compensation now runs…
C2 (Payment Service): Run the refund
Payment reversed. Money returned to the customer. Done ✓
C1 (Order Service): Cancel the order
Order status = CANCELLED. Done ✓ The system is eventually consistent.
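The backward order of compensations in the walkthrough above can be computed mechanically. In this sketch `STEPS` and `compensation_plan` are hypothetical names, and the C3 entry is an addition for completeness that the walkthrough does not define.

```python
# Forward steps paired with their compensations, in execution order.
STEPS = [
    ("T1 create order", "C1 cancel order"),
    ("T2 charge payment", "C2 refund payment"),
    ("T3 reserve inventory", "C3 release stock"),  # C3 is a hypothetical addition
]


def compensation_plan(failed_index: int) -> list:
    """Compensations for the steps that committed before the failure, newest first."""
    committed = STEPS[:failed_index]
    return [comp for _, comp in reversed(committed)]


print(compensation_plan(2))  # ['C2 refund payment', 'C1 cancel order']
print(compensation_plan(1))  # ['C1 cancel order']
print(compensation_plan(0))  # []
```

With T3 failing (index 2), the plan is C2 followed by C1, matching the sequence shown above.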
⚠️ Idempotency — Critical Requirement
Compensating transactions must be idempotent: running the same operation repeatedly must produce the same result. A retry after a network failure must not cause a double refund. Use a unique transaction ID.
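A minimal sketch of an idempotent refund, assuming the caller supplies a unique transaction ID. `RefundService` and its in-memory dictionaries are hypothetical stand-ins for a real payment service and its database.

```python
from typing import Dict


class RefundService:
    """Hypothetical payment-side handler that deduplicates by transaction ID."""

    def __init__(self) -> None:
        self.balances: Dict[str, float] = {}
        self.processed: Dict[str, float] = {}  # transaction ID -> amount refunded

    def refund(self, txn_id: str, customer: str, amount: float) -> float:
        if txn_id in self.processed:
            # Retry of an already-processed refund: return the earlier result.
            return self.processed[txn_id]
        self.balances[customer] = self.balances.get(customer, 0.0) + amount
        self.processed[txn_id] = amount
        return amount


service = RefundService()
service.refund("refund-pay_123", "user2", 500.0)
service.refund("refund-pay_123", "user2", 500.0)  # duplicate delivery after a timeout
print(service.balances["user2"])  # 500.0
```

In a real service the "already processed?" check and the balance update would need to happen atomically, for example inside the same local database transaction; this sketch ignores concurrency.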
Practical Code
Python: Orchestration-based Saga State Machine
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional
import uuid


class SagaState(Enum):
    STARTED = "started"
    ORDER_CREATED = "order_created"
    PAYMENT_CHARGED = "payment_charged"
    INVENTORY_RESERVED = "inventory_reserved"
    COMPLETED = "completed"
    # Compensating states
    COMPENSATING = "compensating"
    COMPENSATED = "compensated"
    FAILED = "failed"


@dataclass
class OrderSaga:
    saga_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    state: SagaState = SagaState.STARTED
    order_id: Optional[str] = None
    payment_id: Optional[str] = None
    inventory_reserved: bool = False
    compensations_done: List[str] = field(default_factory=list)


class OrderSagaOrchestrator:
    """Central orchestrator: manages the entire saga flow and its compensations."""

    def execute(self, user_id: str, product_id: str, amount: float) -> dict:
        saga = OrderSaga()
        print(f"\n🚀 Starting Saga: {saga.saga_id}")

        # Step 1: Create Order (nothing to compensate if this fails)
        try:
            saga.order_id = self._create_order(user_id, product_id, amount)
            saga.state = SagaState.ORDER_CREATED
            print(f"✅ T1: Order created: {saga.order_id}")
        except Exception as e:
            saga.state = SagaState.FAILED
            return {"status": "failed", "reason": f"Order creation failed: {e}"}

        # Step 2: Charge Payment (on failure, undo T1)
        try:
            saga.payment_id = self._charge_payment(user_id, amount)
            saga.state = SagaState.PAYMENT_CHARGED
            print(f"✅ T2: Payment charged: {saga.payment_id}")
        except Exception as e:
            print(f"❌ T2 failed: {e}; compensating T1...")
            saga.state = SagaState.COMPENSATING
            self._compensate_order(saga.order_id)
            saga.compensations_done.append("order_cancelled")
            saga.state = SagaState.COMPENSATED
            return {"status": "failed", "reason": "Payment failed", "compensated": True}

        # Step 3: Reserve Inventory (on failure, undo T2 then T1)
        try:
            self._reserve_inventory(product_id)
            saga.inventory_reserved = True
            saga.state = SagaState.INVENTORY_RESERVED
            print(f"✅ T3: Inventory reserved for {product_id}")
        except Exception as e:
            print(f"❌ T3 failed: {e}; compensating T2, T1...")
            saga.state = SagaState.COMPENSATING
            self._refund_payment(saga.payment_id, amount)  # C2
            self._compensate_order(saga.order_id)          # C1
            saga.compensations_done.extend(["payment_refunded", "order_cancelled"])
            saga.state = SagaState.COMPENSATED
            return {"status": "failed", "reason": "Out of stock",
                    "compensated": True, "refunded": amount}

        saga.state = SagaState.COMPLETED
        print(f"🎉 Saga COMPLETED: order={saga.order_id}")
        return {"status": "success", "order_id": saga.order_id}

    # The helpers below simulate remote service calls.
    def _create_order(self, user_id, product_id, amount) -> str:
        return f"order_{uuid.uuid4().hex[:8]}"

    def _charge_payment(self, user_id, amount) -> str:
        if amount > 10000:
            raise Exception("Insufficient funds")
        return f"pay_{uuid.uuid4().hex[:8]}"

    def _reserve_inventory(self, product_id):
        if product_id == "OUT_OF_STOCK":
            raise Exception("No stock!")

    def _compensate_order(self, order_id):
        print(f"  ↩️ C1: Order {order_id} cancelled")

    def _refund_payment(self, payment_id, amount):
        print(f"  ↩️ C2: Payment {payment_id} refunded ৳{amount}")


# Test
orchestrator = OrderSagaOrchestrator()
print("=== Happy Path ===")
orchestrator.execute("user1", "product1", 500)
print("\n=== Out of Stock ===")
orchestrator.execute("user2", "OUT_OF_STOCK", 500)

Node.js: Choreography with Event Bus
const EventEmitter = require('events');
const bus = new EventEmitter(); // In production: Kafka/RabbitMQ

// ===== ORDER SERVICE =====
bus.on('order.start', ({ orderId, userId, productId, amount }) => {
  console.log(`📦 OrderService: Creating order ${orderId}`);
  // Save to DB, then publish next event
  bus.emit('order.created', { orderId, userId, productId, amount });
});

bus.on('order.cancel', ({ orderId, reason }) => {
  console.log(`↩️ OrderService: Cancelling order ${orderId}: ${reason}`);
});

// ===== PAYMENT SERVICE =====
bus.on('order.created', ({ orderId, userId, amount }) => {
  console.log(`💳 PaymentService: Charging ৳${amount} for ${orderId}`);
  try {
    if (amount > 10000) throw new Error('Insufficient funds');
    const paymentId = `pay_${Date.now()}`;
    bus.emit('payment.charged', { orderId, paymentId, amount });
  } catch (err) {
    // Compensation: cancel the order
    bus.emit('order.cancel', { orderId, reason: err.message });
  }
});

bus.on('payment.refund', ({ paymentId, amount }) => {
  console.log(`↩️ PaymentService: Refunding ৳${amount} for ${paymentId}`);
});

// ===== INVENTORY SERVICE =====
bus.on('payment.charged', ({ orderId, paymentId, amount }) => {
  console.log(`📦 InventoryService: Reserving stock for ${orderId}`);
  try {
    if (orderId.includes('nostock')) throw new Error('Out of stock');
    bus.emit('inventory.reserved', { orderId });
  } catch (err) {
    // Compensations: refund + cancel order
    bus.emit('payment.refund', { paymentId, amount });
    bus.emit('order.cancel', { orderId, reason: err.message });
  }
});

bus.on('inventory.reserved', ({ orderId }) => {
  console.log(`🎉 SAGA COMPLETE: Order ${orderId} fulfilled!`);
});

// Test happy path
bus.emit('order.start', { orderId: 'ord-001', userId: 'u1', productId: 'p1', amount: 500 });
// Test failure path
bus.emit('order.start', { orderId: 'nostock-002', userId: 'u2', productId: 'p2', amount: 200 });

Real World Use Cases & Tools
🛒 Amazon / Flipkart
Order placement saga: order → payment → inventory → shipping → notification. Each step is a separate service. Compensation runs automatically on failure.
🚗 Uber
Ride request saga: request → driver matching → payment pre-auth → GPS tracking. On cancellation, the compensation releases the pre-auth.
✈️ Booking.com
Travel booking: flight + hotel + car rental. If one fails, the others are compensated. A complex multi-service saga.
💳 Fintech Apps
Money transfer saga: debit source → credit destination → notify both. On failure, reverse the debit. Idempotency is critical.
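As a rough illustration of the money transfer case, the sketch below debits the source, tries to credit the destination, and reverses the debit if the credit fails. The `accounts` ledger and the function names are hypothetical, and the notification step is left out.

```python
accounts = {"alice": 100.0, "bob": 20.0}  # hypothetical ledger


def debit(account: str, amount: float) -> None:
    if accounts[account] < amount:
        raise ValueError("insufficient funds")
    accounts[account] -= amount


def credit(account: str, amount: float) -> None:
    if account not in accounts:
        raise KeyError(f"unknown account {account}")
    accounts[account] += amount


def transfer(source: str, destination: str, amount: float) -> bool:
    debit(source, amount)              # T1: local transaction at the source
    try:
        credit(destination, amount)    # T2: local transaction at the destination
    except KeyError:
        credit(source, amount)         # C1: reverse the debit
        return False
    return True


print(transfer("alice", "bob", 30.0))    # True
print(transfer("alice", "carol", 30.0))  # False
print(accounts)                          # {'alice': 70.0, 'bob': 50.0}
```

The second transfer fails at the credit step, so the compensation puts the 30.0 back and the source balance ends where it was after the first transfer.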
| Tool | Type | Best For |
|---|---|---|
| AWS Step Functions | Orchestration | AWS ecosystem saga orchestration |
| Temporal.io | Orchestration | Complex durable workflows, code-first |
| Apache Kafka | Choreography | Event-driven saga, high throughput |
| Axon Framework | Both | Java/Spring CQRS + Saga |
| Conductor (Netflix) | Orchestration | Microservices workflow engine |
Common Interview Questions
🎯 Q1: What is the difference between 2PC and Saga?
2PC: Distributed ACID transaction. The coordinator locks all services. Strong consistency, but blocking and failure-prone.
Saga: Sequence of local transactions + compensations. Eventual consistency. Non-blocking. Suitable for microservices.
🎯 Q2: Choreography vs Orchestration: when to use which?
Choreography: Simple flows with 3-4 steps where loose coupling is the priority. Natural in an event-driven system.
Orchestration: Complex flows with many steps, where visibility is needed or the compensation logic is complex. Tools: AWS Step Functions, Temporal.
🎯 Q3: Why is idempotency critical in a saga?
Messages are retried after network failures, so the same compensation can execute twice. If a payment refund runs twice, the customer gets a double refund. Track a unique ID for every operation and skip it if it has already been processed.
🎯 Q4: What are the limitations of Saga?
1) No immediate consistency, only eventual. 2) Compensating transactions are complex to write. 3) Partial state is visible to users (e.g., order created but payment pending). 4) Debugging is difficult in choreography. These are the trade-offs.
SUMMARY: What We Learned Today
| Concept | In one line |
|---|---|
| Saga Pattern | Local transactions + compensations = distributed transaction |
| Compensating TX | Undoes the steps that ran before the failed step |
| Choreography | Decentralized flow driven by events; loose coupling |
| Orchestration | Central orchestrator; good visibility, suits complex flows |
| Eventual Consistency | Not ACID; consistent once all steps complete |
| Idempotency | Same operation twice gives the same result, no double effect |
| Temporal / Step Fn | Production saga orchestration tools |
Phase 3 — Distributed Systems Complete!
You have completed Phase 3 of the System Design Mastery Course. You now have a solid foundation in the 6 core topics of Distributed Systems.
PHASE 4 — Real-World Systems:
URL Shortener Design · Chat System Design · Video Streaming Architecture · Search Engine Design · Payment System Design · Social Media Feed