PushBackLog

Saga Pattern

Status: Complete
Category: Architecture
Default enforcement: Advisory
Author: PushBackLog team


Tags

  • Topic: architecture, distributed-systems
  • Skillset: backend
  • Technology: generic
  • Stage: planning, execution

Summary

The Saga pattern manages multi-step business transactions that span multiple services or aggregates, replacing distributed ACID transactions with a sequence of local transactions coordinated by compensation logic. When a step fails, previously completed steps are undone by explicit compensating transactions rather than a database rollback. Sagas are the standard solution to the problem of distributed consistency in microservices architectures.


Rationale

Distributed transactions are a trap

In a monolith with a single database, multi-step business operations can be wrapped in a database transaction: all steps succeed together or all are rolled back. This is cheap to use and trivially correct. In a distributed system, no such primitive exists. A two-phase commit (2PC) across services or databases is technically possible but costly: every participant must hold locks while the coordinator waits for acknowledgements, which adds latency, and the coordinator itself becomes a single point of failure. In production, these costs are often worse than the consistency problem 2PC was meant to solve.

The practical answer to distributed consistency in modern systems is not 2PC. It is the Saga pattern.

Eventual consistency is the price of distribution

A saga accepts that a distributed transaction will not be immediately consistent — it will eventually become consistent as each step completes and as compensating transactions undo any steps that cannot be completed. This is not a degraded form of correctness. It is the correct model for systems that are physically distributed. The design task is to make the intermediate states safe and understandable to users, and to ensure compensating logic is correct and reliable.


Guidance

Two coordination strategies

Choreography-based sagas

Each service publishes events after completing its local transaction. Other services subscribe to those events and perform their own transactions in response. There is no central coordinator.

OrderService      → publishes: OrderPlaced
PaymentService    → listens: OrderPlaced → charges card → publishes: PaymentProcessed
                                                        → on failure: publishes: PaymentFailed
FulfilmentService → listens: PaymentProcessed → reserves stock → publishes: StockReserved
OrderService      → listens: PaymentFailed → publishes: OrderCancelled (compensation)

Pros: Simple — no extra coordinator service; services are loosely coupled
Cons: Hard to trace end-to-end flow; compensating logic is distributed; risk of cyclic event chains
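
The event flow above can be sketched with an in-memory event bus. This is a minimal illustration, not a production design: the EventBus class, the CARD_LIMIT rule that stands in for a real card charge, and the eventTypesFor helper are all hypothetical, and a real system would use a message broker with asynchronous delivery.

```typescript
// Events named after the flow in the text; payload fields are assumptions.
type SagaEvent =
  | { type: "OrderPlaced"; orderId: string; amount: number }
  | { type: "PaymentProcessed"; orderId: string }
  | { type: "PaymentFailed"; orderId: string; reason: string }
  | { type: "StockReserved"; orderId: string }
  | { type: "OrderCancelled"; orderId: string };

type Handler = (event: SagaEvent) => void;

// Hypothetical synchronous bus standing in for a message broker.
class EventBus {
  private handlers = new Map<SagaEvent["type"], Handler[]>();
  readonly log: SagaEvent[] = []; // every published event, in order

  subscribe(type: SagaEvent["type"], handler: Handler): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  publish(event: SagaEvent): void {
    this.log.push(event);
    for (const handler of this.handlers.get(event.type) ?? []) handler(event);
  }
}

// Stand-in for a real card charge: amounts above the limit are declined.
const CARD_LIMIT = 100;

function wireServices(bus: EventBus): void {
  // PaymentService: reacts to OrderPlaced and publishes the outcome.
  bus.subscribe("OrderPlaced", (e) => {
    if (e.type !== "OrderPlaced") return;
    if (e.amount <= CARD_LIMIT) {
      bus.publish({ type: "PaymentProcessed", orderId: e.orderId });
    } else {
      bus.publish({ type: "PaymentFailed", orderId: e.orderId, reason: "card declined" });
    }
  });
  // FulfilmentService: reserves stock once payment has succeeded.
  bus.subscribe("PaymentProcessed", (e) => {
    bus.publish({ type: "StockReserved", orderId: e.orderId });
  });
  // OrderService: compensates by cancelling the order when payment fails.
  bus.subscribe("PaymentFailed", (e) => {
    bus.publish({ type: "OrderCancelled", orderId: e.orderId });
  });
}

// Runs one saga and returns the sequence of event types that were published.
function eventTypesFor(amount: number): string[] {
  const bus = new EventBus();
  wireServices(bus);
  bus.publish({ type: "OrderPlaced", orderId: "order-1", amount });
  return bus.log.map((e) => e.type);
}
```

No handler knows the full flow; it can only be reconstructed from the event log, which is the traceability cost listed in the cons.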

Orchestration-based sagas

A central saga orchestrator sends commands to each participant and waits for responses. The orchestrator owns the state machine of the saga.

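type Call = (orderId: string) => Promise<void>; // participant interfaces below are assumed for this sketch
interface PaymentService { charge: Call; refund: Call }
interface InventoryService { reserve: Call; releaseReservation: Call }
interface OrderService { confirm: Call; fail(orderId: string, reason: string): Promise<void> }

class PlaceOrderSaga {
  constructor(
    private readonly paymentService: PaymentService,
    private readonly inventoryService: InventoryService,
    private readonly orderService: OrderService,
  ) {}

  async execute(orderId: string): Promise<void> {
    // Compensations for the steps that have completed so far.
    const undo: Array<() => Promise<void>> = [];
    try {
      await this.paymentService.charge(orderId);
      undo.push(() => this.paymentService.refund(orderId));
      await this.inventoryService.reserve(orderId);
      undo.push(() => this.inventoryService.releaseReservation(orderId));
      await this.orderService.confirm(orderId);
    } catch (error) {
      await this.compensate(orderId, undo, error);
    }
  }

  private async compensate(orderId: string, undo: Array<() => Promise<void>>, cause: unknown): Promise<void> {
    // Undo only completed steps, most recent first; log failures instead of swallowing them.
    for (const step of undo.reverse()) {
      await step().catch((e) => console.error("compensation failed", orderId, e));
    }
    await this.orderService.fail(orderId, cause instanceof Error ? cause.message : String(cause));
  }
}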

Pros: End-to-end flow is visible in one place; easier to test and trace
Cons: Orchestrator can become a bottleneck; creates coupling to the orchestrator
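
Because the orchestrator owns the saga's state machine, it can help to make that state machine explicit. The sketch below models it as a pure transition function; the state names, the two-valued Reply type, and the replay helper are hypothetical and chosen only to mirror the charge, reserve, confirm sequence used in this section.

```typescript
// Hypothetical states for the place-order saga; names are illustrative only.
type SagaState =
  | "CHARGING" | "RESERVING" | "CONFIRMING" // forward path
  | "RELEASING" | "REFUNDING"               // compensation path
  | "COMPLETED" | "FAILED";                 // terminal states

// Simplified participant reply; real replies would carry payloads and error details.
type Reply = "ok" | "error";

// Given the current state and a participant's reply, return the next state.
// A failed compensation stays in the same state so that it can be retried.
function nextState(state: SagaState, reply: Reply): SagaState {
  switch (state) {
    case "CHARGING":   return reply === "ok" ? "RESERVING" : "FAILED";
    case "RESERVING":  return reply === "ok" ? "CONFIRMING" : "REFUNDING";
    case "CONFIRMING": return reply === "ok" ? "COMPLETED" : "RELEASING";
    case "RELEASING":  return reply === "ok" ? "REFUNDING" : "RELEASING";
    case "REFUNDING":  return reply === "ok" ? "FAILED" : "REFUNDING";
    case "COMPLETED":
    case "FAILED":     return state;
  }
}

// Applies a sequence of replies starting from a given (for example, persisted) state.
function replay(replies: Reply[], start: SagaState = "CHARGING"): SagaState {
  return replies.reduce<SagaState>((state, reply) => nextState(state, reply), start);
}
```

If the current state is written to durable storage after each transition, a restarted orchestrator can in principle continue from the stored state by passing it as the start argument, rather than beginning again from the first step.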

Compensating transactions

A compensating transaction is a business-level inversion of a completed step. It does not “undo” in a technical sense — it creates a new transaction that brings the system to an equivalent state.

Original action          | Compensating action
-------------------------|------------------------
Charge customer payment  | Issue refund
Reserve inventory        | Release reservation
Create order             | Cancel order
Send confirmation email  | Send cancellation email

Some operations cannot be compensated by nature: a confirmation email may already have been read, or an SMS delivered. Design for this by placing these steps last in the saga sequence, after all compensable steps have succeeded.
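
One way to express the pairing of actions with compensations, and the rule that non-compensable steps come last, is a small generic runner. This is a sketch under assumptions: the SagaStep shape and the runSaga function are hypothetical, steps run in-process, and a failing compensation is not handled here.

```typescript
// Hypothetical step shape: an action plus an optional compensating action.
interface SagaStep {
  name: string;
  action: () => Promise<void>;
  // Omitted for steps that cannot be undone (for example, sending an email).
  compensate?: () => Promise<void>;
}

async function runSaga(steps: SagaStep[]): Promise<"completed" | "compensated"> {
  // Enforce the ordering rule: once a non-compensable step appears,
  // no compensable step may follow it.
  const firstFinal = steps.findIndex((s) => s.compensate === undefined);
  if (firstFinal !== -1 && steps.slice(firstFinal).some((s) => s.compensate !== undefined)) {
    throw new Error("non-compensable steps must come after all compensable steps");
  }

  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      await step.action();
      completed.push(step);
    } catch {
      // Undo completed steps in reverse order; steps without a
      // compensation are skipped because nothing can undo them.
      for (const done of completed.reverse()) {
        await done.compensate?.();
      }
      return "compensated";
    }
  }
  return "completed";
}
```

With steps for charging, reserving, and emailing, a failure in the reserve step triggers only the refund: the reservation never completed and the email was never sent.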

Design checklist

  • Every step has a defined compensating transaction
  • Compensating transactions are idempotent — safe to call multiple times
  • Saga state is persisted — a crashed orchestrator can resume from its last known position
  • Compensation failures are handled — compensating a compensation requires additional design
  • Steps are ordered to defer non-compensable actions to the end
  • Correlation IDs link all events and commands to the originating saga instance
  • Dead-letter queues catch and alert on undeliverable saga messages
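
The idempotency and correlation ID items in the checklist can be illustrated together. The sketch below is a hypothetical RefundHandler that records which saga instances it has already refunded, so a redelivered compensation command becomes a no-op. The in-memory Set is an assumption made for brevity; a real service would need to record this durably, ideally in the same local transaction as the refund.

```typescript
// Hypothetical compensation handler keyed by the saga's correlation ID.
class RefundHandler {
  // Saga IDs that have already been refunded (in-memory for this sketch only).
  private readonly refunded = new Set<string>();
  refundsIssued = 0;

  // Returns true if a refund was issued, false if this saga was already refunded.
  refund(sagaId: string): boolean {
    if (this.refunded.has(sagaId)) {
      return false; // duplicate delivery: safe to ignore
    }
    this.refunded.add(sagaId);
    this.refundsIssued += 1; // stand-in for the real refund side effect
    return true;
  }
}
```

Calling refund twice with the same saga ID issues one refund, which is what makes retries and at-least-once message delivery safe for this step.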

When to use sagas

  • Multi-step business transactions that span service boundaries or multiple aggregates
  • Business processes where intermediate state is acceptable and understood by the product team
  • Long-running workflows (order fulfilment, user onboarding, document processing)

Do not use a saga where a single-service local transaction would suffice. Sagas add significant complexity and testing overhead.