Saga Pattern

Status: Complete
Category: Architecture
Default enforcement: Advisory
Author: PushBackLog team

Summary

The Saga pattern manages multi-step business transactions that span multiple services or aggregates, replacing distributed ACID transactions with a sequence of local transactions coordinated by compensation logic. When a step fails, previously completed steps are undone by explicit compensating transactions rather than a database rollback. Sagas are the standard solution to the problem of distributed consistency in microservices architectures.

Rationale

Distributed transactions are a trap

In a monolith with a single database, multi-step business operations can be wrapped in a database transaction: all steps succeed together or all are rolled back. This is cheap to use and trivially correct. In a distributed system, no such primitive exists. A two-phase commit (2PC) across services or databases is technically possible but costly: every participant must hold its locks until the coordinator has collected all votes and announced the outcome, which adds latency, reduces throughput, and makes the coordinator a single point of failure. If the coordinator fails mid-protocol, participants can be left blocked with their locks held. In production, these costs are often worse than the inconsistency 2PC is meant to prevent.

The practical answer to distributed consistency in modern systems is not 2PC. It is the Saga pattern.

Eventual consistency is the price of distribution

A saga accepts that a distributed transaction will not be immediately consistent — it will eventually become consistent as each step completes and as compensating transactions undo any steps that cannot be completed. This is not a degraded form of correctness. It is the correct model for systems that are physically distributed. The design task is to make the intermediate states safe and understandable to users, and to ensure compensating logic is correct and reliable.
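One way to make intermediate states explicit is to model them as a small state machine and map each state to something a user can understand. The sketch below is illustrative only: the status names, the transition table, and the user-facing labels are assumptions, not part of any particular system.

```typescript
// Hypothetical status values for an order saga; names are illustrative only.
type OrderStatus = "PENDING" | "PAID" | "STOCK_RESERVED" | "CONFIRMED" | "CANCELLED";

// Allowed transitions. Every in-flight state can still move to CANCELLED,
// which is where compensation leaves the order.
const transitions: Record<OrderStatus, OrderStatus[]> = {
  PENDING: ["PAID", "CANCELLED"],
  PAID: ["STOCK_RESERVED", "CANCELLED"],
  STOCK_RESERVED: ["CONFIRMED", "CANCELLED"],
  CONFIRMED: [],
  CANCELLED: [],
};

// Rejects any state change that the saga does not define.
function advance(current: OrderStatus, next: OrderStatus): OrderStatus {
  if (transitions[current].indexOf(next) === -1) {
    throw new Error(`illegal transition ${current} -> ${next}`);
  }
  return next;
}

// What a user might see while the saga is still in flight.
function userFacingLabel(status: OrderStatus): string {
  switch (status) {
    case "PENDING":
    case "PAID":
    case "STOCK_RESERVED":
      return "Processing your order";
    case "CONFIRMED":
      return "Order confirmed";
    case "CANCELLED":
      return "Order cancelled";
  }
}
```

Collapsing all in-flight states into one label is a design choice: users rarely need to know which internal step is running, only that the order is not yet final.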

Guidance

Two coordination strategies

Choreography-based sagas

Each service publishes events after completing its local transaction. Other services subscribe to those events and perform their own transactions in response. There is no central coordinator.

OrderService     → publishes: OrderPlaced
PaymentService   → listens: OrderPlaced  → charges card → publishes: PaymentProcessed
                                                        → on failure: publishes: PaymentFailed
FulfilmentService→ listens: PaymentProcessed → reserves stock → publishes: StockReserved
OrderService     → listens: PaymentFailed → publishes: OrderCancelled (compensation)
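
The event flow above can be sketched with an in-memory event bus. This is a minimal illustration under stated assumptions, not a production design: `EventBus`, `wireServices`, and the `cardLimit` rule are hypothetical, delivery is synchronous, and a real system would use a message broker with durable delivery.

```typescript
// Events exchanged between services. Field names are illustrative.
type SagaEvent =
  | { type: "OrderPlaced"; orderId: string; amount: number }
  | { type: "PaymentProcessed"; orderId: string }
  | { type: "PaymentFailed"; orderId: string; reason: string }
  | { type: "StockReserved"; orderId: string }
  | { type: "OrderCancelled"; orderId: string };

type Handler = (event: SagaEvent) => void;

// Minimal synchronous bus; a stand-in for a real message broker.
class EventBus {
  private handlers = new Map<string, Handler[]>();
  readonly log: string[] = []; // every published event type, in order

  subscribe(type: SagaEvent["type"], handler: Handler): void {
    const list = this.handlers.get(type) || [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  publish(event: SagaEvent): void {
    this.log.push(event.type);
    for (const handler of this.handlers.get(event.type) || []) {
      handler(event);
    }
  }
}

// Wires up the three services. `cardLimit` is a made-up rule that lets the
// sketch exercise the failure path.
function wireServices(bus: EventBus, cardLimit: number): void {
  // PaymentService: reacts to OrderPlaced.
  bus.subscribe("OrderPlaced", (e) => {
    if (e.type !== "OrderPlaced") return;
    if (e.amount <= cardLimit) {
      bus.publish({ type: "PaymentProcessed", orderId: e.orderId });
    } else {
      bus.publish({ type: "PaymentFailed", orderId: e.orderId, reason: "declined" });
    }
  });
  // FulfilmentService: reacts to PaymentProcessed.
  bus.subscribe("PaymentProcessed", (e) => {
    bus.publish({ type: "StockReserved", orderId: e.orderId });
  });
  // OrderService: compensates when payment fails.
  bus.subscribe("PaymentFailed", (e) => {
    bus.publish({ type: "OrderCancelled", orderId: e.orderId });
  });
}
```

No component in this sketch knows the whole flow; the sequence emerges from the subscriptions, which is also why tracing it end to end is harder than with an orchestrator.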

Pros: Simple — no extra coordinator service; services are loosely coupled
Cons: Hard to trace end-to-end flow; compensating logic is distributed; risk of cyclic event chains

Orchestration-based sagas

A central saga orchestrator sends commands to each participant and waits for responses. The orchestrator owns the state machine of the saga.

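// Sketch: paymentService, inventoryService and orderService are assumed to be
// injected into the class (for example through the constructor).
class PlaceOrderSaga {
  async execute(orderId: string): Promise<void> {
    // Compensations for the steps that have completed so far.
    const undo: Array<() => Promise<void>> = [];
    try {
      await this.paymentService.charge(orderId);
      undo.push(() => this.paymentService.refund(orderId));
      await this.inventoryService.reserve(orderId);
      undo.push(() => this.inventoryService.releaseReservation(orderId));
      await this.orderService.confirm(orderId);
    } catch (error) {
      await this.compensate(orderId, undo, error as Error);
    }
  }

  private async compensate(
    orderId: string,
    undo: Array<() => Promise<void>>,
    cause: Error,
  ): Promise<void> {
    // Undo only what actually completed, newest first.
    for (const step of undo.reverse()) {
      // Do not swallow compensation failures silently: log them so they can be retried or alerted on.
      await step().catch((e) => console.error(`compensation failed for order ${orderId}`, e));
    }
    await this.orderService.fail(orderId, cause.message);
  }
}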

Pros: End-to-end flow is visible in one place; easier to test and trace
Cons: Orchestrator can become a bottleneck; creates coupling to the orchestrator

Compensating transactions

A compensating transaction is a business-level inversion of a completed step. It does not “undo” in a technical sense — it creates a new transaction that brings the system to an equivalent state.

Original action	Compensating action
Charge customer payment	Issue refund
Reserve inventory	Release reservation
Create order	Cancel order
Send confirmation email	Send cancellation email
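
To illustrate the difference between a technical undo and a business-level inversion, the sketch below models a refund as a new ledger entry that offsets the charge rather than deleting it. The `Ledger` class and its methods are hypothetical and exist only for this example.

```typescript
// Append-only ledger: entries are never deleted or edited.
interface LedgerEntry {
  orderId: string;
  kind: "charge" | "refund";
  amount: number; // positive for a charge, negative for a refund
}

class Ledger {
  private entries: LedgerEntry[] = [];

  charge(orderId: string, amount: number): void {
    this.entries.push({ orderId, kind: "charge", amount });
  }

  // Compensation: a new entry that offsets the outstanding balance.
  // Calling it again when nothing is owed adds no entry.
  refund(orderId: string): void {
    const owed = this.balance(orderId);
    if (owed > 0) {
      this.entries.push({ orderId, kind: "refund", amount: -owed });
    }
  }

  balance(orderId: string): number {
    return this.entries
      .filter((e) => e.orderId === orderId)
      .reduce((sum, e) => sum + e.amount, 0);
  }

  history(orderId: string): string[] {
    return this.entries.filter((e) => e.orderId === orderId).map((e) => e.kind);
  }
}
```

After a charge followed by a refund, the balance is back to zero but the history still shows both entries: the system is in an equivalent state, not the original one.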

Some operations are not compensable by nature: a confirmation email that has been read cannot be unread, and an SMS that has been delivered cannot be recalled. Design for this by ensuring these steps occur last in the saga sequence, after all compensable steps have succeeded.
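
The ordering rule can be checked mechanically when a saga is defined. The following is a small sketch, assuming each step is described by a hypothetical `StepSpec` with a `compensable` flag; it reports any compensable step scheduled after a non-compensable one.

```typescript
// Hypothetical description of a saga step, used only for this check.
interface StepSpec {
  name: string;
  compensable: boolean;
}

// Returns the names of compensable steps that are scheduled after a
// non-compensable one. An empty array means the ordering follows the rule.
function misplacedSteps(steps: StepSpec[]): string[] {
  const misplaced: string[] = [];
  let seenNonCompensable = false;
  for (const step of steps) {
    if (!step.compensable) {
      seenNonCompensable = true;
    } else if (seenNonCompensable) {
      misplaced.push(step.name);
    }
  }
  return misplaced;
}
```

A check like this could run in a unit test for each saga definition, so that a later reordering of steps does not silently place an email or SMS before a step that may still fail.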

Design checklist

Every step has a defined compensating transaction
Compensating transactions are idempotent — safe to call multiple times
Saga state is persisted — a crashed orchestrator can resume from its last known position
Compensation failures are handled — compensating a compensation requires additional design
Steps are ordered to defer non-compensable actions to the end
Correlation IDs link all events and commands to the originating saga instance
Dead-letter queues catch and alert on undeliverable saga messages
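
Two of the checklist items, persisted saga state and idempotent steps, can be sketched together. The example below is a simplified, synchronous illustration: `SagaStore`, `runOrResume`, and `PaymentParticipant` are hypothetical names, the store is an in-memory map standing in for a durable database, and the saga ID doubles as the correlation ID.

```typescript
// Persisted record of one saga instance. `sagaId` doubles as the correlation ID.
interface SagaRecord {
  sagaId: string;
  nextStep: number; // index of the first step not yet completed
  status: "RUNNING" | "COMPLETED";
}

interface Step {
  name: string;
  // Receives the saga ID so participants can deduplicate repeated calls.
  run(sagaId: string): void;
}

// In-memory stand-in for a durable store (a database table in practice).
class SagaStore {
  private records = new Map<string, SagaRecord>();

  load(sagaId: string): SagaRecord {
    const existing = this.records.get(sagaId);
    return existing ? { ...existing } : { sagaId, nextStep: 0, status: "RUNNING" };
  }

  save(record: SagaRecord): void {
    this.records.set(record.sagaId, { ...record });
  }
}

// Runs or resumes a saga. Progress is saved after every step, so a restart
// continues from `nextStep` instead of starting over.
function runOrResume(store: SagaStore, sagaId: string, steps: Step[]): SagaRecord {
  const record = store.load(sagaId);
  while (record.nextStep < steps.length) {
    steps[record.nextStep].run(sagaId);
    record.nextStep += 1;
    store.save(record);
  }
  record.status = "COMPLETED";
  store.save(record);
  return record;
}

// Idempotent participant: a repeated call with the same saga ID is a no-op.
class PaymentParticipant {
  private charged = new Set<string>();
  chargeCount = 0;

  charge(sagaId: string): void {
    if (this.charged.has(sagaId)) return;
    this.charged.add(sagaId);
    this.chargeCount += 1;
  }
}
```

Persistence and idempotency depend on each other: if the orchestrator crashes after a step has run but before its progress is saved, the step runs again on resume, and only an idempotent participant makes that safe.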

When to use sagas

Multi-step business transactions that span service boundaries or multiple aggregates
Business processes where intermediate state is acceptable and understood by the product team
Long-running workflows (order fulfilment, user onboarding, document processing)

Do not use a saga where a single-service local transaction would suffice. Sagas add significant complexity and testing overhead.
