PushBackLog

Immutable Infrastructure

Status: Complete
Category: Infrastructure
Default enforcement: Advisory
Author: PushBackLog team


Tags

  • Topic: infrastructure, reliability, deployment
  • Skillset: devops
  • Technology: Docker, Kubernetes, Terraform, Packer
  • Stage: delivery, operations

Summary

Immutable infrastructure is a model in which servers and compute instances are never modified after deployment. Instead of patching, configuring, or updating running instances, a new instance is built with the desired change and the old one is replaced. This eliminates the “configuration drift” that accumulates over time when instances are patched in place, produces fully reproducible environments, and makes rollback as simple as deploying the previous image. The mantra is “cattle, not pets”: instances are interchangeable, replaceable, and never manually configured.


Rationale

Configuration drift makes debugging unreliable

In a traditional “mutable” model, servers are patched, updated, and configured over time. After months of operation, no two servers in the same cluster are identical — some have had hotfixes manually applied, some have newer library versions, some have custom configurations. When a bug surfaces in production, it may be impossible to reproduce because the “server state” that caused it is unique and ephemeral. Immutability eliminates this: every instance has a known, documented, version-controlled state.

Reproducibility eliminates “works on my machine”

When the production environment is built from a version-controlled image (Docker image, AMI, or VM template), the same image can run in local development, CI, staging, and production. Most of the “works on my machine” class of bugs disappears because every environment shares the same artefact; the only remaining difference is the runtime configuration injected into it.

Rollback is just deploying the previous image

Rolling back a mutable server requires reversing applied patches, undoing configuration changes, and hoping that the reversal is complete and correct. Rolling back an immutable instance is deploying the previous image — a known-good artefact with a deterministic state.


Guidance

Cattle vs. pets

Concept | Pets | Cattle
Identity | Named, unique (web-server-01) | Generic, numbered
Failure response | Nurse back to health | Replace with new instance
State | Accumulated over time | Defined at creation from an image
Mutability | Mutated in place | Never; replaced to change

Containers and managed compute (ECS, EKS, Lambda) are naturally cattle. EC2 instances configured by hand over SSH or with ad hoc scripts are pets.

Docker images as immutable artefacts

# Every change produces a new image with a new tag
FROM node:20-alpine AS build
WORKDIR /app
COPY package*.json ./
# Install all dependencies, including the dev dependencies the build step needs
RUN npm ci
COPY . .
RUN npm run build
# Drop dev dependencies so only production packages reach the final image
RUN npm prune --omit=dev

FROM node:20-alpine
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules

# Never modified after build; configuration comes from environment variables only
USER node
EXPOSE 8080
CMD ["node", "dist/server.js"]

Principles for immutable container images:

  • Never docker exec and modify a running container — make changes in the Dockerfile and rebuild
  • Tag images with commit SHA or version, not just latest — enables reliable rollback
  • No secrets baked into images — pass via environment variables or secrets manager at runtime
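The tagging and secrets principles above could be enforced at deploy time. The following is a minimal Terraform sketch, assuming the image runs on ECS with Fargate. Every variable name, the `api` container name, and the `DB_PASSWORD` secret are hypothetical and chosen only for illustration.

```hcl
# Sketch only: variable names, the "api" container name, and the secret
# are assumptions for illustration, not taken from a real project.
variable "image_repository" {
  type        = string
  description = "Registry path without a tag, for example an ECR repository URL"
}

variable "image_tag" {
  type        = string
  description = "Git commit SHA of the build to deploy"

  validation {
    # Rejects mutable tags such as "latest"; accepts 7 to 40 hex characters.
    condition     = can(regex("^[0-9a-f]{7,40}$", var.image_tag))
    error_message = "The image_tag value must be a Git commit SHA, not a mutable tag such as latest."
  }
}

variable "db_password_secret_arn" {
  type        = string
  description = "ARN of a Secrets Manager secret holding the database password"
}

variable "execution_role_arn" {
  type        = string
  description = "Task execution role allowed to pull the image and read the secret"
}

resource "aws_ecs_task_definition" "api" {
  family                   = "api"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = 256
  memory                   = 512
  execution_role_arn       = var.execution_role_arn

  container_definitions = jsonencode([
    {
      name      = "api"
      image     = "${var.image_repository}:${var.image_tag}"
      essential = true

      portMappings = [{ containerPort = 8080 }]

      # Non-secret configuration is injected at runtime, not baked in.
      environment = [{ name = "NODE_ENV", value = "production" }]

      # ECS resolves the secret at container start; it never enters the image.
      secrets = [{ name = "DB_PASSWORD", valueFrom = var.db_password_secret_arn }]
    }
  ])
}
```

With a definition like this, a rollback amounts to re-applying with the previous commit SHA as `image_tag`; nothing on a running container is edited.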

Golden AMIs with Packer

For EC2-based infrastructure, build a new AMI for every change rather than patching running instances:

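# packer/api-server.pkr.hcl
# Assumes the Amazon plugin is installed and that a data "amazon-ami" "ubuntu"
# block is declared elsewhere in the template.
variable "app_version" {
  type = string
}

variable "aws_region" {
  type = string
}

source "amazon-ebs" "api" {
  # A timestamped name gives every build a unique, traceable AMI
  ami_name      = "api-server-${formatdate("YYYYMMDD-hhmmss", timestamp())}"
  instance_type = "t3.small"
  region        = var.aws_region
  source_ami    = data.amazon-ami.ubuntu.id
  ssh_username  = "ubuntu" # used only during the build; default user on Ubuntu AMIs

  tags = {
    Version = var.app_version
    Built   = timestamp()
  }
}

build {
  sources = ["source.amazon-ebs.api"]

  # Everything the instance needs is installed at build time, not after launch
  provisioner "shell" {
    scripts = [
      "packer/scripts/install-dependencies.sh",
      "packer/scripts/configure-app.sh",
    ]
  }
}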

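Terraform then points the Auto Scaling Group's launch template at the new AMI and replaces the running instances, for example through an instance refresh or by creating a new group before destroying the old one.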
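One way the Terraform side might look is sketched below, assuming the AMI ID produced by Packer is passed in as a variable. The variable names, group sizes, and the 50 percent health threshold are illustrative assumptions, not recommendations.

```hcl
# Sketch only: variable names, sizes, and thresholds are assumptions.
variable "ami_id" {
  type        = string
  description = "ID of the AMI produced by the Packer build"
}

variable "subnet_ids" {
  type        = list(string)
  description = "Subnets the Auto Scaling Group launches instances into"
}

resource "aws_launch_template" "api" {
  name_prefix   = "api-server-"
  image_id      = var.ami_id
  instance_type = "t3.small"
}

resource "aws_autoscaling_group" "api" {
  name_prefix         = "api-server-"
  min_size            = 2
  max_size            = 4
  desired_capacity    = 2
  vpc_zone_identifier = var.subnet_ids

  launch_template {
    id      = aws_launch_template.api.id
    version = aws_launch_template.api.latest_version
  }

  # A new launch template version starts a rolling replacement of instances;
  # existing instances are terminated, never patched.
  instance_refresh {
    strategy = "Rolling"

    preferences {
      min_healthy_percentage = 50
    }
  }
}
```

Changing `ami_id` creates a new launch template version, and the instance refresh then cycles old instances out. Applying the previous AMI ID reverses the change the same way.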

No SSH in production

If instances can be SSH’d into and manually modified, someone will SSH in and manually modify them. Remove SSH access entirely. Instead:

  • Use Systems Manager Session Manager for break-glass access (no inbound port 22)
  • Use log aggregation and distributed tracing for debugging — don’t SSH to check logs
  • If a configuration change is needed, update the definition and redeploy
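The access model in the list above can be expressed in Terraform roughly as follows. This is a sketch under assumptions: the `vpc_id` variable, the application port, and the `10.0.0.0/16` CIDR are hypothetical, and the instances are assumed to run the SSM agent.

```hcl
# Sketch only: the variable, port, and CIDR range are assumptions.
variable "vpc_id" {
  type = string
}

resource "aws_security_group" "api" {
  name_prefix = "api-server-"
  vpc_id      = var.vpc_id

  # Application traffic only; there is deliberately no rule for port 22.
  ingress {
    from_port   = 8080
    to_port     = 8080
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/16"]
  }

  # Outbound HTTPS lets the SSM agent reach the Systems Manager endpoints.
  egress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

data "aws_iam_policy_document" "ec2_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "api" {
  name_prefix        = "api-server-"
  assume_role_policy = data.aws_iam_policy_document.ec2_assume.json
}

# AWS managed policy that allows Session Manager connections.
resource "aws_iam_role_policy_attachment" "ssm" {
  role       = aws_iam_role.api.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

resource "aws_iam_instance_profile" "api" {
  name_prefix = "api-server-"
  role        = aws_iam_role.api.name
}
```

Because no inbound rule opens port 22 and the instance profile carries only the Session Manager policy, break-glass sessions go through IAM and can be audited, while routine changes still require a redeploy.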

Blue/green and immutable infrastructure

Immutable infrastructure pairs naturally with blue/green deployments: each deployment produces a new set of instances/containers with a new image, and traffic is atomically switched.
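A traffic switch of this kind might be sketched in Terraform with weighted target groups on an Application Load Balancer. The variable names and the plain HTTP listener on port 80 are assumptions made to keep the example short.

```hcl
# Sketch only: variable names and the HTTP listener are assumptions.
variable "load_balancer_arn" {
  type = string
}

variable "blue_target_group_arn" {
  type = string
}

variable "green_target_group_arn" {
  type = string
}

variable "active_color" {
  type        = string
  description = "Which environment receives production traffic"

  validation {
    condition     = contains(["blue", "green"], var.active_color)
    error_message = "The active_color value must be either blue or green."
  }
}

resource "aws_lb_listener" "api" {
  load_balancer_arn = var.load_balancer_arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "forward"

    forward {
      # The active environment gets all traffic; the idle one gets none.
      target_group {
        arn    = var.blue_target_group_arn
        weight = var.active_color == "blue" ? 100 : 0
      }

      target_group {
        arn    = var.green_target_group_arn
        weight = var.active_color == "green" ? 100 : 0
      }
    }
  }
}
```

Flipping `active_color` moves all traffic to the other set of instances in a single apply, and flipping it back is the rollback. The idle environment stays untouched until the next image replaces it.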

Review checklist

  • No changes made to running containers or instances (no docker exec for config changes)
  • All configuration is externalized via environment variables, not baked into images
  • Docker images are tagged with version/SHA, not just latest
  • Deployments replace instances/containers rather than updating in place
  • No direct SSH access to production instances
  • All infrastructure provisioning is automated via IaC — no manual console configuration